Convert word-files to txt files - vba

I have a lot of folders. Each folder has one word-document.
I now want to convert each word-document to a txt-document, save all txt-document in one separate folder and give them the name of the folder (of the word-document).
I actually have no idea about how to start. I have some PHP-knowledge but actually never created a script/macro like this case - and that's the reason why I even don't know how to google for an easy solution the right way to get further information.
Of course, you don't need to present me the whole script, but I would really appreciate to get some information about how to start and where I could get further information.
Thank you so much!

If you need to convert multiple Word files to other formats, like TXT, RTF, HTML or PDF, run the script below.
Option Explicit On
Sub ChangeDocsToTxtOrRTFOrHTML()
'with export to PDF in Word 2007
Dim fs As Object
Dim oFolder As Object
Dim tFolder As Object
Dim oFile As Object
Dim strDocName As String
Dim intPos As Integer
Dim locFolder As String
Dim fileType As String
On Error Resume Next
locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\Users\your_path_here\")
Select Case Application.Version
Case Is < 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
Case Is >= 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
End Select
Application.ScreenUpdating = False
Set fs = CreateObject("Scripting.FileSystemObject")
Set oFolder = fs.GetFolder(locFolder)
Set tFolder = fs.CreateFolder(locFolder & "Converted")
Set tFolder = fs.GetFolder(locFolder & "Converted")
For Each oFile In oFolder.Files
Dim d As Document
Set d = Application.Documents.Open(oFile.Path)
strDocName = ActiveDocument.Name
intPos = InStrRev(strDocName, ".")
strDocName = Left(strDocName, intPos - 1)
ChangeFileOpenDirectory tFolder
Select Case fileType
Case Is = "TXT"
strDocName = strDocName & ".txt"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatText
Case Is = "RTF"
strDocName = strDocName & ".rtf"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF
Case Is = "HTML"
strDocName = strDocName & ".html"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML
Case Is = "PDF"
strDocName = strDocName & ".pdf"
ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
End Select
d.Close
ChangeFileOpenDirectory oFolder
Next oFile
Application.ScreenUpdating = True
End Sub

Related

How to convert multiple word documents from .doc to .docx?

I have many .doc documents located in many subfolders and I would like to covert them to .docx
I was opening each file and saving it but there are too many of them, so I thought there must be a better and a faster way. I found online some VBA code but none seem to work.
First VBA code:
Sub TranslateDocIntoDocx()
Dim objWordApplication As New Word.Application
Dim objWordDocument As Word.Document
Dim strFile As String
Dim strFolder As String
strFolder = "H:\Vanhuspalvelut\Kotihoito\Tammelan_kotihoito\TURVALLISUUS\Pelastussuunnitelmaan_tuleva\TURVALLISUUS_SUUNNITELMA_2015"
strFile = Dir(strFolder & "*.doc", vbNormal)
While strFile <> ""
With objWordApplication
Set objWordDocument = .Documents.Open(FileName:=strFolder & strFile, AddToRecentFiles:=False, ReadOnly:=True, Visible:=False)
With objWordDocument
.SaveAs FileName:=strFolder & Replace(strFile, "doc", "docx"), FileFormat:=16
.Close
End With
End With
strFile = Dir()
Wend
Set objWordDocument = Nothing
Set objWordApplication = Nothing
End Sub
Second VBA code:
Sub ConvertBatchToDOCX()
Dim sSourcePath As String
Dim sTargetPath As String
Dim sDocName As String
Dim docCurDoc As Document
Dim sNewDocName As String
' Looking in this path
sSourcePath = "H:\Vanhuspalvelut\Kotihoito\Tammelan_kotihoito\TURVALLISUUS\Pelastussuunnitelmaan_tuleva\TURVALLISUUS_SUUNNITELMA_2015"
sTargetPath = "H:\Vanhuspalvelut\Kotihoito\Tammelan_kotihoito\TURVALLISUUS\Pelastussuunnitelmaan_tuleva\TURVALLISUUS_SUUNNITELMA_2015"
' Look for first DOC file
sDocName = Dir(sSourcePath & "*.doc")
Do While sDocName <> ""
' Repeat as long as there are source files
'Only work on files where right-most characters are ".doc"
If Right(sDocName, 4) = ".doc" Then
' Open file
Set docCurDoc = Documents.Open(FileName:=sSourcePath & sDocName)
sNewDocName = Replace(sDocName, ".doc", ".docx")
With docCurDoc
.SaveAs FileName:=sTargetPath & sNewDocName, _
FileFormat:=wdFormatDocumentDefault
.Close SaveChanges:=wdDoNotSaveChanges
End With
End If
' Get next source file name
sDocName = Dir
Loop
MsgBox "Finished"
End Sub
Any help would be much appreciated!
In both routines you have the same small mistake: You miss a Backslash between the path and the filename. Your Dir-Command will see the following command and therefore doesn't find anything:
Dir("H:\Vanhuspalvelut\Kotihoito\Tammelan_kotihoito\TURVALLISUUS\Pelastussuunnitelmaan_tuleva\TURVALLISUUS_SUUNNITELMA_2015*.doc", vbNormal
Either add the backslash at the end of the path definition:
strFolder = "H:\Vanhuspalvelut\Kotihoito\Tammelan_kotihoito\TURVALLISUUS\Pelastussuunnitelmaan_tuleva\TURVALLISUUS_SUUNNITELMA_2015\"
or put it into the Dir-command:
strFile = Dir(strFolder & "\*.doc", vbNormal)

Save files in designated folders

This is code derived from Mail Merge Tips and Tricks.
Sub Merge_To_Individual_Files()
Application.ScreenUpdating = False
Dim StrFolder As String, StrName As String, MainDoc As Document, i As Long, j As Long
Set MainDoc = ActiveDocument
With MainDoc
StrFolder = .Path & Application.PathSeparator
For i = 1 To .MailMerge.DataSource.RecordCount
With .MailMerge
.Destination = wdSendToNewDocument
.SuppressBlankLines = True
With .DataSource
.FirstRecord = i
.LastRecord = i
.ActiveRecord = i
If Trim(.DataFields("Name")) = "" Then Exit For
StrName = .DataFields("Number") & "_" & .DataFields("Name") & "_Test"
End With
.Execute Pause:=False
End With
StrName = Trim(StrName)
With ActiveDocument
.SaveAs2 FileName:=StrFolder & StrName & ".pdf", FileFormat:=wdFormatPDF, AddToRecentFiles:=False
.PrintOut Copies:=1
.Close SaveChanges:=False
End With
Next i
End With
Application.ScreenUpdating = True
End Sub
The code separates a serial letter into individual files, saves them as pdf and starts the printing.
The macro saves all the files in the same folder and I have to move each file to the designated folder manually (each file has an own folder with the "Number" from the code as its name).
Is it possible to save the files directly in the intended folder?
I'd do something like this:
Dim num, numGen as long, f, StrFolder As String
'...
'...
num = .DataFields("Number") 'capture the value in the With .DataSource block
'...
'...
'check if the destination folder exists
f = FindFolder(StrFolder, CStr(num)) 'returns folder path if exists
If Len(f) = 0 Then
'no match was found - use a generic folder
f = StrFolder & "General" 'or whatever you want
numGen = numGen + 1
End If
.SaveAs2 FileName:= f & _
Application.PathSeparator & StrName & ".pdf", _
FileFormat:=wdFormatPDF, AddToRecentFiles:=False
'...
'...
'Notify that some files need to be moved
If numGen > 0 Then
Msgbox numGen & " files were saved to 'General' folder"
End If
This function will return the path of any matched folder given a starting folder to begin in (includes searching in subfolders). Returns empty string if no match.
Function FindFolder(StartAt As String, ByVal folderName As String) As String
Dim colFolders As New Collection, sf, path, fld, fso
Set fso = CreateObject("scripting.filesystemobject")
colFolders.Add StartAt
Do While colFolders.Count > 0
fld = colFolders(1)
colFolders.Remove 1
If Right(fld, 1) <> "\" Then fld = fld & "\"
For Each sf In fso.getfolder(fld).subfolders
If sf.Name = folderName Then
FindFolder = sf.path
Exit Function
Else
colFolders.Add sf
End If
Next sf
Loop
End Function
Your code is derived from the Send Mailmerge Output to Individual Files article in the Mailmerge Tips & Tricks thread, at https://www.msofficeforums.com/mail-merge/21803-mailmerge-tips-tricks.html.
That article contains code for setting the save path and tells you how to use it...

Convert word doc to pdf and send as attachment in Outlook

I would like to convert my Word Doc to a pdf and send it as an attachment as part of my constructed Outlook email.
I have tried adding ,".pdf" at the end of my SaveAs2 line, which changed and attached the file format as pdf, however, when attempting to open the file it does not display and gives me a message that the file did not have all it's code when sent as an attachment.
Private Sub emailbutton_Click()
'No-option email sending
Dim OL As Object
Dim EmailItem As Object
Dim Doc As Document
Application.ScreenUpdating = False
Set OL = CreateObject("Outlook.Application")
Set EmailItem = OL.CreateItem(olMailItem)
Set Doc = ActiveDocument
If VName.Value = "" Then
Doc.SaveAs ("Quotation_Blank 2016")
Else
Doc.SaveAs2 ("QFORM" & "_" & JNumber.Value & "_" & VName.Value)
End If
With EmailItem
.Display
End With
' Signature = EmailItem.body
With EmailItem
.Subject = "QFORM" & "_" & JNumber.Value & "_" & VName.Value
'HTMLbody
msg = "<b><font face=""Times New Roman"" size=""4"" color=""blue"">INTEGRATED ASSEMBLY </font></b><br>" _
& " 1200 Woodruff Rd.<br>" _
& " Suite A12<br>" _
& " Greenville, SC 29607<br><br>" _
& "We have recently released subject project, which will contain assemblies to be outsourced. You have been selected to build these assemblies according to the attachment. <br><br>" _
& "As part of this process, please review the quotation form attached and indicate your acceptance. If adjustments and-or corrections are required, please feel free to contact us for quick resolution. <br><br>" _
& "<b><font face=""Times New Roman"" size=""4"" color=""Red"">NOTE: </font></b>" _
& "The information on attached quotation form is not a contract and only an estimate of predetermined costs per hourly rate for outsource assemblies. <br><br>" _
& "*******For your records you may wish to print out the completed quote form. <br><br>" _
& "Thank you, <br><br>" _
& "<b>HARTNESS INTERNATIONAL </b><br>" _
& "H1 Production Control <br>" _
& vbNewLine & Signature
.HTMLBody = msg & .HTMLBody
If VName.Value = "INTEGRATED ASSEMBLY" Then
.To = "Email1.com;"
.CC = "Email2.com;" & "Email3.com;"
.Importance = olImportanceNormal 'Or olImportanceHigh Or olImportanceLow
.Attachments.Add Doc.FullName
.Display
ElseIf VName.Value = "LEWALLEN" Then
.To = "Email1.com;"
.CC = "Email2.com;" & "Email3.com;"
.Importance = olImportanceNormal 'Or olImportanceHigh Or olImportanceLow
.Attachments.Add Doc.FullName
.Display
End If
End With
Application.ScreenUpdating = True
Set Doc = Nothing
Set OL = Nothing
Set EmailItem = Nothing
End Sub
With SaveAs2 you can specify FileFormat
https://msdn.microsoft.com/en-us/library/office/ff836084.aspx
expression .SaveAs2(FileName, FileFormat, LockComments, Password, AddToRecentFiles, WritePassword, ReadOnlyRecommended, EmbedTrueTypeFonts, SaveNativePictureFormat, SaveFormsData, SaveAsAOCELetter, Encoding, InsertLineBreaks, AllowSubstitutions, LineEnding, AddBiDiMarks, CompatibilityMode)
https://msdn.microsoft.com/en-us/library/office/ff839952.aspx
FileFormat is wdFormatPDF or 17
Change your saveAs2 this way.
If VName.Value = "" Then
Doc.SaveAs ("Quotation_Blank 2016")
Else
Doc.ExportAsFixedFormat OutputFileName:="QFORM" & "_" & JNumber.Value , _
ExportFormat:=wdExportFormatPDF
End If
Edit
To use path & add is as attachment
If VName.Value = "" Then
Doc.SaveAs ("Quotation_Blank 2016")
Else
Path = "C:\Temp\"
FileName = "QFORM" & "_" & JNumber.Value & "_" & VName.Value
Doc.ExportAsFixedFormat OutputFileName:=Path & FileName, _
ExportFormat:=wdExportFormatPDF
End If
And Attahcment
.Attachments.Add Path & FileName & ".pdf"
If you need to convert multiple Word files to other formats, like TXT, RTF, HTML or PDF, run the script below.
Option Explicit On
Sub ChangeDocsToTxtOrRTFOrHTML()
'with export to PDF in Word 2007
Dim fs As Object
Dim oFolder As Object
Dim tFolder As Object
Dim oFile As Object
Dim strDocName As String
Dim intPos As Integer
Dim locFolder As String
Dim fileType As String
On Error Resume Next
locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\Users\your_path_here\")
Select Case Application.Version
Case Is < 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
Case Is >= 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
End Select
Application.ScreenUpdating = False
Set fs = CreateObject("Scripting.FileSystemObject")
Set oFolder = fs.GetFolder(locFolder)
Set tFolder = fs.CreateFolder(locFolder & "Converted")
Set tFolder = fs.GetFolder(locFolder & "Converted")
For Each oFile In oFolder.Files
Dim d As Document
Set d = Application.Documents.Open(oFile.Path)
strDocName = ActiveDocument.Name
intPos = InStrRev(strDocName, ".")
strDocName = Left(strDocName, intPos - 1)
ChangeFileOpenDirectory tFolder
Select Case fileType
Case Is = "TXT"
strDocName = strDocName & ".txt"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatText
Case Is = "RTF"
strDocName = strDocName & ".rtf"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF
Case Is = "HTML"
strDocName = strDocName & ".html"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML
Case Is = "PDF"
strDocName = strDocName & ".pdf"
ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
End Select
d.Close
ChangeFileOpenDirectory oFolder
Next oFile
Application.ScreenUpdating = True
End Sub
The results are saved in a folder that is dynamically created and in the same folder that contains the documents that you just converted.
I was wondering if you could post all your code for this solution. I have been looking for something like this for a while and all my experience is on powershell. I know this i generally frowned upon but i am running out of options

Create a vba code to replace all the headers, of all the word documents in a Folder and Subfolders

Sub ReplaceEntireHdr()
Dim wrd As Word.Application
Set wrd = CreateObject("word.application")
wrd.Visible = True
AppActivate wrd.Name
'Change the directory to YOUR folder's path
fName = Dir("C:\Users\user1\Desktop\A\*.doc")
Do While (fName <> "")
With wrd
'Change the directory to YOUR folder's path
.Documents.Open ("C:\Users\user1\Desktop\A\" & fName)
If .ActiveWindow.View.SplitSpecial = wdPaneNone Then
.ActiveWindow.ActivePane.View.Type = wdPrintView
Else
.ActiveWindow.View.Type = wdPrintView
End If
.ActiveWindow.ActivePane.View.SeekView = wdSeekCurrentPageHeader
.Selection.WholeStory
.Selection.Paste
.ActiveDocument.Save
.ActiveDocument.Close
End With
fName = Dir
Loop
Set wrd = Nothing
End Sub
I use this vba code to replace all the headers, of all the word documents in a folder 'A'. However if there is any subfolder in the parent folder 'A' with word documents, the vba code skips those documents. Could anyone please tell me how to include the word documents in the subfolders as well? Perhaps by making some changes in the code or any other vba code which can do the same job.
Thanks in advance.
In order to pick up the folders (directories) you need to specify the vbDirectory attribute. By default, Dir only "sees" things that match vbNormal.
Here's an example that picks up both files and sub-directories. The GetAttr function checks whether the file attribute is vbDirectory. If it's not, then it's a file.
What you can do is save the directory paths in an array, then loop that to get the files in the sub-directories.
Sub GetFilesandSubDir()
Dim sPath As String, sPattern As String
Dim sSearch As String, sFile As String
Dim sPathSub As String, sSearchSub As String
Dim aSubDirs As Variant, i As Long
sPattern = "*.*"
sPath = "C:\Test\"
sSearch = sPath & sPattern
sFile = Dir(sPath, vbNormal + vbDirectory)
aSubDirs = TestDirWithSubFolders(sPath, sPattern, sSearch, sFile)
For i = LBound(aSubDirs) To UBound(aSubDirs)
Debug.Print "Directory: " & aSubDirs(i)
sPathSub = sPath & aSubDirs(i) & "\"
sSearchSub = sPathSub & sPattern
sFile = Dir(sPathSub, vbNormal + vbDirectory)
TestDirWithSubFolders sPathSub, sPattern, sSearchSub, sFile
Next
End Sub
Function TestDirWithSubFolders(sPath As String, sPattern As String, _
sSearch As String, sFile As String) As Variant
Dim aSubDirs() As Variant, i As Long
i = 0
Do While sFile <> ""
If GetAttr(sPath & sFile) = vbDirectory Then
'Debug.Print "Directory: " & sFile
ReDim Preserve aSubDirs(i)
aSubDirs(i) = sFile
i = i + 1
Else
Debug.Print "File: " & sFile
End If
sFile = Dir
Loop
TestDirWithSubFolders = aSubDirs
End Function

MS Word VBA docx to txt message trap code

[Using MS Word 2010]
I have a macro that converts a Word document from docx format to txt format. However there are some documents that stop during conversion with the following notification: "The document may contain text content that will be lost upon conversion to the chosen encoding. To preserve this content , click No to exit this dialog box,and then choose another encoding that supports the languages in this document.
Do you want to continue saving the document? Yes/No"
Of course I want to continue to save the document as txt so I have to click the Yes button. I am converting hundreds of documents (docx) to text and want the macro to intercept this message and tell Word 'Yes" without my intervention. Can anyone tell me the VBA code needed to accomplish this and where in my macro does it need to go? I am a complete novice to VBA. I found the following VBA code on the Internet and it works fine except for what I just indicated about the Word message. Here is the code:
Sub ChangDocsToTxtOrRTForHTML()
'with export to PDF in word 2007
Dim fs As Object
Dim oFolder As Object
Dim tFolder As Object
Dim oFile As Object
Dim strDocName As String
Dim intPos As Integer
Dim locFolder As String
Dim fileType As String
On Error Resume Next
locFolder = InputBox("Enter the folder path do DOCs", "File Conversion", "C:\myDocs")
Select Case Application.Version
Case Is < 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
Case Is >= 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML, or PDF(2007+ only)", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
End Select
Application.ScreenUpdating = False
Set fs = CreateObject("Scripting.FileSystemObject")
Set oFolder = fs.GetFolder(locFolder)
Set tFolder = fs.CreateFolder(locFolder & "Converted")
Set tFolder = fs.GetFolder(locFolder & "Converted")
For Each oFile In oFolder.Files
Dim d As Document
Set d = Application.Documents.Open(oFile.Path)
strDocName = ActiveDocument.Name
intPos = InStrRev(strDocName, ".")
strDocName = Left(strDocName, intPos - 1)
ChangeFileOpenDirectory tFolder
Select Case fileType
Case Is = "TXT"
strDocName = strDocName & ".txt"
ActiveDocument.SaveAs FileName:=strDocName, Fileformat:=wdFormatText
Case Is = "RTF"
strDocName = strDocName & ".rtf"
ActiveDocument.SaveAs FileName:=strDocName, Fileformat:=wdFormatRTF
Case Is = "HTML"
strDocName = strDocName & ".html"
ActiveDocument.SaveAs FileName:=strDocName, Fileformat:=wdFormatHTML
Case Is = "PDF"
strDocName = strDocName & ".pdf"
'*** Word 2007 users - remove the apostrophe at the start of the next line ***
'ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
End Select
d.Close
ChangeFileOpenDirectory oFolder
Next oFile
Application.ScreenUpdating = True
End Sub
Sub ConvertFiles()
Application.ScreenUpdating = False
Dim strFolder As String, strFile As String, wdDoc As Document
strFolder = GetFolder
If strFolder = "" Then Exit Sub
strFile = Dir(strFolder & "\*.txt", vbNormal)
While strFile <> ""
Set wdDoc = Documents.Open(FileName:=strFolder & "\" & strFile, _
Format:=wdOpenFormatEncodedText, Encoding:=msoEncodingUTF8, _
AddToRecentFiles:=False, Visible:=False)
wdDoc.SaveAs2 FileName:=strFolder & "\" & Replace(strFile, ".txt", ".docx"), _
Fileformat:=wdFormatXMLDocument, AddToRecentFiles:=False
wdDoc.Close SaveChanges:=False
strFile = Dir()
Wend
Set wdDoc = Nothing
Application.ScreenUpdating = True
End Sub
Function GetFolder() As String
Dim oFolder As Object
GetFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function
Thank you for your help.
ntman2
For text files, Word is able to use different encoding schemes for saved-as files that can avoid generating this warnings. Try:
ActiveDocument.SaveAs _
FileName:=strDocName, _
Fileformat:=wdFormatText, _
Encoding:=msoEncodingUnicodeLittleEndian
instead of:
ActiveDocument.SaveAs FileName:=strDocName, Fileformat:=wdFormatText