Search across a large number of files and extract certain rows into a separate single file based on search string - vba

I am new to working with Macro's & VBA in Microsoft Word and also "programming".
I have over 100 separate Microsoft Word files with the name structure "ABC - XXXX.docx". They range from "ABC - 1800.docx" to ABC - 2020.docx"
Within each of these files is a single large table (with a variable, but large, number of rows).
I would like to be able to search across all these files at one time (in a batch) to find and extract (but not delete) the rows that contain a specific string - for example "Date Needed". These rows should be put into a new file called "XYZ- Exceptions.docx".
I would like to be able copy the whole row (which consists of 10 columns) and retain the formatting of the data within the columns. The search string can be anywhere within the row and within any column.
If this "extract" file could then be sorted on up to three columns that would be even better.
I also have a separate need to extract all the rows from all the files with a blank in the second column.
A little guidance on what code structures, syntax and functionality to use to process this in simple terms would be very helpful.
Thank you.

You have two projects in your post. One project per thread, please.
For the first project:
Sub Demo()
Application.ScreenUpdating = False
Dim strFolder As String, strFile As String
Dim wdDocTgt As Document, wdDocSrc As Document
strFolder = GetFolder: If strFolder = "" Then Exit Sub
Set wdDocTgt = Documents.Add
'Find all files whose names begine with "ABC - "
strFile = Dir(strFolder & "\ABC - *.doc", vbNormal)
While strFile <> ""
Set wdDocSrc = Documents.Open(FileName:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
With wdDocSrc.Range.Tables(1).Range
'Find all rows containing "Date Needed"
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "Date Needed"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
End With
Do While .Find.Execute
If .Information(wdWithInTable) = True Then
'Replicate the row in the output document
wdDocTgt.Range.Characters.Last.FormattedText = .Duplicate.Rows(1).Range.FormattedText
Else
Exit Do
End If
.Collapse wdCollapseEnd
Loop
End With
wdDocSrc.Close SaveChanges:=False
strFile = Dir()
Wend
With wdDocTgt
'Sort the table
.Tables(1).Sort ExcludeHeader:=False, CaseSensitive:=False, _
FieldNumber:="Column 1", SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="Column 2", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:=wdSortOrderAscending, _
FieldNumber3:="Column 3", SortFieldType3:=wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending
'Save the output document
.SaveAs2 FileName:=StrPth & StrFld & "\XYZ - Exceptions.docx", Fileformat:=wdFormatXMLDocument, AddToRecentFiles:=False
End With
Set wdDocSrc = Nothing: Set wdDocTgt = Nothing
Application.ScreenUpdating = True
End Sub
Function GetFolder() As String
Dim oFolder As Object
GetFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function

Related

Append Word docx files while keeping their format in VBA

I am creating a Word Macro that receives two arguments: a list of docx documents and the name of the new file. The goal is that the Macro inserts one document after the other, preserving their respective format, and saves as a new docx document.
Sub Merger(path As String, args () As Variant)
Dim vArg As Variant
Active Document.Select
Selection.ClearFormatting
For Each vArg In args
ActiveDocument.Content.Words.Last.Select
Selection.InsertFile:= _ vArg _,Range:="", _ConfirmConversions:= False, Link:=False, Attachment:= False )
Selection.InsertBreak Type:=wdPageBreak
Next vArg
ActiveDocument.SaveAs2 File Name=path
ActiveDocument.Close
Application.Quit
Note that I call the Macro from an empty docx file.
The problem is that neither the header nor the format of the orginal files are preserved in the new docx document.
The Word format is not modular. Instead, consider creating a Master Document, then filling it with subdocuments. Here's code to create a master document from a folder full of subdocuments:
Sub AssembleMasterDoc()
Dim SubDocFile$, FolderPath$, Template$
Dim Counter&
Dim oFolder As FileDialog
Dim oBookmark As Bookmark
Dim oTOC As TableOfContents
'Create a dynamic array variable, and then declare its initial size
Dim DirectoryListArray() As String
ReDim DirectoryListArray(1000)
Template$ = ActiveDocument.AttachedTemplate.Path & Application.PathSeparator & ActiveDocument.AttachedTemplate.Name
'Loop through all the files in the directory by using Dir$ function
Set oFolder = Application.FileDialog(msoFileDialogFolderPicker)
With oFolder
.AllowMultiSelect = False
If .Show <> 0 Then
FolderPath$ = .SelectedItems(1)
Else
GoTo EndSub
End If
End With
Application.ScreenUpdating = False
SubDocFile$ = Dir$(FolderPath$ & Application.PathSeparator & "*.*")
Do While SubDocFile$ <> ""
DirectoryListArray(Counter) = SubDocFile$
SubDocFile$ = Dir$
Counter& = Counter& + 1
Loop
'Reset the size of the array without losing its values by using Redim Preserve
ReDim Preserve DirectoryListArray(Counter& - 1)
WordBasic.SortArray DirectoryListArray()
ActiveWindow.ActivePane.View.Type = wdOutlineView
ActiveWindow.View = wdMasterView
Selection.EndKey Unit:=wdStory
For x = 0 To (Counter& - 1)
If IsNumeric(Left(DirectoryListArray(x), 1)) Then
FullName$ = FolderPath$ & Application.PathSeparator & DirectoryListArray(x)
Documents.Open FileName:=FullName$, ConfirmConversions:=False
With Documents(FullName$)
.AttachedTemplate = Template$
For Each oBookmark In Documents(FullName$).Bookmarks
oBookmark.Delete
Next oBookmark
.Close SaveChanges:=True
End With
Selection.Range.Subdocuments.AddFromFile Name:=FullName$, ConfirmConversions:=False
End If
Next x
For Each oTOC In ActiveDocument.TablesOfContents
oTOC.Update
Next oTOC
ActiveWindow.ActivePane.View.Type = wdPrintView
Application.ScreenUpdating = True
EndSub:
End Sub
This code is from a previous project, so you may not need all of it, like the update of multiple TOCs.
Don't attempt to maintain and edit Master Documents. The format is prone to corruption. Instead, assemble a master document for printing (or other use), then discard it.

Select some parts of text from one Word document and copy into another Word document

I have a word file with some spaces, for example:
Word File XXXXX
Title: XXXXX
etc
And I have another word file which have that data that is missing:
Word File 20248
Title: Example of word file
etc
My question is, how can I use vba to recognize the data from the first file to be copied into the second file in the spaces I want. Furthermore I'd prefer that you can select the word file you want with a dialog box rather than putting in the code where the file is located as I have different files that can have the location changed.
Thank you so much for your answers. I'm pretty new in vba and I have never used it on word.
By now I have this code to choose the word file from which I want to copy the data:
Sub CopyData()
Dim DC As Document
Dim wD As Document, strD As String, wDNumb As Variant
Dim I As Long
Set wD = ActiveDocument
DSelection:
For I = 1 To Documents.Count
strD = strD & Documents(I).Name & " - " & I & vbCrLf
Next I
wDNumb = InputBox("Please, choose the number of the word file from which you are choosing the data to copy:" & vbCrLf & _
vbCrLf & strD, "Choose the word document from which you are copying the data!", 1)
If wDNumb <= Documents.Count And wDNumb >= 1 Then
GoTo DSelection2
ElseIf wDNumb = "" Then MsgBox "Operation cancelled", vbCritical, "Cancelled"
Exit Sub
ElseIf wDNumb > Documents.Count Or wDNumb < 1 Then MsgBox "Wrong number, input a correct number", vbExclamation, "Wrong number"
Exit Sub
End If
DSelection2:
If IsNumeric(wDNumb) Then
Set DC = Documents(CLng(wDNumb))
Else
MsgBox "Please choose the number on the right of the document chosen!": GoTo DSelection
End If
End Sub
I have the following part of the code to copy some part of the Word to the other using bookmarks:
DC.Activate
Set Rng = DC.Range
With Rng.Find
.ClearFormatting
.Execute FindText:="TITLE:", Forward:=True, _
Format:=False, Wrap:=wdFindStop
Fnd = .Found
End With
If Fnd = True Then
With Rng
.MoveStart wdCharacter, 10
.MoveEnd wdSentence, 1
End With
End If
Rng.Select
Selection.Copy
wD.Activate
Selection.GoTo What:=wdGoToBookmark, Name:="TITLE"
Selection.EndKey Unit:=wdLine, Extend:=wdExtend
Selection.Paste
There are multiple possible ways of approaching this, but your problem description lacks sufficient detail. For example, one could insert:
bookmarks;
content controls;
Section breaks;
tables;
etc.,
into the target document so that content from the source document can be inserted there.
Alternatively, one might use Find/Replace to locate a predefined string that can be replaced with the desired content.
With your updated problem description, you might use:
Dim RngDC As Range, wDRng As Range, BkMkNm As String
BkMkNm "TITLE"
With DC
With .Range.Find
.ClearFormatting
.Execute FindText:=BkMkNm, Forward:=True, Format:=False, Wrap:=wdFindStop
End With
If .Found = True Then
.MoveStart wdCharacter, 10
.MoveEnd wdSentence, 1
Set RngDC = .Duplicate
End If
End With
With wD
Set wDRng = .Bookmarks(BkMkNm).Range
wDRng.FormattedText = RngDC.FormattedText
.Bookmarks.Add BkMkNm, wDRng
End With

VBA Code to change word footer in multiple files based on page number

I have a macro that runs to make a single page doc into a 5 page doc (NCR Duplicates) for all files in a folder.
I am using a set of nested IF fields in my footer, which changes the footer based on page number. The field looks like this
Text here {If{PAGE}="1""Original"{If{PAGE}="2""Copy 1"
{If{PAGE}="3""Copy 2"{If{PAGE}="4""Copy 3"{If{PAGE}="5""Copy 4"}}}}}
Other Text
I am trying to figure out how to add this footer to all the documents in a folder. It doesn't need to use field, if there is a way simply based on page number.
I have bashed my head against the wall, searched like crazy, and now come hat in hand.
The macro to make the duplicate copies is:
Sub Make5CopiesNCR()
vDirectory = BrowseForFolder
vFile = Dir(vDirectory & "\" & "*.*")
Do While vFile <> ""
Documents.Open FileName:=vDirectory & "\" & vFile
MakeCopies
vFile = Dir
Loop
End Sub
End Sub
Private Sub MakeCopies()
Dim i As Integer
Selection.WholeStory
Selection.Copy
For i = 1 To 6
Selection.PasteAndFormat wdFormatOriginalFormatting
Next
With ActiveDocument
.GoTo What:=wdGoToPage, Which:=wdGoToAbsolute, Name:=6 'Page number
.Bookmarks("\Page").Select
With Selection
.Delete
ActiveDocument.Close SaveChanges:=wdSaveChanges, OriginalFormat:=wdWordDocument
End With
End With
End Sub
The problem with using a mailmerge with your field construction is that it gets converted to the result. Try a field coded as:
{={PAGE}-1 \# "'Copy {={PAGE}-1}';;'Original'"}
Now, if you create the required 5 pages in your mailmerge main document, all the outputs will likewise be in multiples of 5 pages, with the correct page numbering.
Even if you use a mailmerge main document with only a single page, the outputs will have the field coding required to produce the correct numbering for however many more pages you want to add to the outputs.
As for replicating this in your existing files, simply create a document with the required footer content, then use a macro like:
Sub ReplicateFooter()
Application.ScreenUpdating = False
Dim DocSrc As Document, DocTgt As Document, Rng As Range
Dim StrPth As String, StrNm As String, StrSrc As String
Set DocSrc = ActiveDocument
Set Rng = DocSrc.Sections.First.Footers(wdHeaderFooterPrimary).Range
StrPth = DocSrc.Path & "\": StrSrc = DocSrc.FullName
StrNm = Dir(StrPth & "*.doc", vbNormal)
While StrNm <> ""
If StrPth & StrNm <> StrSrc Then
Set DocTgt = Documents.Open(FileName:=StrPth & StrNm, AddToRecentFiles:=False, Visible:=False)
With DocTgt
With .Sections.First.Footers(wdHeaderFooterPrimary).Range
.FormattedText = Rng.FormattedText
.Characters.Last.Text = vbNullString
End With
.Close True
End With
End If
StrNm = Dir()
Wend
Set Rng = Nothing: Set DocTgt = Nothing: Set DocSrc = Nothing
Application.ScreenUpdating = True
End Sub

Find and Replace Multiple Text Strings on Multiple Text Files from a folder

I'm working on the vba code to accomplish the following tasks
Word Document Open a Text file from the folder
Find and replace the text (multiple Text) based on a excel sheet (which have find what and replace with)
Process all text files in the folder and save it.
I would like to customize the below code for the above requirement,
I'm using Office 2016 and I think I have to replace Application.FileSearch in the script to ApplicationFileSearch for 2003 and prior office editions.
I try to accomplish using the word macro recorder and also used notepad++, I've recorded in Notepad++ also and it works for one file, I would like to do batch process all files in the folder and save it after replacing the text.
As there is too many lines there to replace more than 30 or more lines to replace, I would like the code to look for the text from a excel file like find what and replace with columns.
Sub FindReplaceAllDocsInFolder( )
Dim i As Integer
Dim doc As Document
Dim rng As Range
With Application.FileSearch
.NewSearch
.LookIn = "C:\My Documents"
.SearchSubFolders = False
.FileType = msoFileTypeWordDocuments
If Not .Execute( ) = 0 Then
For i = 1 To .FoundFiles.Count
Set doc = Documents.Open(.FoundFiles(i))
For Each rng In doc.StoryRanges
With rng.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "Dewey & Cheatem"
.Replacement.Text = "Dewey, Cheatem & Howe"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Execute Replace:=wdReplaceAll
End With
Next rng
doc.Save
doc.Close
Set rng = Nothing
Set doc = Nothing
Next i
Else
MsgBox "No files matched " & .FileName
End If
End With
End Sub
Thanks
Jay
Borrowed from https://social.msdn.microsoft.com/Forums/en-US/62fceda5-b21a-40b6-857c-ad28f12c1b23/use-excel-vba-to-open-a-text-file-and-search-it-for-a-specific-string?forum=isvvba
Sub SearchTextFile()
Const strFileName = "C:\MyFiles\TextFile.txt"
Const strSearch = "Some Text"
Dim strLine As String
Dim f As Integer
Dim lngLine As Long
Dim blnFound As Boolean
f = FreeFile
Open strFileName For Input As #f
Do While Not EOF(f)
lngLine = lngLine + 1
Line Input #f, strLine
If InStr(1, strLine, strSearch, vbBinaryCompare) > 0 Then
MsgBox "Search string found in line " & lngLine, vbInformation
blnFound = True
Exit Do
End If
Loop
Close #f
If Not blnFound Then
MsgBox "Search string not found", vbInformation
End If
End Sub
This is simply just finding the match. You can use the built in function "Replace" which land the total fix. You would also have to fit in the "loop through files" code, which here is a snippet.
Dim StrFile As String
StrFile = Dir(pathtofile & "\*" & ".txt")
Do While Len(StrFile) > 0
Debug.Print StrFile
StrFile = Dir
Loop
I wouldve made this a comment, but it was too much text. This isnt meant to be a full blown answer, just giving you the pieces you need to put it all together on your own.
Thanks for all your help. I have found alternate solution using the below EXE. FART.exe (FART - Find and Replace Text). I have create a batch file with the below command example.
https://emtunc.org/blog/03/2011/farting-the-easy-way-find-and-replace-text/
http://fart-it.sourceforge.net/
Examples:
fart "C:\APRIL2011\Scripts\someFile.txt" oldText newText
This line instructs FART to simply replace the string oldText with newText.
fart -i -r "C:\APRIL2011\Scripts*.prm" march2011 APRIL2011
This line will instruct FART to recursively (-r) search for any file with the .prm extension in the \Scripts folder. The -i flag tells FART to ignore case-sensitivity when looking for the string.

How can I find & replace text across multiple files inside text boxes?

I would like to change text that repeats in .doc and .docx files.
I have this macro running at the moment:
Option Explicit
Public Sub BatchReplaceAll()
Dim FirstLoop As Boolean
Dim myFile As String
Dim PathToUse As String
Dim myDoc As Document
Dim Response As Long
PathToUse = "C:\Files\"
'Error handler to handle error generated whenever
'the FindReplace dialog is closed
On Error Resume Next
'Close all open documents before beginning
Documents.Close SaveChanges:=wdPromptToSaveChanges
'Boolean expression to test whether first loop
'This is used so that the FindReplace dialog will'only be displayed for the first document
FirstLoop = True
'Set the directory and type of file to batch process
myFile = Dir$(PathToUse & "*.docx")
While myFile <> ""
'Open document
Set myDoc = Documents.Open(PathToUse & myFile)
If FirstLoop Then
'Display dialog on first loop only
Dialogs(wdDialogEditReplace).Show
FirstLoop = False
Response = MsgBox("Do you want to process " & _
"the rest of the files in this folder", vbYesNo)
If Response = vbNo Then Exit Sub
Else
'On subsequent loops (files), a ReplaceAll is
'executed with the original settings and without
'displaying the dialog box again
With Dialogs(wdDialogEditReplace)
.ReplaceAll = 1
.Execute
End With
End If
'Close the modified document after saving changes
myDoc.Close SaveChanges:=wdSaveChanges
'Next file in folder
myFile = Dir$()
Wend
End Sub
It does work if replacing simple text.
How can I search and replace inside Text Boxes?
EDIT 1:
I changed this:
With Dialogs(wdDialogEditReplace)
.ReplaceAll = 1
.Execute
End With
To this:
With Dialogs(wdDialogEditReplace)
For Each myStoryRange In ActiveDocument.StoryRanges
With myStoryRange.Find
.Text = "ORIGINAL_TEXT"
.Replacement.Text = "MODIFIED_TEXT"
.Wrap = wdFindContinue
.Execute Replace:=wdReplaceAll
End With
Do While Not (myStoryRange.NextStoryRange Is Nothing)
Set myStoryRange = myStoryRange.NextStoryRange
With myStoryRange.Find
.Text = "ORIGINAL_TEXT"
.Replacement.Text = "MODIFIED_TEXT"
.Wrap = wdFindContinue
.Execute Replace:=wdReplaceAll
End With
Loop
Next myStoryRange
End With
The problems with this new code is that sometimes it skips text boxes and it is slow.
In VBA, Word "Text Boxes" are known as a TextFrame Object. This may be able to point you in the right direction:
For Each s In ActiveDocument.Shapes
With s.TextFrame
If .HasText Then MsgBox .TextRange.Text
End With
Next
I will update my answer when I get more information on implementing it into your example.