VB.Net Split word document into separate documents(pagebreak as delimiter) - vb.net

A word document has several pages. How to split this pages into separate documents using VB.Net ?
I wish to automate this process.
I used ms tutorial for basic learning: http://support.microsoft.com/kb/316383
But i do not know how to find page breaks in a document and move content of that page to separate document.

Solution:
Private Sub ParseWordDoc(ByVal Filename As String, ByVal NewFileName As String)
Dim WordApp As Microsoft.Office.Interop.Word.Application = New Microsoft.Office.Interop.Word.Application()
Dim BaseDoc As Microsoft.Office.Interop.Word.Document
Dim DestDoc As Microsoft.Office.Interop.Word.Document
Dim intNumberOfPages As Integer
Dim intNumberOfChars As String
Dim intPage As Integer
'Word Constants
Const wdGoToPage = 1
Const wdStory = 6
Const wdExtend = 1
Const wdCharacter = 1
'Show WordApp
WordApp.ShowMe()
'Load Base Document
BaseDoc = WordApp.Documents.Open(Filename)
BaseDoc.Repaginate()
'Loop through pages
intNumberOfPages = BaseDoc.BuiltInDocumentProperties("Number of Pages").value
intNumberOfChars = BaseDoc.BuiltInDocumentProperties("Number of Characters").value
For intPage = 1 To intNumberOfPages
If intPage = intNumberOfPages Then
WordApp.Selection.EndKey(wdStory)
Else
WordApp.Selection.GoTo(wdGoToPage, 2)
Application.DoEvents()
WordApp.Selection.MoveLeft(Unit:=wdCharacter, Count:=1)
End If
Application.DoEvents()
WordApp.Selection.HomeKey(wdStory, wdExtend)
Application.DoEvents()
WordApp.Selection.Copy()
Application.DoEvents()
'Create New Document
DestDoc = WordApp.Documents.Add
DestDoc.Activate()
WordApp.Selection.Paste()
DestDoc.SaveAs(NewFileName & intPage.ToString & ".doc")
DestDoc.Close()
DestDoc = Nothing
WordApp.Selection.GoTo(wdGoToPage, 2)
Application.DoEvents()
WordApp.Selection.HomeKey(wdStory, wdExtend)
Application.DoEvents()
WordApp.Selection.Delete()
Application.DoEvents()
Next
BaseDoc.Close(False)
BaseDoc = Nothing
WordApp.Quit()
WordApp = Nothing
End Sub
Credit goes to "Jay Taplin"

Related

VB.Net: Searching Word Document By Line

I'm attempting to read through a Word Document (800+ pages) line by line, and if that line contains certain text, in this case Section, simply print that line to console.
Public Sub doIt()
SearchFile("theFilePath", "Section")
Console.WriteLine("SHit")
End Sub
Public Sub SearchFile(ByVal strFilePath As String, ByVal strSearchTerm As String)
Dim sr As StreamReader = New StreamReader(strFilePath)
Dim strLine As String = String.Empty
For Each line As String In sr.ReadLine
If line.Contains(strSearchTerm) = True Then
Console.WriteLine(line)
End If
Next
End Sub
It runs, but it doesn't print out anything. I know the word "Section" is in there multiple times as well.
As already mentioned in the comments, you can't search a Word document the way you are currently doing. You need to create a Word.Application object as mentioned and then load the document so you can search it.
Here is a short example I wrote for you. Please note, you need to add reference to Microsoft.Office.Interop.Word and then you need to add the import statement to your class. For example Imports Microsoft.Office.Interop. Also this grabs each paragraph and then uses the range to look for the word you are searching for, if found it adds it to the list.
Note: Tried and tested - I had this in a button event, but put where you need it.
Try
Dim objWordApp As Word.Application = Nothing
Dim objDoc As Word.Document = Nothing
Dim TextToFind As String = YOURTEXT
Dim TextRange As Word.Range = Nothing
Dim StringLines As New List(Of String)
objWordApp = CreateObject("Word.Application")
If objWordApp IsNot Nothing Then
objWordApp.Visible = False
objDoc = objWordApp.Documents.Open(FileName, )
End If
If objDoc IsNot Nothing Then
'loop through each paragraph in the document and get the range
For Each p As Word.Paragraph In objDoc.Paragraphs
TextRange = p.Range
TextRange.Find.ClearFormatting()
If TextRange.Find.Execute(TextToFind, ) Then
StringLines.Add(p.Range.Text)
End If
Next
If StringLines.Count > 0 Then
MessageBox.Show(String.Join(Environment.NewLine, StringLines.ToArray()))
End If
objDoc.Close()
objWordApp.Quit()
End If
Catch ex As Exception
'publish your exception?
End Try
Update to use Sentences - this will go through each paragraph and grab each sentence, then we can see if the word exists... The benefit of this is it's quicker because we get each paragraph and then search the sentences. We have to get the paragraph in order to get the sentences...
Try
Dim objWordApp As Word.Application = Nothing
Dim objDoc As Word.Document = Nothing
Dim TextToFind As String = "YOUR TEXT TO FIND"
Dim TextRange As Word.Range = Nothing
Dim StringLines As New List(Of String)
Dim SentenceCount As Integer = 0
objWordApp = CreateObject("Word.Application")
If objWordApp IsNot Nothing Then
objWordApp.Visible = False
objDoc = objWordApp.Documents.Open(FileName, )
End If
If objDoc IsNot Nothing Then
For Each p As Word.Paragraph In objDoc.Paragraphs
TextRange = p.Range
TextRange.Find.ClearFormatting()
SentenceCount = TextRange.Sentences.Count
If SentenceCount > 0 Then
Do Until SentenceCount = 0
Dim sentence As String = TextRange.Sentences.Item(SentenceCount).Text
If sentence.Contains(TextToFind) Then
StringLines.Add(sentence.Trim())
End If
SentenceCount -= 1
Loop
End If
Next
If StringLines.Count > 0 Then
MessageBox.Show(String.Join(Environment.NewLine, StringLines.ToArray()))
End If
objDoc.Close()
objWordApp.Quit()
End If
Catch ex As Exception
'publish your exception?
End Try
Here's a sub that will print each line that the search-string is found on, rather than each paragraph. It will mimic the behavior of using the streamreader in your example to read/check each line:
'Add reference to and import Microsoft.Office.Interop.Word
Public Sub SearchFile(ByVal strFilePath As String, ByVal strSearchTerm As String)
Dim wordObject As Word.Application = New Word.Application
wordObject.Visible = False
Dim objWord As Word.Document = wordObject.Documents.Open(strFilePath)
objWord.Characters(1).Select()
Dim bolEOF As Boolean = False
Do Until bolEOF
wordObject.Selection.MoveEnd(WdUnits.wdLine, 1)
If wordObject.Selection.Text.ToUpper.Contains(strSearchTerm.ToUpper) Then
Console.WriteLine(wordObject.Selection.Text.Replace(vbCr, "").Replace(vbCr, "").Replace(vbCrLf, ""))
End If
wordObject.Selection.Collapse(WdCollapseDirection.wdCollapseEnd)
If wordObject.Selection.Bookmarks.Exists("\EndOfDoc") Then
bolEOF = True
End If
Loop
objWord.Close()
wordObject.Quit()
objWord = Nothing
wordObject = Nothing
Me.Close()
End Sub
It is a slightly modified vb.net implementation of nawfal's solution to parsing word document lines

VBA code to wait until file download from IE is complete

I'm trying to download an excel file from a webpage and so far I was able to open the webpage, navigate and click on save button but I need to access that excel file once it is downloaded. But sometimes it takes time to download depending on the size of the file. Is there any way we can check the window and see if the download is complete and only then to proceed to open the downloaded file. Below is the code.
Dim o As IUIAutomation
Dim e As IUIAutomationElement
Set o = New CUIAutomation
h = IE.hwnd
h = FindWindowEx(h, 0, "Frame Notification Bar", vbNullString)
If h = 0 Then
MsgBox "Not Found"
End If
Set e = o.ElementFromHandle(ByVal h)
Dim iCnd As IUIAutomationCondition
Set iCnd = o.CreatePropertyCondition(UIA_NamePropertyId, "Save")
Dim Button As IUIAutomationElement
Set Button = e.FindFirst(TreeScope_Subtree, iCnd)
Dim InvokePattern As IUIAutomationInvokePattern
Set InvokePattern = Button.GetCurrentPattern(UIA_InvokePatternId)
InvokePattern.Invoke
The above code will save the download file
This code uses a similar technique to what you have, started, but in addition it will wait for the "Open folder" button to appear in the 'Frame Notification Bar', which will indicate the download is finished. Then it looks in the User's Download folder for a 'very recently added' file and moves it to the place you select. The Code has some Debug.Print statements for error messages, which you may want to change/remove.
Hope this works for you....
Option Explicit
'--Given an IE browser object with the yellow 'Frame Notification Bar' to download file and a File Name to save the downloaded file to,
'--This Sub will use UIAutomation to click the Save button, then wiat for the Open button, then look in the User Downloads folder
'--to get the file just downloaded, then move it to the full file name path given in Filename, then close the 'Frame Notification Bar'
'--DownloadFromIEFrameNotificationBar will return the following codes:
'-- -1 - could not find the Close button in the 'Frame Notification Bar', but file saved OK
'-- 0 - succesfully downloaded and save file
'-- 1 - could not find the 'Frame Notification Bar'
'-- 2 - could not find the Save button in the 'Frame Notification Bar'
'-- 3 - could not find the 'Open folder' button in the 'Frame Notification Bar'
'-- 4 - could not find Very recent file (Last modified within 3 seconds) in the User Downloads folder
Public Function DownloadFromIEFrameNotificationBar(ByRef oBrowser As InternetExplorer, Filename As String) As Long
Dim UIAutomation As IUIAutomation
Dim eBrowser As IUIAutomationElement, eFNB As IUIAutomationElement, e As IUIAutomationElement
Dim InvokePattern As IUIAutomationInvokePattern
Dim DLfn As String
DownloadFromIEFrameNotificationBar = 0
Set UIAutomation = New CUIAutomation
Set eBrowser = UIAutomation.ElementFromHandle(ByVal oBrowser.hwnd)
'--Find 'Frame Notification Bar' element
Set eFNB = FindFromAllElementsWithClassName(eBrowser, "Frame Notification Bar", 10)
If eFNB Is Nothing Then
Debug.Print "'Frame Notification Bar' not found"
DownloadFromIEFrameNotificationBar = 1
Exit Function
End If
'--Find 'Save' button element
Set e = FindFromAllElementWithName(eFNB, "Save")
If e Is Nothing Then
Debug.Print "'Save' button not found"
DownloadFromIEFrameNotificationBar = 2
Exit Function
End If
'--'Click' the 'Save' button
Sleep 100
Set InvokePattern = e.GetCurrentPattern(UIA_InvokePatternId)
InvokePattern.Invoke
'--Wait for the file to download by waiting for the 'Open Folder' button to appear in the 'Frame Notification Bar'
Set e = FindFromAllElementWithName(eFNB, "Open folder", 15)
If e Is Nothing Then
Debug.Print "'Open Folder' button not found"
DownloadFromIEFrameNotificationBar = 3
Exit Function
End If
'--Done with download, now look for a file that was very recently (with in 3 seconds) added to the User's Downloads folder and get the file name of it
DLfn = FindVeryRecentFileInDownloads()
If DLfn <> "" Then
'--We got recent downloaded file, now Delete the file we are saving too (if it exists) so the Move file will be successful
DeleteFile Filename
MoveFile DLfn, Filename
Else
Debug.Print "Very recent file not found!"
DownloadFromIEFrameNotificationBar = 4
End If
'--Close Notification Bar window
Set e = FindFromAllElementWithName(eFNB, "Close")
If e Is Nothing Then
Debug.Print "'Close' button not found"
DownloadFromIEFrameNotificationBar = -1
Exit Function
End If
'--'Click' the 'Close' button
Sleep 100
Set InvokePattern = e.GetCurrentPattern(UIA_InvokePatternId)
InvokePattern.Invoke
End Function
Private Function FindFromAllElementWithName(e As IUIAutomationElement, n As String, Optional MaxTime As Long = 5) As IUIAutomationElement
Dim oUIAutomation As New CUIAutomation
Dim ea As IUIAutomationElementArray
Dim i As Long, timeout As Date
timeout = Now + TimeSerial(0, 0, MaxTime)
Do
Set ea = e.FindAll(TreeScope_Subtree, oUIAutomation.CreateTrueCondition)
For i = 0 To ea.length - 1
If ea.GetElement(i).CurrentName = n Then
Set FindFromAllElementWithName = ea.GetElement(i)
Exit Function
End If
Next
DoEvents
Sleep 20
Loop Until Now > timeout
Set FindFromAllElementWithName = Nothing
End Function
Private Function FindFromAllElementsWithClassName(e As IUIAutomationElement, c As String, Optional MaxTime As Long = 5) As IUIAutomationElement
Dim oUIAutomation As New CUIAutomation
Dim ea As IUIAutomationElementArray
Dim i As Long, timeout As Date
timeout = Now + TimeSerial(0, 0, MaxTime)
Do
Set ea = e.FindAll(TreeScope_Subtree, oUIAutomation.CreateTrueCondition)
For i = 0 To ea.length - 1
If ea.GetElement(i).CurrentClassName = c Then
Set FindFromAllElementsWithClassName = ea.GetElement(i)
Exit Function
End If
Next
DoEvents
Sleep 20
Loop Until Now > timeout
Set FindFromAllElementsWithClassName = Nothing
End Function
Private Function FindVeryRecentFileInDownloads(Optional MaxSecs As Long = 3) As String
Dim fso As New FileSystemObject, f As File, First As Boolean, lfd As Date, Folder As String
Dim WS As Object
On Error GoTo errReturn
Set WS = CreateObject("WScript.Shell")
'--Get Current user's Downloads folder path
Folder = WS.RegRead("HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders\{374DE290-123F-4565-9164-39C4925E467B}")
First = True
For Each f In fso.GetFolder(Folder).Files
If First Then
lfd = f.DateLastModified
FindVeryRecentFileInDownloads = f.Path
First = False
ElseIf f.DateLastModified > lfd Then
lfd = f.DateLastModified
FindVeryRecentFileInDownloads = f.Path
End If
Next
If First Then
FindVeryRecentFileInDownloads = "" '--no files
ElseIf MaxSecs <> -1 And DateDiff("s", lfd, Now) > MaxSecs Then
FindVeryRecentFileInDownloads = "" '--no very recent file found
End If
Exit Function
errReturn:
FindVeryRecentFileInDownloads = ""
End Function
Private Sub MoveFile(SourcePath As String, DestinationPath As String)
Dim fso As New FileSystemObject
CreateCompletePath Left(DestinationPath, InStrRev(DestinationPath, Application.PathSeparator))
fso.MoveFile SourcePath, DestinationPath
End Sub
Public Sub CreateCompletePath(sPath As String)
Dim iStart As Integer
Dim aDirs As Variant
Dim sCurDir As String
Dim i As Integer
sPath = Trim(sPath)
If sPath <> "" And Dir(sPath, vbDirectory) = vbNullString Then
aDirs = Split(sPath, Application.PathSeparator)
If Left(sPath, 2) = Application.PathSeparator & Application.PathSeparator Then
iStart = 3
Else
iStart = 1
End If
sCurDir = Left(sPath, InStr(iStart, sPath, Application.PathSeparator))
For i = iStart To UBound(aDirs)
If Trim(aDirs(i)) <> vbNullString Then
sCurDir = sCurDir & aDirs(i) & Application.PathSeparator
If Dir(sCurDir, vbDirectory) = vbNullString Then MkDir sCurDir
End If
Next i
End If
End Sub

Outlook VBA add hyperlink of chosen file in dialog

I'm trying to add the functionality in my Outlook (with VBA, I guess is easiest) to add a simple file dialog which takes the path of any files chosen and adds them to the email body as a hyperlink.
The idea of this is for network files to be shared amongst colleagues, instead of attaching them to the email, but just as easy to do.
This is my code so far, I can't even get the dialog to open, and I've had a good look at trying to get COMDLG32.ocx, so far I can't seem to make anything work.
Sub Main2()
Dim CDLG As Object
Set CDLG = CreateObject("MSComDlg.CommonDialog")
With CDLG
.DialogTitle = "Get me a File!"
.Filter = _
"Documents|*.doc|Templates|*.dot|Text Files|*.txt"
.ShowOpen
MsgBox .FileName
End With
Set CDLG = Nothing
End Sub
Thanks in advance, hopefully someone can show me how this is done!
Just for those who need it; OS Windows 10, Office 2010 H&B (yes, I know it's out of date :))
There seems to be no direct way to open a FileDialog in Outlook 2010 VBA.
The following macro (inspired by a related post) makes use of Excel to circumvent this:
Public Function promptFileName(title As String, filter As String) As String
' requires project reference to "Microsoft Excel 14.0 Object Library"
Dim xlObj As Excel.Application
Dim fd As Office.FileDialog
Dim name As String
Dim vItem As Variant
Dim filterArray() As String
Dim i As Integer
Set xlObj = New Excel.Application
xlObj.Visible = False
Set fd = xlObj.Application.FileDialog(msoFileDialogOpen)
name = ""
With fd
.title = title
.ButtonName = "Ok"
.Filters.Clear
filterArray = Split(filter, "|")
For i = LBound(filterArray) To UBound(filterArray) - 1 Step 2
.Filters.Add filterArray(i), filterArray(i + 1), 1 + i \ 2
Next i
If .Show = -1 Then
For Each vItem In .SelectedItems
name = vItem
Exit For
Next
End If
End With
xlObj.Quit
Set xlObj = Nothing
promptFileName = name
End Function
Private Sub testPromptFile
Dim name as String
name = promptFileName("a test", "Text Files (*.txt)|*.txt|All Files (*.*)|*.*")
MsgBox name
End Sub
Outlook 2013 and beyond provide an Office.FileDialog class for this purpose.
You can press a button with Outlook VBA.
Sub ExecuteMso_strId()
Dim objItem As Object
Dim strId As String
' Text appears when hovering over icon
' when adding buttons to a Quick Access toolbar or a ribbon
strId = "HyperlinkInsert"
On Error Resume Next
Set objItem = ActiveInspector.currentItem
On Error GoTo 0
If Not objItem Is Nothing Then
ActiveInspector.CommandBars.ExecuteMso (strId)
Else
ActiveExplorer.CommandBars.ExecuteMso (strId)
End If
End Sub
With this you do not have access to the parameters as with Excel.

Word Automation with .NET - Single Landscape pages

I'm trying to set a single page to be landscape, however every method I try changes the entire document rather than the section\paragraph I set PageSetup on. Here's an example, the first page should be portrait and the second should be landscape:
Dim wrdApp As Word.Application
Dim wrdDoc As Word._Document
Public Sub test()
Dim wrdSelection As Word.Selection
Dim wrdDataDoc As Word._Document
Dim sText As String
wrdApp = CreateObject("Word.Application")
wrdApp.Visible = True
wrdDataDoc = wrdApp.Documents.Open("C:\Temp\Doc1.docx")
wrdDataDoc.PageSetup.Orientation = WdOrientation.wdOrientPortrait
Dim oPara1 As Paragraph
sText = "Test Report Title"
oPara1 = wrdDataDoc.Content.Paragraphs.Add
oPara1.Range.Text = sText
oPara1.Range.ParagraphFormat.Alignment = Microsoft.Office.Interop.Word.WdParagraphAlignment.wdAlignParagraphCenter
oPara1.Range.InsertParagraphAfter()
Dim para As Word.Paragraph = wrdDataDoc.Paragraphs.Add()
para.Range.InsertBreak()
wrdApp.Selection.GoTo(Word.WdGoToItem.wdGoToLine, Word.WdGoToDirection.wdGoToLast)
wrdDataDoc.Sections(1).PageSetup.Orientation = WdOrientation.wdOrientLandscape
wrdSelection = wrdApp.Selection()
wrdDataDoc.Tables.Add(wrdSelection.Range, NumRows:=9, NumColumns:=4)
With wrdDataDoc.Tables.Item(1)
'Code for table here
End With
End Sub
You need to insert a page break, try this:
oPara1.Range.InsertBreak Type:=wdSectionBreakNextPage
wrdDataDoc.Sections(wrdDataDoc.Sections.Count).PageSetup.Orientation = wdOrientLandscape

Closing Word app, vb.net

i am trying to search within word documents, using vb.net it worked but i cant seem to close the files after searching them, here is the code i use
how to close the word apps after being searched ?
Dim oWord As Word.Application = Nothing
Dim oDocs As Word.Documents = Nothing
Dim oDoc As Word.Document = Nothing
Dim folderDlg As New FolderBrowserDialog
folderDlg.ShowNewFolderButton = True
If (folderDlg.ShowDialog() = DialogResult.OK) Then
Dim root As Environment.SpecialFolder = folderDlg.RootFolder
End If
Dim l_Dir As IO.DirectoryInfo
Dim fldpath As String = folderDlg.SelectedPath
If IO.Directory.Exists(fldpath) Then
l_Dir = New IO.DirectoryInfo(fldpath)
For Each l_File In Directory.GetFiles(fldpath, "*.docx")
Dim searchFor As String = TextBox1.Text
oWord = New Word.Application()
oWord.Visible = False
oDocs = oWord.Documents
oDoc = oDocs.Open(l_File, False)
oDoc.Content.Find.ClearFormatting()
Dim findText As String = searchFor
Try
If oDoc.Content.Find.Execute(findText) = True Then
MessageBox.Show("OK.")
oWord.NormalTemplate.Saved = True
oWord.ActiveDocument.Close(False)
oDoc.Close()
oWord.Quit()
If Not oDoc Is Nothing Then
Marshal.FinalReleaseComObject(oDoc)
oDoc = Nothing
End If
If Not oDocs Is Nothing Then
Marshal.FinalReleaseComObject(oDocs)
oDocs = Nothing
End If
If Not oWord Is Nothing Then
Marshal.FinalReleaseComObject(oWord)
oWord = Nothing
End If
Else
MessageBox.Show("No.")
End If
Catch ex As Exception
End Try
ComboBox1.Items.Add(l_File)
Next
End If
First thing you should bear in mind is that "releasing the objects" in Word is not as difficult as in Excel and thus you are (unnecessarily) over-complicating things. In any case, you should intend to not over-declare variables (what is the exact point of oDocs?). And, lastly, you should always perform a step-by-step execution when things go wrong to find out what might be happening (you are applying your "objects release" only for "OK" cases, not in any situation: when the result is "No", the objects would have to be released too).
Here you have a corrected code accounting for all the aforementioned issues:
Dim oWord As Word.Application = Nothing
Dim oDoc As Word.Document = Nothing
Dim folderDlg As New FolderBrowserDialog
folderDlg.ShowNewFolderButton = True
If (folderDlg.ShowDialog() = DialogResult.OK) Then
Dim root As Environment.SpecialFolder = folderDlg.RootFolder
End If
Dim l_Dir As IO.DirectoryInfo
Dim fldpath As String = folderDlg.SelectedPath
If IO.Directory.Exists(fldpath) Then
l_Dir = New IO.DirectoryInfo(fldpath)
For Each l_File In Directory.GetFiles(fldpath, "*.docx")
Dim searchFor As String = TextBox1.Text
oWord = New Word.Application()
oWord.Visible = False
Try
oDoc = oWord.Documents.Open(l_File, False)
oDoc.Content.Find.ClearFormatting()
Dim findText As String = searchFor
Try
If oDoc.Content.Find.Execute(findText) = True Then
MessageBox.Show("OK.")
Else
MessageBox.Show("No.")
End If
Catch ex As Exception
End Try
oWord.NormalTemplate.Saved = True
ComboBox1.Items.Add(l_File)
Catch ex As Exception
End Try
oDoc = Nothing
oWord.Quit()
oWord = Nothing
Next
End If
NOTE: note that, when iterating through all the Word files in a folder (and, in general, ones from any MS Office program), you can find temporary copies (starting with "~$...") which might trigger an error when being opened (and thus not allow the object-releasing part to come into picture); also, in general, when opening files something might go wrong; this is what explains the new try...catch I added and why I put the releasing part after it.
I don't know Vb.net, but in F# I used
System.Runtime.InteropServices.Marshal.ReleaseComObject xlApp |> ignore
after ".Quit()", to end the process, but you must search, how to give the name of your word application at the place of xlApp
I will place this after the "End If" at the end.
I hope it will be helpfull for you