split pdf based on the text using vba acrobat api - vba

I am trying to split pdf, based on the pages where it finds ".pdf" however when I try to save the pdf with a dynamic string variable, it do not save the pdf but when I write hardcode file path it output the pdf. do not know what is going on here.
the following code is not finished yet I am stuck in creating new pdf with deleted pages:
Function Extract_PDF()
Dim aApp As Acrobat.CAcroApp
Dim av_Doc As Acrobat.CAcroAVDoc
Dim pdf_Doc As Acrobat.CAcroPDDoc '
Dim newPDFdoc As Acrobat.CAcroPDDoc
Dim Sel_Text As Acrobat.CAcroPDTextSelect
Dim i As Long, j As Long
Dim pageNum, Content
Dim pageContent As Acrobat.CAcroHiliteList
Dim found As Boolean
Dim foundPage As Integer
Dim PDF_Path As String
Dim pdfName As String
Dim folerPath As String
Dim FileExplorer As FileDialog
Set FileExplorer = Application.FileDialog(msoFileDialogFilePicker)
With FileExplorer
.AllowMultiSelect = False
.InitialFileName = ActiveDocument.Path
.Filters.Clear
.Filters.Add "PDF File", "*.pdf"
If .Show = -1 Then
PDF_Path = .SelectedItems.Item(1)
Else
PagesLB = "Catch me Next Time ;)"
PDF_Path = ""
Exit Function
End If
End With
Set aApp = CreateObject("AcroExch.App")
Set av_Doc = CreateObject("AcroExch.AVDoc")
If av_Doc.Open(PDF_Path, vbNull) <> True Then Exit Function
While av_Doc Is Nothing
Set av_Doc = aApp.GetActiveDoc
Wend
av_Doc.BringToFront
aApp.Show
Set pdf_Doc = av_Doc.GetPDDoc
For i = pdf_Doc.GetNumPages - 1 To 0 Step -1
Set pageNum = pdf_Doc.AcquirePage(i)
Set pageContent = CreateObject("AcroExch.HiliteList")
If pageContent.Add(0, 9000) <> True Then Exit Function
Set Sel_Text = pageNum.CreatePageHilite(pageContent)
Content = ""
found = False
For j = 0 To Sel_Text.GetNumText - 1
Content = Content & Sel_Text.GetText(j)
If InStr(1, Content, ".pdf") > 0 Then
found = True
foundPage = i
pdfName = Content
Exit For
End If
Next j
If found Then
PDF_Path = Left(PDF_Path, InStrRev(PDF_Path, "\")) & ValidWBName(pdfName)
Set newPDFdoc = CreateObject("AcroExch.PDDoc")
Set newPDFdoc = av_Doc.GetPDDoc
If newPDFdoc.DeletePages(0, i - 1) = False Then
Debug.Print "Failed"
Else
Debug.Print "done"
End If
If newPDFdoc.Save(PDSaveFull, PDF_Path) = False Then
Debug.Print "Failed to save pdf "
Else
Debug.Print "Saved"
End If
newPDFdoc.Close
End If
Next i
av_Doc.Close False
aApp.Exit
Set av_Doc = Nothing
Set pdf_Doc = Nothing
Set aApp = Nothing
End Function
ValidWBName:
Function ValidWBName(agr As String) As String
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
With RegEx
.Pattern = "[\\/:\*\?""<>\|]"
.Global = True
ValidWBName = .Replace(agr, "")
End With
End Function
in above function when it finds the word pdf it try to create a new instance of pdf and remove previous pages.
If found Then
PDF_Path = Left(PDF_Path, InStrRev(PDF_Path, "\")) & ValidWBName(pdfName)
Set newPDFdoc = CreateObject("AcroExch.PDDoc")
Set newPDFdoc = av_Doc.GetPDDoc
If newPDFdoc.DeletePages(0, i - 1) = False Then
Debug.Print "Failed"
Else
Debug.Print "done"
End If
If newPDFdoc.Save(PDSaveFull, PDF_Path) = False Then
Debug.Print "Failed to save pdf "
Else
Debug.Print "Saved"
End If
newPDFdoc.Close
End If
this line "Failed to save pdf"
If newPDFdoc.Save(PDSaveFull, PDF_Path) = False Then
but when I write hardcode path it create the pdf
If newPDFdoc.Save(PDSaveFull, "C:\Users\MBA\Desktop\PDF Project 2\Murdoch_Michael__Hilary_PIA_19.pdf") = False Then

the culprit HAD to be in ValidWBName() function, which didn't handle all possible not allowed chars for a valid file name
since it came out vbCr char was one of them, you could change it as follows:
Function ValidWBName(agr As String) As String
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
With RegEx
.Pattern = "[\\/:\*\?""<>\|" & Chr(13) & "]" ' <-- added vbCr
.Global = True
ValidWBName = .Replace(agr, "")
End With
End Function

Related

split a pdf based on the text using vba acrobat api

I am trying to split pdf, based on the pages where it finds ".pdf". some of pages in the PDF contains first line with with file name "abc1.pdf" or "abc2.pdf" and so on.
The Function Extract_PDF() need to open one instance of selected PDF, go to the last page, get its text to check if it contains ".pdf". if yes then make another instance of same PDF and remove its previous pages.
For the first instance only remove the page that contains ".pdf" till the last page. save both instances. do it in loop until split all the letters.
In nutshell, This code creates a new instance of the PDF using the pdf_Doc.Create method (here it get an error type mismatch), and copies the pages from the original pdf to the new pdf instance up to the page where ".pdf" is found. Then it saves this new pdf instance with a new name.
It also removes the pages from the original pdf starting from the page where ".pdf" is found till the last page. Finally, it saves the modified original pdf.
I don't know how to handle the .Create method, let me know where it get wrong? how I can optimize this code.
Function Extract_PDF()
Dim aApp As Acrobat.CAcroApp
Dim av_Doc As Acrobat.CAcroAVDoc
Dim pdf_Doc As Acrobat.CAcroPDDoc '
Dim newPDFdoc As Acrobat.CAcroPDDoc ' new variable for the new pdf instance
Dim Sel_Text As Acrobat.CAcroPDTextSelect
Dim i As Long, j As Long
Dim pageNum, Content
Dim pageContent As Acrobat.CAcroHiliteList
Dim found As Boolean
Dim foundPage As Integer
Dim PDF_Path As String
Dim PDF_Paths As String
Dim newPDF_Paths As String ' new variable for the new pdf path
Dim pdfName As String
Dim folerPath As String
Dim FileExplorer As FileDialog
Set FileExplorer = Application.FileDialog(msoFileDialogFilePicker)
With FileExplorer
.AllowMultiSelect = False
.InitialFileName = ActiveDocument.Path
.Filters.Clear
.Filters.Add "PDF File", "*.pdf"
If .Show = -1 Then
PDF_Path = .SelectedItems.Item(1)
Else
PagesLB = "Catch me Next Time ;)"
PDF_Path = ""
Exit Function
End If
End With
Set aApp = CreateObject("AcroExch.App")
Set av_Doc = CreateObject("AcroExch.AVDoc")
' Error handling
'On Error GoTo CleanUp
If av_Doc.Open(PDF_Path, vbNull) <> True Then Exit Function
Set pdf_Doc = av_Doc.GetPDDoc
totalPages = pdf_Doc.GetNumPages
For i = totalPages - 1 To 0 Step -1
Set pageNum = pdf_Doc.AcquirePage(i)
Set pageContent = CreateObject("AcroExch.HiliteList")
If pageContent.Add(0, 9000) <> True Then Exit Function
Set Sel_Text = pageNum.CreatePageHilite(pageContent)
Content = ""
found = False
For j = 0 To Sel_Text.GetNumText - 1
Content = Content & Sel_Text.GetText(j)
If InStr(1, Content, ".pdf") > 0 Then
found = True
foundPage = i
pdfName = Content
Exit For
End If
Next j
If found Then
PDF_Paths = Left(PDF_Path, InStrRev(PDF_Path, "\")) & ValidWBName(pdfName)
newPDF_Paths = Left(PDF_Path, InStrRev(PDF_Path, "\")) & ValidWBName(pdfName)
Set pdf_Doc = av_Doc.GetPDDoc
' create new instance of pdf
Set newPDFdoc = pdf_Doc.Create 'ERROR LINE "Type Mismatch"
' copy pages from the original pdf to the new pdf
If newPDFdoc.InsertPages(0, pdf_Doc, 0, foundPage - 1, True) = False Then
Debug.Print "Failed"
End If
' save new pdf
If newPDFdoc.Save(PDSaveFull, newPDF_Paths) = False Then
Debug.Print "Failed to save new pdf "
Else
Debug.Print "New pdf saved"
End If
' remove pages from the original pdf
If pdf_Doc.DeletePages(foundPage, totalPages - foundPage) = False Then
Debug.Print "Failed to remove pages from original pdf"
Else
Debug.Print "Pages removed from original pdf"
End If
'save the original pdf
If pdf_Doc.Save(PDSaveFull, PDF_Paths) = False Then
Debug.Print "Failed to save original pdf "
Else
Debug.Print "Original pdf saved"
End If
End If
Next i
CleanUp:
' code to handle errors and clean up resources
' for example, close the pdf document and release the objects
av_Doc.Close False
aApp.Exit
Set av_Doc = Nothing
Set pdf_Doc = Nothing
Set aApp = Nothing
End Function
Function ValidWBName(agr As String) As String
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
With RegEx
.Pattern = "[\\/:\*\?""<>\|" & Chr(13) & Chr(10) & Chr(11) & "]"
.Global = True
ValidWBName = .Replace(agr, "")
End With
End Function

Outlook - VBA set signature to new Email ... so the signature can be changed via menu

I wrote a script where I add a signature from an htm file in the appData ... signature folder to a newly opened email.
My question is - how do i modify this VBA script to add that signature in a way so Outlook knows its a signature and the signature might be changed by a user via gui.
I assume it may have something to do with setting a "_MailAutoSig" bookmark, is that right?
Script looks like this and works so far:
Dim WithEvents m_objMail As Outlook.MailItem
Dim LODGIT_SUBJECT_IDENTIFIERS() As String
Private Sub Application_ItemLoad(ByVal Item As Object)
'MsgBox "Application_ItemLoad"
Select Case Item.Class
Case olMail
Set m_objMail = Item
End Select
End Sub
Private Sub m_objMail_Open(Cancel As Boolean)
'string array containing lodgit email subject identifiers (beginning string!!! of email subject)
LODGIT_SUBJECT_IDENTIFIERS = Split("Angebot von Bödele Alpenhotel,Angebot von,bestätigt Ihre Reservierung,Rechnung Nr.,Stornogutschrift für die Rechnung,Ausstehende Zahlung", ",")
Dim Application As Object
Dim oOutApp As Object, oOutMail As Object
Dim strbody As String, FixedHtmlBody As String
Dim Ret
Set Application = CreateObject("Outlook.Application")
'Change only Mysig.htm to the name of your signature
' C:\Users\nicole\AppData\Roaming\Microsoft\Signatures
Ret = Environ("appdata") & _
"\Microsoft\Signatures\AH Andrea kurz.htm"
If Ret = False Then Exit Sub
'~~> Use the function to fix image paths in the htm file
FixedHtmlBody = FixHtmlBody(Ret)
'CHECK FOR LODGIT IDENTIFIER
If myInStr(m_objMail.Subject, LODGIT_SUBJECT_IDENTIFIERS()) Then
Debug.Print "E-Mail as from Lodgit identified"
Dim str As String
Dim a As Object
str = Replace(m_objMail.Body, vbCrLf, "<br>")
str = Replace(str, vbNewLine, "<br>")
m_objMail.HTMLBody = "<html><body><span style='font-size:11.0pt;font-family:""Times New Roman"" '>" & str & "</span>" & FixedHtmlBody & "</body></html>"
End If
End Sub
'~~> Function to fix image paths in Signature .htm Files
Function FixHtmlBody(r As Variant) As String
Dim FullPath As String, filename As String
Dim FilenameWithoutExtn As String
Dim foldername As String
Dim MyData As String
'~~> Read the html file as text file in a string variable
Open r For Binary As #1
MyData = Space$(LOF(1))
Get #1, , MyData
Close #1
'~~> Get File Name from path
filename = GetFilenameFromPath(r)
'~~> Get File Name without extension
FilenameWithoutExtn = Left(filename, (InStrRev(filename, ".", -1, vbTextCompare) - 1))
'~~> Get the foldername where the images are stored
foldername = FilenameWithoutExtn & "-Dateien"
'~~> Full Path of Folder
FullPath = Left(r, InStrRev(r, "\")) & foldername
'~~> To cater for spaces in signature file name
'FullPath = Replace(FullPath, " ", "%20")
'~~> Replace incomplete path with full Path
FixHtmlBody = Replace(MyData, "AH%20Andrea%20kurz-Dateien", FullPath)
'FixHtmlBody = Replace(MyData, foldername, FullPath)
End Function
'~~> Gets File Name from path
Public Function GetFilenameFromPath(ByVal strPath As String) As String
If Right$(strPath, 1) <> "\" And Len(strPath) > 0 Then _
GetFilenameFromPath = GetFilenameFromPath(Left$(strPath, Len(strPath) - 1)) + Right$(strPath, 1)
End Function
'check if str contains on of the elements of a str array
Public Function myInStr(myString As String, a() As String) As Boolean
For Each elem In a
If InStr(1, myString, elem, vbTextCompare) <> 0 Then
myInStr = True
Exit Function
End If
Next
myInStr = False
End Function
Outlook looks for the "_MailAutoSig" bookmark. This needs to be done with Word Object Model, not by setting the HTMLBody property. Something along the lines:
wdStory = 6
wdMove = 0
Set objBkm = Nothing
Set objDoc = Inspector.WordEditor
Set objSel = objDoc.Application.Selection
'remember the cursor position
set cursorRange = objDoc.Range
cursorRange.Start = objSel.Start
cursorRange.End = objSel.End
If objDoc.Bookmarks.Exists("_MailAutoSig") Then
'replace old signature
Debug.Print "old signature found"
set objBkm = objDoc.Bookmarks("_MailAutoSig")
objBkm.Select
objDoc.Windows(1).Selection.Delete
ElseIf objDoc.Bookmarks.Exists("_MailOriginal") Then
' is there the original email? (_MailOriginal)
set objBkm = objDoc.Bookmarks("_MailOriginal")
objSel.Start = objBkm.Start-2 'give room for the line break before. It includes the line
objSel.End = objBkm.Start-2
Else
'insert at the end of the email
objSel.EndOf wdStory, wdMove
End If
'start bookmark
set bkmStart = objDoc.Bookmarks.Add("_tempStart", objSel.Range)
'end bookmark
set bkmEnd = objDoc.Bookmarks.Add("_tempEnd", objSel.Range)
bkmEnd.End = bkmEnd.End + 1
bkmEnd.Start = bkmEnd.Start + 1
objSel.Text = " "
set objBkm = objDoc.Bookmarks.Add("_MailAutoSig", bkmStart.Range)
objBkm.Range.insertFile "c:\Users\<user>\AppData\Roaming\Microsoft\Signatures\test.htm", , false, false, false
objBkm.Range.InsertParagraphBefore
objBkm.End = bkmEnd.Start - 1 'since we added 1 above for bkmEnd
objSel.Start = cursorRange.Start
objSel.End = cursorRange.End
bkmStart.Delete
bkmEnd.Delete

Inserting images from a network folder - images are inserted not in the good order

I have a macro that, from a chosen folder, input and resize all images inside a word document.
When we choose a folder from a laptop's drive, the macro works well.
However, when we choose a folder stored inside a network (company's network), images aren't insert in the good order.
I would like to display images exactly how they are stored inside the folder: by name order.
As any one have an idea how to fix this problem?
Here is my macro:
Attribute VB_Name = "InsertImagesAnnexes"
Option Explicit
Sub InsertImagesAnnexes()
Dim FolderPath, objFSO, Folder, ImagePath, image, countphoto As Integer
Const END_OF_STORY = 6
Const MOVE_SELECTION = 0
countphoto = 0
FolderPath = Select_Folder_From_Prompt
If InStr(FolderPath, "EMPTY") = 0 Then
Set objFSO = CreateObject("Scripting.Filesystemobject")
Set Folder = objFSO.GetFolder(FolderPath)
For Each image In Folder.Files
ImagePath = image.Path
If CheckiImageExtension(ImagePath) = True Then
Selection.ParagraphFormat.Alignment = wdAlignParagraphCenter
Selection.Font.Name = "Times New Roman"
Selection.Font.Size = "12"
If countphoto = 0 Then
Application.Selection.InsertBreak 'Insert a pagebreak
Selection.TypeText Text:="ANNEXES PHOTOGRAPHIQUES" & Chr(11) & Chr(11)
Selection.ParagraphFormat.Alignment = wdAlignParagraphCenter
Else
Application.Selection.EndKey END_OF_STORY, MOVE_SELECTION
Application.Selection.InlineShapes.AddPicture (ImagePath)
Selection.TypeText Text:=Chr(11) & Chr(11) & "PHOTO N°" & countphoto & Chr(11) & Chr(11) & Chr(11)
End If
countphoto = countphoto + 1
End If
Next
End If
End Sub
Function Select_Folder_From_Prompt() As String
Dim fd, bMultiSelect, CONST_MODEL_DIRECTORY
Set fd = Application.FileDialog(msoFileDialogFolderPicker)
With fd
.Title = "Select a folder"
.AllowMultiSelect = bMultiSelect
.InitialFileName = CONST_MODEL_DIRECTORY
.Filters.Clear
'Use the Show method to display the File Picker dialog box and return the user's action.
'The user pressed the action button.
If .Show = -1 Then
Select_Folder_From_Prompt = .SelectedItems(1) & "\"
Else
Select_Folder_From_Prompt = "EMPTY"
End If
End With
End Function
Function CheckiImageExtension(ImagePath)
Dim varArray ' An array contains iamge file extensions.
Dim varEach ' Each iamge file extension.
Dim blnIsPptFile ' Whether the file extension is image file extension.
Dim objFSO, file, FileExtension
Set objFSO = CreateObject("Scripting.Filesystemobject")
Set file = objFSO.GetFile(ImagePath)
FileExtension = file.Name
blnIsPptFile = False
If FileExtension <> "" Then
varArray = Array(".emf", ".wmf", ".jpg", ".jpeg", ".jfif", ".png", ".jpe", ".bmp", ".dib", ".rle", ".gif", ".emz", ".wmz", ".pcz", ".tif", ".tiff", ".eps", ".pct", ".pict", ".wpg")
For Each varEach In varArray
If InStrRev(UCase(FileExtension), UCase(varEach)) <> 0 Then
blnIsPptFile = True
Exit For
End If
Next
End If
CheckiImageExtension = blnIsPptFile
Set objFSO = Nothing
Set file = Nothing
End Function

Add hyperlinks to linked images

I'm trying to add hyperlinks to images, which were added via IncludePicture fields.
For example, this is an image:
{ IncludePicture "C:\\Test\\Image 1.png" \d }
And so, it should be added hyperlink to it:
C:\\Test\\Image 1.png
After that, I can click on my image in document with mouse, and it will be opened in file manager.
Here is the code. For some reason, it doesn't properly work. How it should be fixed?
Sub AddHyperlinksToImages()
On Error Resume Next
Application.ScreenUpdating = False
Dim iShp As InlineShape
For Each iShp In ActiveDocument.InlineShapes
iShp.Hyperlink.Address = iShp.LinkFormat.SourceFullName 'Doesn't work
'Just for testing
'fullPath = iShp.LinkFormat.SourceFullName
'MsgBox fullPath
Next
Application.ScreenUpdating = True
End Sub
Please try this code.
Sub AddHyperlinksToImages()
' 22 Sep 2017
Dim Fld As Field
Dim FilePath As String
Dim Tmp As String
Dim i As Integer
Application.ScreenUpdating = False
ActiveDocument.Fields.Update
For Each Fld In ActiveDocument.Fields
With Fld
If InStr(1, Trim(.Code), "includepicture", vbTextCompare) = 1 Then
If .InlineShape.Hyperlink Is Nothing Then
i = InStr(.Code, Chr(34))
If i Then
FilePath = Replace(Mid(.Code, i + 1), "\\", "\")
i = InStr(FilePath, "\*")
If i Then FilePath = Left(FilePath, i - 1)
Do While Len(FilePath) > 1
i = Asc(Right(FilePath, 1))
FilePath = Left(FilePath, Len(FilePath) - 1)
If i = 34 Then Exit Do
Loop
If i > 1 Then ActiveDocument.Hyperlinks.Add .InlineShape, FilePath
End If
End If
End If
End With
Next Fld
Application.ScreenUpdating = True
End Sub

Search for a file on Sharepoint and return filename

I had been using the below code to check if a file exists on a SharePoint site:
Function URLExists(url As String) As Boolean
Dim oXHTTP As Object
Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
If Not UCase(url) Like "HTTP:*" Then
url = "http://" & url
End If
On Error GoTo haveError
oXHTTP.Open "HEAD", url, False
oXHTTP.send
URLExists = IIf(oXHTTP.Status = 200, True, False)
Exit Function
haveError:
URLExists = False End Function
The problem now is that the file that I used to download now has the following format:
old url = teams.sharepoint.xyz.com\Daily Report - DDMMYYYY.XLS
new url = teams.sharepoint.xyz.com\Daily Report - v2 YYYY-MM-DD-HH-MM-SS.XLS.XLS
I want to be able to grab the latest file from the server, and I am not sure how to go about doing that using wildcards. It used to work fine with the old url as I could easily format the date, but now that the new url has the Time added to it, I can't figure out a way to search the SharePoint site using, maybe, a wildcard search.
I think I have it worked out now:
Function GetFullFileName(strfilepath As String, strFileNamePartial As String) As String
Dim objFS As Variant
Dim objFolder As Variant
Dim objFile As Variant
Dim intLengthOfPartialName As Integer
Dim strfilenamefull As String
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFS.GetFolder(strfilepath)
'work out how long the partial file name is intLengthOfPartialName = Len(strFileNamePartial)
'Instead of specifying the starting characters of the file you can directly loop through all files in the folder
For Each objFile In objFolder.Files
'Test to see if the file matches the partial file name
If Left(objFile.Name, intLengthOfPartialName) = strFileNamePartial Then
'get the full file name
strfilenamefull = objFile.Name
Exit For
Else
End If
Next objFile
Set objFolder = Nothing
Set objFS = Nothing
'Return the full file name as the function's value
GetFullFileName = strfilenamefull
End Function
Function URLExists(url As String) As Boolean
Dim oXHTTP As Object
Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
If Not UCase(url) Like "HTTP:*" Then
url = "http://" & url
End If
On Error GoTo haveError
oXHTTP.Open "HEAD", url, False
oXHTTP.send
URLExists = IIf(oXHTTP.Status = 200, True, False)
Exit Function
haveError:
URLExists = False
End Function
Then use the following code in the main function:
PrtFileName = "\\sharepointsite.com\path to folder"
PrtFileName2 = "sharepointsite.com/path to folder"
' ---------------------------------------
' Check source file exists using a loop
' to keep going back until a valid file
' is found within last 7 days.
' ---------------------------------------
Dim fileExists, a As Boolean
fileExists = False
Dim dateOffset As Integer
dateOffset = 0
Do While ((fileExists = False) And (dateOffset < 14))
FileDate = "Daily Report - Remedy v2 " + Format(Now() - dateOffset, "YYYY-MM-DD")
Filename1 = GetFullFileName(PrtFileName, FileDate)
MsgBox PrtFileName + "\" + Filename1
a = URLExists(PrtFileName2 + "/" + Filename1)
If a = True Then
'FileDate = Now()
Filename = PrtFileName + "\" + Filename1
MsgBox Filename
fileExists = True
Else
a = False
fileExists = False
dateOffset = dateOffset + 1
End If
Loop
Works like a charm. To open the excel workbook I use the following command:
Dim wb2 As Workbook
Set wb2 = Workbooks.Open(Filename)
If someone can make it more efficient then even better but it does the job for me for the moment.