Navigating to a specific place in an XPS document viewer (VB.NET) - vb.net

I've spent two days trying to find a solution, with no luck.
I have an XPS file which I am displaying in a DocumentViewer, and I want to use a variable that I have to navigate to a certain place in the document within the viewer.
Firstly, how would I set some sort of identifier within the XPS document? At the moment I have bookmarks set in it from Word, which was then converted to XPS. I can see them in the FixedDoc file, so I know they survive the conversion, but I don't know how to then utilise them.
At the moment, I have:
Dim _XpsPackage As XpsDocument
_XpsPackage = New XpsDocument(xpsFilename, IO.FileAccess.Read)
docViewer.Document = _XpsPackage.GetFixedDocumentSequence
Dim _CurrentDocSection() As String = Split(_CurrentWindow.Title, ".", 2)
Dim docSeq As IXpsFixedDocumentSequenceReader = _XpsPackage.FixedDocumentSequenceReader
Dim doc As IXpsFixedDocumentReader = docSeq.FixedDocuments(0)
Dim a = doc.Uri
From here, I want to use _CurrentDocSection(1) as my identifier and navigate to where that bookmark is within the document, but I cannot seem to find it.
Thanks

Related

How to disable autocomplete in vba?

I'm creating a program for exporting several Excel sheets to PDF with watermarks, form fields, etc. Everything works smoothly right now, but the final PDF is quite large. So I was looking for the best way to make it smaller, and I found that the best result comes from simply opening the PDF in Adobe Acrobat and then printing it with the "Adobe PDF" printer. This way I reduce the file size to 1/6 of the original.
So I'm trying to do this via VBA code, and it looks like pretty straightforward code using the JS object.
sPath = "some path"
sPathFinal = "some other path"
Dim AcroApp As AcroAVDoc: Set AcroApp = CreateObject("AcroExch.AVDoc")
Dim Document As AcroPDDoc
Dim JSO, pp
AcroApp.Open sPath, ""
Set Document = AcroApp.GetPDDoc()
Set JSO = Document.GetJSObject
Set pp = JSO.getPrintParams
pp.printerName = "Adobe PDF"
pp.Filename = sPathFinal
JSO.Print (pp)
The problem is in the very last line, as it should be
JSO.print(pp) - "print" with a lowercase "p"
But every time I step away from the line, it gets autocorrected to an uppercase "P". I tried turning everything off in Tools -> Options -> Editor -> Code Settings, as well as in other places in the Options dialog, but have had no luck so far.
Is there a way to prevent this autocorrect?
(Also, I'm not a native English speaker, so there is quite a big chance that it is called something different :)
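The short answer is no, you can't, because the VBA editor auto-corrects the case of identifiers in your code.
This is because VB is case-insensitive: the editor treats print and Print as the same identifier and rewrites every occurrence to the casing it already knows, which also guards against typos in variable names.
If you want to preserve the case and avoid the auto-correct, the simplest solution is to write the code in another editor (like Notepad) and compile it from the command line. Note that this only applies to VB code that is compiled outside the IDE; VBA hosted in an Office document has no command-line compiler.
Hope this helps.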
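A possible workaround that stays inside the VBA editor is to pass the method name as a string, since the editor never re-cases string literals. The sketch below uses VBA's built-in CallByName function; the helper name PrintWithLowercaseName is made up for illustration, and whether Acrobat's JSObject accepts the call this way is an assumption that has not been verified here.

```vb
' Hypothetical helper: invokes the JavaScript method "print" on the
' Acrobat JSObject without typing the identifier in code.
Sub PrintWithLowercaseName(ByVal JSO As Object, ByVal pp As Object)
    ' "print" is a string literal, so the editor leaves its case alone.
    ' CallByName resolves the member by this exact name at run time.
    CallByName JSO, "print", VbMethod, pp
End Sub
```

With the variables from the question, the last line would become PrintWithLowercaseName JSO, pp.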

Classic ASP - ASPPDF how to create a pdf document with multiple pages and 1 page template

Is it possible to create a new document with multiple pages based on a PDF template that has ONE page only?
It's a member list with name, ID and signature. I have a template for one page, but if the classroom has many members I may need lots of pages.
I'm trying, but I can only create new docs with one page.
I don't know how to create pages 2, 3, ... with the same template (if it's possible at all, of course).
I think this would be the easiest way to make the docs.
Thanks!
Yes, it's possible, and you can do it easily. Have a look at Document Stitching.
Dim objPDF
Set objPDF = Server.CreateObject("Persits.Pdf")
Dim binTplDoc 'to hold template pdf document's binary content
With objPDF.OpenDocument(Server.MapPath("1-page-template.pdf"))
binTplDoc = .SaveToMemory
.Close
End With
Dim MultiPageDoc
Set MultiPageDoc = objPDF.CreateDocument 'a blank document
'add the same document several times
MultiPageDoc.AppendDocument objPDF.OpenDocumentBinary(binTplDoc)
'do something with MultiPageDoc.Pages(0)
MultiPageDoc.AppendDocument objPDF.OpenDocumentBinary(binTplDoc)
'do something with MultiPageDoc.Pages(1)
'..
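When the number of pages depends on the size of the member list, the repeated AppendDocument calls above can be put in a loop. This is only a sketch in VBScript: the helper name AppendTemplatePages and the values of memberCount and membersPerPage are invented for illustration, and it uses only the AspPDF calls that already appear in the snippet above.

```vb
' Hypothetical helper: appends the one-page template pageCount times.
Sub AppendTemplatePages(objPDF, targetDoc, binTplDoc, pageCount)
    Dim i
    For i = 1 To pageCount
        ' Each pass re-opens the template from memory and stitches it on.
        targetDoc.AppendDocument objPDF.OpenDocumentBinary(binTplDoc)
    Next
End Sub

' Illustrative numbers only: 45 members at 20 rows per template page.
Dim memberCount, membersPerPage, pageCount
memberCount = 45
membersPerPage = 20
' Integer ceiling division: (45 + 19) \ 20 = 3 pages.
pageCount = (memberCount + membersPerPage - 1) \ membersPerPage

' objPDF, MultiPageDoc and binTplDoc are the variables created above.
AppendTemplatePages objPDF, MultiPageDoc, binTplDoc, pageCount
```

After the loop, each stitched page can be filled in with the members that belong on it, as the comments in the original snippet suggest.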

Issue with pdf docs not showing up

We recently wrote some code for a client using the Aspose.Pdf library. On my system the PDF in question opened fine and most of the merge fields were filled in (we don't have the exact list of merge fields that they do).
They're telling me that on their system some documents take 2-4 minutes to open, while others don't open at all.
What could be a possible cause of a document not opening at all?
My code is below:
' Load form
Dim doc As Aspose.Pdf.Document = New Aspose.Pdf.Document(sTemplateDir & sDocName)
'Get names of form fields
Dim fields As Aspose.Pdf.InteractiveFeatures.Forms.Field() = doc.Form.Fields
Dim sField As String
Dim field As Aspose.Pdf.InteractiveFeatures.Forms.Field
If fields.Length > 0 Then
For Each field In fields
'Get name of field
sField = field.FullName
'If the merge field isn't valid then we'll just leave it and assume its a fill-in
'Check for Nothing first; AndAlso short-circuits, so Contains is never called with Nothing
If sField IsNot Nothing AndAlso nMergeCol.Contains(sField) Then
field.Value = nMergeCol.Item(sField)
End If
Next
End If
This has been resolved! As we suspected, it was a problem with the client's JavaScript within the PDF file. Within the calculations, the absolute value was being used (name.value). Once this was switched to the relative value (this.event.value), the PDF file began behaving correctly with the Aspose code.

Splitting MS Publisher 2010 document into multiple files

I want to split a multi-page MS Publisher 2010 document into a set of separate documents, one per page.
The starting document is from a mail-merge, and I am trying to produce a set of numbered and named tickets as PDFs to send to people for an event (this is for a charity). The mail-merge seems to work fine and I can save the merged document and it looks OK with e.g. a list of fifty people giving me a 50-page document.
Ideally the result would be a set of PDFs.
I have tried to create some simple VBA code to do this, but it is not working consistently. If I try the very simple macro below, I get the correct number of documents, but only perhaps 1 or 2 out of every five have the correct contents. Most of the documents are completely empty.
Sub splitter()
Dim i As Integer
Dim Source As Document
Dim Target As Document
Set Source = ActiveDocument
For i = 1 To Source.Pages.Count
Set Target = Documents.Add
Source.Pages(i).Shapes.Range.Copy
Target.Pages(1).Shapes.Paste
Target.SaveAs Filename:="C:\Temp\Ticket_" & i
Target.Close
Set Target = Nothing
Next i
End Sub
I did sometimes get an error that the clipboard is busy, but not always.
Another approach might be to start with the master document, loop over the recipients, fill in the personal details for each person's ticket and produce the PDFs directly. But that seems more complex, and I am not a VB programmer (though I have been doing C++ etc. for 20+ years, so I can program :-) )
A final annoyance is that it seems to keep opening a new Publisher window for each document. It takes a while to then close 50+ copies of Publisher, and the laptop starts to crawl...
Please advise how best to get round these issues. I am probably missing something trivial, being a relative VB(A) newbie.
Thanks in advance for any suggestions
Try coding something like this:
Open the Publisher application (CreateObject()?)
Open the Publisher document (doc.Open(filename))
Store the total number of pages in a global variable (doc.Pages.Count)
Close the document (doc.Close())
Loop over the following for each page X:
    Copy the .pub file and rename the copy to name & "page" & X
    Open the new .pub file
    Remove all pages except page X from it
    doc.Save()
    doc.Close()
Copying files with VBA is easy, but copying pages in Publisher VBA is quite a hassle, so this should be easier to achieve.
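The steps above could look roughly like this in VBA. Treat it as an untested sketch rather than a finished macro: the procedure name SplitPublication and its parameters are made up, it assumes the source file is already saved and not open elsewhere, and the Publisher members used (Application.Open, Pages.Count, Page.Delete, Save, Close, Quit) are taken from my understanding of the Publisher object model, so check them against your version.

```vb
' Hypothetical sketch: one copy of the .pub file per page, with every
' other page deleted from that copy. Late binding is used so it can run
' from any VBA host without a reference to the Publisher library.
Sub SplitPublication(ByVal sourcePath As String, ByVal outFolder As String)
    Dim pubApp As Object, doc As Object
    Dim pageCount As Long, keepPage As Long, p As Long
    Dim targetPath As String

    Set pubApp = CreateObject("Publisher.Application")

    ' Read the page count once, then close the source again.
    Set doc = pubApp.Open(sourcePath)
    pageCount = doc.Pages.Count
    doc.Close

    For keepPage = 1 To pageCount
        targetPath = outFolder & "\Ticket_" & keepPage & ".pub"
        FileCopy sourcePath, targetPath          ' plain file copy, no clipboard

        Set doc = pubApp.Open(targetPath)
        ' Delete from the last page backwards so indexes stay valid.
        For p = doc.Pages.Count To 1 Step -1
            If p <> keepPage Then doc.Pages(p).Delete
        Next p
        doc.Save
        doc.Close
    Next keepPage

    pubApp.Quit
End Sub
```

Reusing a single Publisher instance for all files is also meant to avoid the pile-up of windows described in the question, and skipping the clipboard avoids the "clipboard is busy" error.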

How to extract text from PDF file with IDENTITY-H fonts using VB.NET

I have a PDF file.
I am reading text from the PDF file programmatically using the iTextSharp library.
It reads ANSI-encoded text, but it does not read text that uses the IDENTITY-H encoding.
My question is how to read IDENTITY-H text from a PDF file using VB.NET.
Below is my code:
Public Function ReadPDFFile(ByVal strSource As String) As String
Dim sbPDFText As New StringBuilder() 'StringBuilder Object To Store read Text
If File.Exists(strSource) Then 'Does File Exist?
Dim pdfFileReader As New PdfReader(strSource) 'read File
For intCurrPage As Integer = 1 To pdfFileReader.NumberOfPages 'Loop Through All Pages
Dim lteStrategy As LocTextExtractionStrategy = New LocTextExtractionStrategy 'Read PDF File Content Blocks
'Get Text
Dim strCurrText As String = PdfTextExtractor.GetTextFromPage(pdfFileReader, intCurrPage, lteStrategy)
sbPDFText.Append(strCurrText) 'Add Text To String Builder
Next
pdfFileReader.Close() 'Close File
End If
Return sbPDFText.ToString() 'Return
End Function
Public Overridable Sub RenderText(ByVal renderInfo As TextRenderInfo) Implements ITextExtractionStrategy.RenderText
Dim segment As LineSegment = renderInfo.GetBaseline()
Dim location As New TextChunk(renderInfo.GetText(), segment.GetStartPoint(), segment.GetEndPoint(), renderInfo.GetSingleSpaceWidth())
If renderInfo.GetText = "" Then
Console.WriteLine(GetResultantText())
End If
With location
'Chunk Location:
Debug.Print(renderInfo.GetText)
.PosLeft = renderInfo.GetDescentLine.GetStartPoint(Vector.I1)
.PosRight = renderInfo.GetAscentLine.GetEndPoint(Vector.I1)
.PosBottom = renderInfo.GetDescentLine.GetStartPoint(Vector.I2)
.PosTop = renderInfo.GetAscentLine.GetEndPoint(Vector.I2)
'Chunk Font Size: (Height)
.curFontSize = .PosTop - segment.GetStartPoint()(Vector.I2)
'Use Font name and Size as Key in the SortedList
Dim StrKey As String = renderInfo.GetFont.PostscriptFontName & .curFontSize.ToString
'Add this font to ThisPdfDocFonts SortedList if it's not already present
If 1 = 1 Then
If Not ThisPdfDocFonts.ContainsKey(StrKey) Then ThisPdfDocFonts.Add(StrKey, renderInfo.GetFont)
'Store the SortedList index in this Chunk, so we can get it later
.FontIndex = ThisPdfDocFonts.IndexOfKey(StrKey)
Console.WriteLine(renderInfo.GetFont.ToString & "-->" & StrKey)
Else
'pcbContent.SetFontAndSize(BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED), 9)
.FontIndex = 3
.curFontSize = 8
End If
End With
locationalResult.Add(location)
End Sub
Thank you for sharing the PDF document. It helped us to determine that the problem you describe is not an iTextSharp problem. Instead it is a problem with the PDF document itself.
This problem doesn't have a solution, but I'm providing this answer to explain how you can discover for yourself that the problem also exists when iTextSharp isn't involved.
Open the document in Adobe Reader. Select the text "Muy señores nuestros" and copy/paste it into a text editor. You get "Muy señores nuestros". This is text that can be extracted using iTextSharp (it works correctly).
Now do the same with the text "GUARDIAN GLASS EXPRESS, S.L.". You get the following result: "􀀪􀀸􀀤􀀵􀀧􀀬􀀤􀀱􀀃􀀪􀀯􀀤􀀶􀀶􀀃􀀨􀀻􀀳􀀵􀀨􀀶􀀶􀀏􀀃􀀶􀀑􀀯􀀑". As you can see, you can not copy/paste the text correctly from Adobe Reader. This is due to the way the text is stored in the PDF. If you can not copy/paste the text from Adobe Reader, you should not expect to be able to extract the text using iTextSharp. The PDF is created in a way that doesn't allow extraction.
Please take a look at this video to find out some possible causes: https://www.youtube.com/watch?v=wxGEEv7ibHE
I'm sorry that it took so long to figure this out and that it turns out that you're asking something that isn't possible. Your question narrowed the problem down too much, as if the problem was caused by the "IDENTITY-H" encoding and iTextSharp. In reality, you're trying to extract text that can't be extracted.
If you look at the page dictionary inside the PDF, you'll find three font resources for the first (and only) page: /TT0, /C2_0 and /C2_1.
In the content stream (marked with a small red arrow in the original screenshot), you see two strings (in hexadecimal notation) that are shown using the fonts referenced by the names C2_0 and C2_1. Incidentally, these fonts are stored as composite fonts with /Subtype /Type0 and /Encoding /Identity-H. This means that the characters used in the hexadecimal string should correspond with the Unicode values of the glyphs. If that's not the case, you're out of luck.
There seems to be no problem with the font for which the name /TT0 is used.
The fact that /TT0 uses WinAnsiEncoding and the other fonts use Identity-H is irrelevant. There are plenty of PDF files with fonts that use Identity-H of which the text can be copy/pasted or extracted using iTextSharp. Unfortunately, there is probably something wrong with the way your PDF was constructed. It would take too much time to analyze what went wrong, so your best shot is to contact the person who gave you the PDF and to ask him/her to fix the PDF.
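If you want to inspect the font resources of a page yourself without a PDF viewer, something along these lines should list them with iTextSharp 5. It is a minimal sketch, not part of the analysis above: the module and method names are invented, and it only looks at resources stored directly on the page dictionary (resources inherited from a parent node are not resolved).

```vb
Imports System
Imports iTextSharp.text.pdf

Module FontResourceDump
    ' Hypothetical helper: prints name, /Subtype and /Encoding of each
    ' font resource found directly on the given page.
    Sub DumpFonts(ByVal path As String, ByVal pageNumber As Integer)
        Dim reader As New PdfReader(path)
        Try
            Dim page As PdfDictionary = reader.GetPageN(pageNumber)
            Dim resources As PdfDictionary = page.GetAsDict(PdfName.RESOURCES)
            If resources Is Nothing Then Return
            Dim fonts As PdfDictionary = resources.GetAsDict(PdfName.FONT)
            If fonts Is Nothing Then Return

            For Each fontName As PdfName In fonts.Keys
                Dim font As PdfDictionary = fonts.GetAsDict(fontName)
                If font Is Nothing Then Continue For
                ' GetDirectObject follows indirect references, if any.
                Console.WriteLine("{0}: Subtype={1}, Encoding={2}",
                                  fontName,
                                  font.GetDirectObject(PdfName.SUBTYPE),
                                  font.GetDirectObject(PdfName.ENCODING))
            Next
        Finally
            reader.Close()
        End Try
    End Sub
End Module
```

For the document discussed here, you would expect one entry using WinAnsiEncoding and two composite fonts using Identity-H.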