Java Print encoding with Sun PDFRenderer - pdf

I'm a beginner in Java programming and also new here at Stack Overflow. Currently I'm trying to print PDF files with the com.sun.pdfview library. It works most of the time, but with some documents I get the following error:
java.lang.IllegalArgumentException: Unknown encoding: SymbolSetEncoding
at com.sun.pdfview.font.PDFFontEncoding.getBaseEncoding(PDFFontEncoding.java:199)
at com.sun.pdfview.font.PDFFontEncoding.<init>(PDFFontEncoding.java:78)
at com.sun.pdfview.font.PDFFont.getFont(PDFFont.java:133)
at com.sun.pdfview.PDFParser.getFontFrom(PDFParser.java:1166)
at com.sun.pdfview.PDFParser.iterate(PDFParser.java:719)
at com.sun.pdfview.BaseWatchable.run(BaseWatchable.java:101)
at java.lang.Thread.run(Thread.java:722)
I should mention that these documents are written in a Caucasian language (Georgian), and the typical font is Sylfaen.
The error occurs in the following code:
PDFRenderer pgs = new PDFRenderer(page, g2, imgbounds, null,null);
try {
page.waitForFinish();
pgs.run();
I believe that these documents need a different encoding, or that I need to specify the font. Unfortunately, I couldn't find a place where I can look this up or change the setting.
Thank you
Martin

PDFRenderer only supports a limited subset of the PDF spec.

Related

syncfusion.pdf.pdfException"Could Not Find valid signature (%pds-).'

string docuAddr = @"C:\Users\psimmon\source\repos\PDFTESTAPP\PDFTESTAPP\TempForms\forms-www.courts.state.co.us-Forms-PDF-JDF1117.pdf";
byte[] bytes = Encoding.Unicode.GetBytes(docuAddr);
PdfLoadedDocument loadedDocument = new PdfLoadedDocument(bytes, true); // <--- blows up here
PdfLoadedForm myForm = loadedDocument.Form;
PdfLoadedFormFieldCollection fields = myForm.Fields;
Not sure what I have done wrong here. The PDF file opens fine, either in a browser or in a File Explorer window, so it has to be me. I guessed at most of this. All you very smart folks, I could use your gray matter. Forgive my stupidity.
The reported exception "Could not find valid signature (%PDF-)" may occur when the file is not a PDF document. We suspect that a file of another format was saved with the ".pdf" extension. We cannot open or repair this type of document on our end; we have already added the details to our documentation.
The following page lists some of the corruption error messages that cannot be repaired:
UG: https://help.syncfusion.com/file-formats/pdf/open-and-save-pdf-file-in-c-sharp-vb-net#possible-error-messages-of-invalid-pdf-documents-while-loading
If you want to detect this type of corrupted document, the Syncfusion PDF Library can check and report whether an existing PDF document is corrupted, with corruption details and structure-level syntax errors.
UG: https://help.syncfusion.com/file-formats/pdf/working-with-document#find-corrupted-pdf-document
Blog: https://www.syncfusion.com/blogs/post/how-to-find-corrupted-pdf-files-in-c-sharp.aspx
KB: https://www.syncfusion.com/kb/9686/how-to-identify-the-corrupted-pdf-document-using-c-and-vb-net
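To rule this out quickly, you can check whether the bytes you hand to the loader really start with the %PDF- marker. Here is a minimal sketch in Java. It is not Syncfusion's own check, and the class and method names are made up for illustration. It tests strictly at offset 0, which is an assumption: some readers tolerate a few junk bytes before the header. Note that the bytes you get from encoding a path string, as opposed to the bytes read from the file at that path, will not pass this check.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any PDF library.
final class PdfSignatureCheck {
    // The ASCII marker every PDF file is expected to begin with.
    private static final byte[] SIGNATURE = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes begin with "%PDF-" at offset 0. */
    static boolean startsWithPdfSignature(byte[] data) {
        if (data == null || data.length < SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (data[i] != SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads only the first few bytes of the file and checks them. */
    static boolean startsWithPdfSignature(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[SIGNATURE.length];
            int filled = 0;
            // A single read() may return fewer bytes than requested, so loop.
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break; // end of file reached before the header was complete
                }
                filled += n;
            }
            return filled == head.length && startsWithPdfSignature(head);
        }
    }
}
```

If the check fails for the byte array but succeeds for the file on disk, the problem is in how the array was built rather than in the document.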

Save an image present in PDF on local File System

This is my first experience of using the PDFBox jar files. Also, I have recently started working with TestComplete. In short, all these things are new to me and I have been stuck on one issue for the last few hours. I will try to explain as much as I can. Would really appreciate any help!
Objective:
To save an image present in a PDF file on the file system
Issue:
When this line gets executed objImage.write2file_2(strSavePath);, I get the error Object doesn't support this property or method.
I am taking some help from here
Code:
function fn_PDFImage()
{
var objPdfFile, strPdfFilePath, strSavePath, objPages, objPage, objImages, objImage, imgbuffer;
strPdfFilePath = "C:\\Users\\aabb\\Desktop\\name.pdf";
strSavePath = "C:\\Users\\aabb\\Desktop\\abc";
objPdfFile = JavaClasses.org_apache_pdfbox_pdmodel.PDDocument.load_3(strPdfFilePath);
objPages = objPdfFile.getDocumentCatalog().getAllPages();
//getting a page with index=1
objPage = objPages.get(1)
objImages = objPage.getResources().getXObjects().values().toArray();
Log.Message(objImages.length); //This is returning 14. i.e, 14 images
//getting an image with index=1
objImage = objImages.items(1);
Log.Message(typeof objImage); //returns "Object" which means it is not null
//saving the image
objImage.write2file_2(strSavePath); //<---GETTING AN ERROR HERE
}
ERROR:
If you are bothered about the method name write2file_2, please read this excerpt from the link which I have shared:
In Java, the constructor of a class has the name of this class.
TestComplete changes the constructor names to newInstance(). If a
class has overloaded constructors, TestComplete names them like
newInstance, newInstance_2, newInstance_3 and so on.
Additional Info:
I have imported the JAR file (pdfbox-app-1.8.13.jar) and its classes in TestComplete. I am not sure if I need to import some other JAR file or its classes here:
XObjects are not always image XObjects. And write2file is in the class PDXObjectImage so you need to check your object type first.
Re the second question asked in the comment: a form XObject isn't something you can save. Form XObjects are content streams with resources etc., similar to pages. What you can do, however, is explore these as well to see whether their resources contain images. See how this is done in the ExtractImages source code of PDFBox 1.8.
However there are other places where there can be images (e.g. patterns, soft masks, inline images); this is only available in PDFBox 2.*, see the ExtractImages source code there. (Note that the class names are different).

how to use very old iText(under 0.99) to create bookmarks / outlines?

May I know how to use the old iText (a very old version under 0.99, package path = com.lowagie.xxx) to create bookmarks that jump within the same PDF, please?
Something like this API in the new iText jar:
PdfOutline outoline2 = com.itextpdf.pdf.PdfAction.gotoLocalPage("destinationName", false)
We have found the code below for creating a bookmark, but the old iText needs the file name (see outFileName in the code below). What we want is a jump within the same PDF (not to a remote PDF):
olineSignature = new PdfOutline(root, new PdfAction(outFileName, "Signature2TxtDestination"), "Signature2TxtOutline");
FYI, we don't know the page number in advance, so we cannot use the old API PdfAction.gotoLocalPage(int, PdfDestination, PdfWriter).
Can anybody help me? Thanks. @Bruno Lowagie, @itext :)
We are in the process of upgrading to the new iText (iText 5+), but for now we have a request to create bookmarks with the old iText so that others can retrieve them.
My memory can't go that far back but local destinations are most probably not supported. Your only chance is to do an interim upgrade to the Jurassic 2.1.7 that should be more or less compatible with that Pleistocene 0.99.

How to get author of a pdf document with mupdf

How can I get the metadata of a PDF document (e.g. title, author, creation date) using the MuPDF library? There is not enough documentation to find this functionality, and the comments are not sufficient either. Most probably there is a function for this purpose, but it is hard to find under these circumstances. The following code is what I have so far.
char info[64];
globals *glo = get_globals(env, thiz);
fz_meta(glo->doc, FZ_META_INFO, info, sizeof(info));
I have used the FZ_META_INFO tag, but it doesn't work: I get no info back, just an empty buffer. I have checked that the file has metadata. Any help is appreciated.
EDIT:
Target Android sdk:20
Min Android sdk:15
Mupdf version: 1.6
ndk: r10c
Development OS: Ubuntu 12.04
In what sense does it 'not work'? Does it throw an error? Crash? Are you certain the PDF file you are using has any 'Info' metadata?
What is the version of MuPDF? What platform are you using?
You need to set the relevant key in the buffer you pass to fz_meta before you call fz_meta; I notice you aren't doing that.
See win_main.c at around line 487; after you get past the macro, this resolves to
char info[256];
sprintf(info, "Title");
fz_meta(doc, FZ_META_INFO, info, 256);
On return 'info' will contain the metadata associated with the Title key in the dictionary.
When in doubt, build the sample app and follow it in a debugger......
Even when the cast lets you pass the key in,
the same cast is NOT correct for receiving the result back as a char*.
Example:
Cast that works for sending the request:
char buff[2048];
strcpy(buff, "CreationDate");
if (fz_meta(ctx,doc,FZ_META_INFO,&buff,2048)) {
buff[0] = 0;
}
This will:
find the key,
do the UTF-8 conversion,
then crash when the result is copied back.
Cast that works for receiving the result:
char buff[2048];
strcpy(buff, "CreationDate");
if (fz_meta(ctx,doc,FZ_META_INFO,buff,2048)) {
buff[0] = 0;
}
This will crash while the dictionary is being scanned.
It really looks like a bug!
I can confirm that modifying the original source to
info = pdf_dict_gets(ctx, info, (char *)ptr);
is the way to go (even if it is strange that nobody else ran into this while writing code, since metadata is a useful and frequently used feature).

itext outofmemory error while attempting to count the number of pages in a pdf file

I'm trying to execute the following code:
PdfReader reader = new PdfReader("/path/to/file.pdf");
int pages = reader.getNumberOfPages();
It works on most files, but on one particular file, it crashes with error:
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:572)
at java.lang.StringBuffer.append(StringBuffer.java:320)
at com.itextpdf.text.pdf.PRTokeniser.readString(PRTokeniser.java:158)
at com.itextpdf.text.pdf.PRTokeniser.getStartxref(PRTokeniser.java:224)
at com.itextpdf.text.pdf.PRTokeniser.getStartxref(PRTokeniser.java:229)
...goes on for a while
at com.itextpdf.text.pdf.PRTokeniser.getStartxref(PRTokeniser.java:229)
I know that it's something wrong with the input file. I'm just wondering if there's a way of knowing before attempting to make the method call, that the file is going to cause a problem.
It turns out it was a bug in the version of iText I am using (5.0.1). I logged a query with the developers, and a fix was put in, which I tested and which will hopefully find its way into the next version (5.0.2).
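As for knowing in advance: one cheap sanity check, independent of iText, is to look for the startxref keyword near the end of the file before handing it to the reader. The sketch below is only that, a sketch. The class name, method names and the 1024-byte window are my own choices, and it rests on the assumption that the problem file lacks a usable startxref marker, which this thread does not confirm. A file that passes can still fail to parse.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical pre-check, not part of iText.
final class PdfTailCheck {
    // How many bytes at the end of the file to inspect (an arbitrary choice).
    private static final int TAIL_SIZE = 1024;

    /** Returns true if the given bytes contain the keyword "startxref". */
    static boolean hasStartxref(byte[] tail) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data is safe here.
        String text = new String(tail, StandardCharsets.ISO_8859_1);
        return text.lastIndexOf("startxref") >= 0;
    }

    /** Reads at most TAIL_SIZE bytes from the end of the file and checks them. */
    static boolean hasStartxrefNearEnd(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();
            int size = (int) Math.min(TAIL_SIZE, length);
            byte[] tail = new byte[size];
            raf.seek(length - size);
            raf.readFully(tail);
            return hasStartxref(tail);
        }
    }
}
```

You would call PdfTailCheck.hasStartxrefNearEnd("/path/to/file.pdf") first and only construct the PdfReader when it returns true.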