iText PDF fails with message "Dictionary key endstream is not a name" - pdf

The issue is the same as reported here.
I took this image and converted it to this PDF using GraphicsMagick v1.3.26 (built on 2017-07-04):
gm convert itext_banner_InvalidPdfException.jpg itext_banner_InvalidPdfException.pdf
When I try to read it with iText v5.5.12 I get the following exception:
java -cp itextpdf-5.5.12.jar com.itextpdf.text.pdf.parser.PdfContentReaderTool itext_banner_InvalidPdfException.pdf
com.itextpdf.text.exceptions.InvalidPdfException: Rebuild failed: Dictionary key endstream is not a name. at file pointer 1197; Original message: Dictionary key endstream is not a name. at file pointer 1197
at com.itextpdf.text.pdf.PdfReader.readPdf(PdfReader.java:764)
at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:197)
at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:235)
at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:223)
at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:213)
at com.itextpdf.text.pdf.parser.PdfContentReaderTool.listContentStream(PdfContentReaderTool.java:200)
at com.itextpdf.text.pdf.parser.PdfContentReaderTool.main(PdfContentReaderTool.java:249)
Questions:
What exactly is wrong with the given PDF? It seems like there is an issue in Ghostscript, which GraphicsMagick uses indirectly.
When I open it with iText RUPS v5.8.8, it does not print any warnings to the Console tab. Does that mean it is valid from iText RUPS's point of view?

Your PDF contains this broken object:
11 0 obj
<<
endstream
endobj
The opening << is closed by an endstream, which does not match.
If that object was meant to be a mere dictionary, it should have looked like this:
11 0 obj
<<
[a reasonable number of dictionary entries]
>>
endobj
If that object was meant to be a stream, it should have looked like this:
11 0 obj
<<
[a reasonable number of dictionary entries]
>>
stream
[stream data]
endstream
endobj
BTW, the object in question is not referenced from any other object in the PDF. If you open the PDF in a PdfReader in partial mode, therefore, the issue will be ignored.
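To locate this kind of damage without reading the file by hand, a rough scan can compare the number of << and >> tokens in each object, ignoring the stream payload. The following is a minimal sketch, not a real PDF parser: the function name find_unbalanced_objects is made up for illustration, and the heuristic can be fooled by << or >> inside string literals or by the word endobj inside stream data.

```python
import re

def find_unbalanced_objects(pdf_bytes: bytes):
    """Return (object number, generation) pairs whose << and >> counts differ.

    Heuristic only: it does not tokenize strings or comments.
    """
    problems = []
    # Each match is one "N G obj ... endobj" block.
    for m in re.finditer(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", pdf_bytes, re.DOTALL):
        body = m.group(3)
        # Keep only the part before the "stream" keyword; the payload may hold any bytes.
        # "\b" prevents a match on the "stream" inside "endstream".
        head = re.split(rb"\bstream\r?\n", body, maxsplit=1)[0]
        if head.count(b"<<") != head.count(b">>"):
            problems.append((int(m.group(1)), int(m.group(2))))
    return problems
```

Run against the raw bytes of the file (for example open(path, "rb").read()), this should report an object such as the 11 0 obj shown above, since its << is never closed.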

Related

PDF Signature: "Expected a dict object"

I'm creating a library for digitally signing a PDF document. Along the way I stumbled upon another problem.
In Acrobat I'm getting the error:
Error during signature verification.
Adobe Acrobat error.
Expected a dict object.
I know it expects a dictionary object somewhere. But I have no idea where.
This problem shows up when I add the image to the AP of the signature.
For this I'm basing my implementation on the spec and on "Insert multiple digital approval signatures without invalidating the previous one".
Most of this seems to work correctly, but when the image is present it results in the error. The image is correctly visible.
Current working:
(This is a very short overview of the part where the error is, it might be slightly different, but hope this helps)
I update the signature annotation. Add link to object that contains normal appearance.
16 0 obj
<<
/Type/Annot
/Subtype/Widget
...snip...
/AP<<
/N 21 0 R
>>
>>
Add image as XObject
20 0 obj
<<
/Type/XObject
/Subtype/Image
...snip...
/Length 29569
>>
stream
...snip...
endstream
endobj
Add XObject (Normal appearance)
21 0 obj
<<
/Type/XObject
/Subtype/Form
/Resources<<
/XObject<<
/UserSignature272 20 0 R
>>
>>
/BBox[0 0 135 37.5]
/Length 44
>>stream
q
135 0 0 37.5 0 0 cm
/UserSignature272 Do
Q
endstream
endobj
I think the problem happens somewhere in obj (21 0), but I'm not sure.
Here is a minimal file that can be used for testing.
https://drive.google.com/file/d/17sdz2xJy3VhN6i9YiuPrJ6x2s5kU2sra/view?usp=sharing
Any help, or hints would be welcome.
(This post is a continuation of PDF Digital Signature has "Bad parameter" in Acrobat, but is about a different problem, same subject area.)
You're running into a bug in Adobe Acrobat here: if you display an XObject from inside your signature appearance stream, Acrobat expects that XObject to have a Resources entry. This may make sense for form XObjects, but it doesn't for image XObjects like the one in your case.
A workaround is to add an empty Resources dictionary to your image XObject.
I checked this by replacing the /BBox[1 0 0 1 0 0] in your image XObject (which is not needed there anyway) with /Resources<< >>.
When Adobe Acrobat creates its own signature appearances, it creates a hierarchy of form XObjects with Resources dictionaries all over, including those for the "layers". I assume that Adobe Reader, on seeing the Do operator, attempts to collect information on such "layers" and does not expect to be confronted immediately with an image XObject.

PDF "Expected a Dict Object" on Adobe Reader But Perfect on Chrome

I created the following PDF file with JavaScript:
http://www.dnpi.com.hk/errorfile.pdf
I am able to open it perfectly with Chrome and on mobile.
But I get the error message "Expected a Dict Object" in Adobe Acrobat DC.
Below is a screenshot of the file opened in Chrome.
Anyone got any idea how this happens?
This PDF file produces errors in xpdf:
$ xpdf ~/git/david-cv/errorfile.pdf
Syntax Error (174): Dictionary key must be a name object
Syntax Error (176): Dictionary key must be a name object
Syntax Error (180): Dictionary key must be a name object
Syntax Error (182): Dictionary key must be a name object
Syntax Error (191): Dictionary key must be a name object
Syntax Error: Kid object (page 1) is wrong type (stream)
It seems to be objecting to the page 1 object.
Opening this PDF in a text editor shows a problem with the page 1 object:
3 0 obj
<</Type /Page
/Parent 1 0 R
/Resources 2 0 R
/MediaBox [0 0 595.28 841.89]
/Contents 4 0 R
endobj
It's missing a closing >> for the Page dictionary object.
The PDF file is incorrect, although Chrome/Mobile seems to be able to recover it.

What Encoding is Used in this PDF Metadata?

I'm looking at the binary of Adobe's PDF Reference document, and I'm wondering what encoding is being used in the values of the metadata here:
<<
/Producer <30B9883671A1867F59929DEDF9AF32BC0029CF5414D3744A3273BCA8E7319382EA151980>
/Subject <30BE953B76E0A2306F8F8FFBFCA67E9D1D6A8F17418D200C1B6EEE88E726DAC4CE3E2CC1>
/Creator <37A89B34768D93347889CEAFBEF3>
/Title <219EBC7941A5943A6F9E80FAF5EF7E8D1A60881E04A630452968F38B>
/Author <30BE953B76E0A1266E8F8BF4E3E317B71166880A4B9135583865>
/ModDate <35E0C86923F1C36E2FC2DEA0A1F56BEF5F39C25D14D373>
/CreationDate <35E0C86923F1C36E2CCCDFAEA1F36EE128>
>>
So far, I can't find anything in the documentation or the ISO standard about this, and this is the only PDF I've seen so far with encoded metadata values.
Any ideas?
The encoding is standard, but the text strings have been encrypted. See section 3.5, Encryption, in that same reference guide.
When inspecting a PDF, you should always start with reading the trailer dictionary (see 3.4.4 File Trailer). In your document this contains an /Encrypt key:
<<
/Size 31667
/ID [<19574527ECBF00E3EC0373879833EEF6> <24EE9EDB7DE40DB862FDB4C5D3493585>]
/Info 7 0 R
/Root 1 0 R
/Encrypt 31666 0 R
>>
which is "required if document is encrypted".
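As a quick way to run that first check, the last trailer dictionary can be searched for an /Encrypt key. This is only a sketch under stated assumptions: the function name is_encrypted is hypothetical, and it handles only files with a classic trailer keyword. Files that use a cross-reference stream instead carry these keys in the stream dictionary, which this code does not look at.

```python
import re

def is_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the last classic trailer dictionary has an /Encrypt key."""
    pos = pdf_bytes.rfind(b"trailer")
    if pos == -1:
        # No classic trailer found (possibly a cross-reference stream); not handled.
        return False
    end = pdf_bytes.find(b"startxref", pos)
    trailer = pdf_bytes[pos:end if end != -1 else len(pdf_bytes)]
    # "\b" avoids matching a longer name that merely starts with /Encrypt.
    return re.search(rb"/Encrypt\b", trailer) is not None
```

If this returns True, strings and streams in the file (including the Info dictionary values) have to be decrypted before they can be interpreted.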

PostScript PDF (1.7), manually writing code

I'm trying to manually write a simple PDF file that contains a title, some text, and an image. I found one example of a manually written "Hello world" and managed to change some things, but I can't get it working for another text object. I have looked for help on the internet but with no luck; I guess not many people write their own PDF files.
This is what I have so far:
%PDF-1.7
1 0 obj % entry point
<<
/Type /Catalog
/Pages 2 0 R
>>
endobj
2 0 obj
<<
/Type /Pages
/MediaBox [ 0 0 200 200 ]
/Count 1
/Kids [ 3 0 R ]
>>
endobj
3 0 obj
<<
/Type /Page
/Parent 2 0 R
/Resources <<
/Font <<
/F1 4 0 R
>>
>>
/Contents 4 0 R
>>
endobj
4 0 obj % page content
<<
/Length 20
>>
stream
BT
80 180 TD
/F1 14 Tf
(PDF) Tj
ET
endstream
endobj
5 0 obj % page content
<<
/Length 20
>>
stream
BT
50 70 TD
/F1 14 Tf
(this is a pdf) Tj
ET
endstream
endobj
trailer
<<
/Size 6
/Root 1 0 R
>>
startxref
492
%%EOF
I have tried adding another text object with the text "this is a pdf", but it won't show up. I don't know what could be wrong; I tried changing a few things but with no luck. I don't have the image part either, so some help with that would be nice.
This is a wiki about the "hello world" pdf I found:
http://www.gnupdf.org/Introduction_to_PDF
Adobe offers some explanation of how PDF works, but I can't find anything that would fix my problem:
http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_reference_1-7.pdf
This is not a valid PDF. If Acrobat opens it at all it's because it's given up on the xref table and done a full scan of the file, but your PDF is invalid. 4 0 obj is not a font, as you specified, and 5 0 obj is not accessed from anywhere.
The PDF specification requires an xref table which points to the exact byte position in the file of each object. You can't realistically write this by hand unless you intend to manually update the entire xref table every time you add or remove even 1 byte from the file.
You can write a PDF from scratch like this from code easily enough but it will not work to just open a PDF in notepad and start changing things because the index (xref) immediately becomes corrupt.
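A minimal sketch of what "from code" can look like: the writer records the byte offset of each object as it is appended and then emits an xref table built from those offsets. The helper names build_pdf and stream are made up for this illustration, the font object (5 0 obj) is an assumed fix for the missing font in the question, and the result has not been checked against any particular viewer.

```python
def build_pdf(objects):
    """Serialize object bodies (bytes) as objects 1..n and emit a matching xref table."""
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))            # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)                     # startxref must point at the "xref" keyword
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"         # entry for the free object 0
    for off in offsets:
        out += b"%010d 00000 n \n" % off    # every entry is exactly 20 bytes long
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)

def stream(data: bytes) -> bytes:
    """Wrap data as a stream object body with a computed /Length."""
    return b"<< /Length %d >>\nstream\n" % len(data) + data + b"\nendstream"

content = b"BT /F1 14 Tf 80 180 Td (PDF) Tj ET"
pdf = build_pdf([
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /MediaBox [0 0 200 200] /Count 1 /Kids [3 0 R] >>",
    b"<< /Type /Page /Parent 2 0 R"
    b" /Resources << /Font << /F1 5 0 R >> >> /Contents 4 0 R >>",
    stream(content),                        # 4 0 obj: page content
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",  # 5 0 obj: font
])
```

Writing pdf to disk with open("hello.pdf", "wb") gives a file whose offsets and /Length values are recomputed on every run, which is the bookkeeping that hand editing in a text editor breaks.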
I'd also advise against putting comments throughout the file unless the comments start on new lines. Otherwise some PDF parsers will get confused as this is generally not expected. Usually PDF files do not contain comments (with the exception of the second line, which is recommended by Adobe to be a comment of some non-ASCII characters so FTP recognizes the file as binary) seeing as they are virtually impossible to write manually anyway.
http://www.adobe.com/devnet/pdf/pdf_reference.html
A few years ago, I wrote a book which covers exactly this sort of thing:
http://www.amazon.com/PDF-Explained-John-Whitington/dp/1449310028/
No free online version, I'm afraid. You can get all the same information from Adobe's own documentation, which is free, but it's a rather long document!

confirmation dialog (alert) after form submit

I was wondering if there is any way to notify a user in Adobe Reader
that a PDF form has been submitted to the server? I am submitting a
normal HTTP/HTML form to a PHP script, no big deal, straightforward,
but there seems to be a big "black hole" in documentation, forums etc.
as to what happens when the form is submitted.
Isn't there a way to trigger a JavaScript alert after I have submitted
a form? I don't want to return another PDF that says "thank you",
that is a bit tacky. I am very new to PDF forms, so I am guessing there
must be a way to return FDF to the original document that has some
JavaScript in it that is executed, e.g. alert('thank you for your
feedback!').
This should really be straightforward; I assumed Adobe's much-hyped
PDF technology was much more developer-friendly and accessible.
Any ideas?? (Oh and please don't ask why I am using pdf forms and not the web, this is coming from "The Top", so as a developer I just have to do it..)
The server script which you are posting to must reply with this content type in the HTTP header:
'Content-Type: application/vnd.fdf'
E.g. if you are using PHP:
header('Content-Type: application/vnd.fdf');
followed by the relevant bastardized-pdf-javascript-mutant-half-breed that will trigger the alert() dialog:
%FDF-1.2
1 0 obj
<<
/FDF
<<
/JavaScript
<<
/Doc 2 0 R
/After (confirmSend();)
>>
>>
>>
endobj
2 0 obj
[
(confirmSend) 3 0 R
]
endobj
3 0 obj
<<
>>
stream
function confirmSend()
{
app.alert({
cTitle : 'title of the window',
cMsg : 'message',
nIcon : 3
});
}
endstream
endobj
trailer
<<
/Root 1 0 R
>>
%%EOF
I hope you receive this message, as I wasted nearly 2 weeks of my life finding a solution...
Thanks for this! I too have been searching for a solution to this for hours! It was extremely frustrating. It seems like overkill to install the FDF Toolkit just to get a simple confirmation dialog box after the PDF has been submitted.
I eventually came up with the following through trial and error (it seems there is absolutely no documentation about this on the net):
%FDF-1.2
%âãÏÓ
1 0 obj
<<
/FDF
<<
/Status(Thank you. Your details have been submitted and someone will get in touch with regarding your application.)
>>
>>
endobj
trailer
<</Root 1 0 R>>
%%EOF
The above will present (or should present) a dialog box in Adobe Reader without showing the "Warning: JavaScript Window" warning.
Hope this ends up being useful to someone.
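For servers not written in PHP, the same /Status reply can be assembled as plain bytes. The following is a sketch only: the function name fdf_status_response is hypothetical, it is not tied to any web framework, and it assumes the message fits in Latin-1. The escaping step matters because an unescaped parenthesis or backslash in the message would otherwise break the PDF literal string.

```python
def fdf_status_response(message: str):
    """Return (headers, body) for an FDF reply carrying a /Status message."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = (message.replace("\\", "\\\\")
                      .replace("(", "\\(")
                      .replace(")", "\\)"))
    body = (
        "%FDF-1.2\n"
        "1 0 obj\n"
        f"<< /FDF << /Status ({escaped}) >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    ).encode("latin-1")  # assumption: message contains only Latin-1 characters
    headers = {
        "Content-Type": "application/vnd.fdf",
        "Content-Length": str(len(body)),
    }
    return headers, body
```

The returned headers and body can then be handed to whatever response object the server framework provides.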
I wrangled with this for days, trying to figure out why when I sent the FDF using response.write, it just wouldn't display in Reader. I tried sending both hand-crafted FDF and installing the FDF toolkit to create the FDF response. I was able to create valid FDF, as I was able to open locally in Reader and have the pop-up display correctly but I couldn't get it to work for the life of me sending FDF from my ASP.NET page.
Then inspiration struck. In one of my attempts to send the FDF, I stored the FDF in a file and tried to use a streamreader to pump it into the response. After many unsuccessful attempts to use response.write, on a whim I tried response.redirect to the saved fdf file and it worked. I had previously added fdf as a registered MIME extension for my web site, with application/vnd.fdf as the MIME type. Now the user receives the pop-up after successful submission. The simple solution, in C#, looks like this:
Page.Response.Redirect("success.fdf");
I happened upon the answer after 3 days of searching. Adding a header for the FDF file in the PHP script and adding '#FDF' to the end of the URL in Acrobat seems to have been the solution:
%FDF-1.2
1 0 obj
<<
/FDF
<<
/JavaScript
<<
/Doc 2 0 R
/After (confirmSend();)
>>
>>
>>
endobj
2 0 obj
[
(confirmSend) 3 0 R
]
endobj
3 0 obj
<<
>>
stream
function confirmSend()
{
app.alert("The form has been successfully submitted\nThank you for your feedback!", 3);
}
endstream
endobj
trailer
<<
/Root 1 0 R
>>
%%EOF