PDF that renders in Chrome but not in Acrobat - pdf

%PDF-1.7
4 0 obj
<</Type/ObjStm/N 3/First 14/Length 139>>
stream
1 0 2 41 3 76 <</Type/Catalog/Version/1.7/Pages 2 0 R>><</Type/Pages/Kids[3 0 R]/Count 1>><</Type/Page/MediaBox[0 0 200 200]/Parent 2 0 R>>
endstream
endobj
5 0 obj
<<
/Root 1 0 R
/ID[<7F1FE2C507E6DB4CB0787E660F2B0C65><2450E4E8FF5FC84380428886C0DD4C2F>]
/Size 6
/Index[1 5]
/W[1 4 1]
/Type/XRef
/Length 68
/Filter[/ASCIIHexDecode]
>>
stream
020000000400
020000000401
020000000402
010000000A00
01000000E500
endstream
endobj
startxref
229
%%EOF
The PDF above opens in Chrome (or Edge), but in Adobe Acrobat (Reader) it crashes. Ghostscript regards it as fine too. Note that it assumes CRLF for line breaks.
I read the parts of the PDF spec that are relevant for a basic PDF, and it seems that the above syntax follows it. Why doesn't Adobe like it?
Here is a link to the PDF. Notice how it opens in Chrome, but crashes in Adobe Acrobat. (This PDF uses LF for line breaks, and has a Resources dictionary on the page, based on the comments.)

Acrobat has the following 2 quirks, both of which do not follow the specs:
If the XRef Stream has a single filter, an array must not be used. So /Filter[/FlateDecode] won't work, and /Filter/FlateDecode will. This may apply to any Stream Object, not sure.
An XRef Stream must use the FlateDecode filter. ASCIIHexDecode won't work. A predictor is not required.
Here is a link to the above PDF, fixed up for Acrobat.

Related

PostScript PDF (1.7), manually writing code

I'm trying to manually write a simple PDF file that contains a title, some text, and an image. I found one example of a manually written "Hello world" and managed to change some things, but I cant get it working for another text object. I have looked for help on the internet but with no luck, I guess not many people write their own PDF files.
This is what I have so far:
%PDF-1.7
1 0 obj % entry point
<<
/Type /Catalog
/Pages 2 0 R
>>
endobj
2 0 obj
<<
/Type /Pages
/MediaBox [ 0 0 200 200 ]
/Count 1
/Kids [ 3 0 R ]
>>
endobj
3 0 obj
<<
/Type /Page
/Parent 2 0 R
/Resources <<
/Font <<
/F1 4 0 R
>>
>>
/Contents 4 0 R
>>
endobj
4 0 obj % page content
<<
/Length 20
>>
stream
BT
80 180 TD
/F1 14 Tf
(PDF) Tj
ET
endstream
endobj
5 0 obj % page content
<<
/Length 20
>>
stream
BT
50 70 TD
/F1 14 Tf
(this is a pdf) Tj
ET
endstream
endobj
trailer
<<
/Size 6
/Root 1 0 R
>>
startxref
492
%%EOF
I have tried adding another text object with "this is a pdf" text but it wont show up, I don't know what could be wrong, I tried changing a few things but with no luck. The image part I don't have it either, so some help with that would be nice.
This is a wiki about the "hello world" pdf I found:
http://www.gnupdf.org/Introduction_to_PDF
Adobe offers some explanation on how the pdf works but I cant find anything that would fix my problem:
http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_reference_1-7.pdf
This is not a valid PDF. If Acrobat opens it at all it's because it's given up on the xref table and done a full scan of the file, but your PDF is invalid. 4 0 obj is not a font, as you specified, and 5 0 obj is not accessed from anywhere.
PDF specification requires an xref table which points to the exact position in the file for each object. You can't realistically write this by hand unless you intend to manually update the entire xref table every time you add or remove even 1 byte from the file.
You can write a PDF from scratch like this from code easily enough but it will not work to just open a PDF in notepad and start changing things because the index (xref) immediately becomes corrupt.
I'd also advise against putting comments throughout the file unless the comments start on new lines. Otherwise some PDF parsers will get confused as this is generally not expected. Usually PDF files do not contain comments (with the exception of the second line, which is recommended by Adobe to be a comment of some non-ASCII characters so FTP recognizes the file as binary) seeing as they are virtually impossible to write manually anyway.
http://www.adobe.com/devnet/pdf/pdf_reference.html
A few years ago, I wrote a book which covers exactly this sort of thing:
http://www.amazon.com/PDF-Explained-John-Whitington/dp/1449310028/
No free online version, I'm afraid. You can get all the same information from Adobe's own documentation, which is free, but it's a rather long document!

Attachment damages signature

I have PDF document.
1) Adobe reader reads document well.
2) I sign document (using pdfbox) and everything is well
3) I try to attach file to original pdf (Code is written in the pdfbox web page - in the cookBook).
4) Adobe reader reads attached document well. everything is well.
5) Now I have document with attachment.
6) I try to sign that document (I mean document with attachment). And I have 2 problem:
First:
when I open document, Adobe reader tells me that signature byte range is invalid.
Second:
when I try to close document (I mean to close adobe reader), Adobe reader tells me that:
Do you want to save changes to "original[with-attachment][signed]" before closing? I didn't find thy this happens.
here is testing files uploaded to the google doc
The cause of your issue is that the process of signing original[with-attachment].pdf creates an incremental update with a cross reference stream while the source file has a cross reference table. When adding incremental updates, the new cross references must be of the same type as the old ones.
It is quite possible that this error is due to the process attaching attach.txt misbehaving a bit, too: it stores the file as a PDF with a cross reference table even though the original was a file with a cross reference stream, but at the same time leaves some elements from the former cross reference dictionary in the trailer of the new file. These left-over elements (which do not belong in a trailer dictionary) probably make your signing process think the source already uses a cross reference stream.
As this change of cross reference style between incremental updates is forbidden, the Adobe Reader tries to fix the document in memory. Such attempts to fix often give rise to unexpected Do you want to save changes to "original[with-attachment][signed]" before closing? warnings.
In the course of fixing the PDF, the whole PDF is rearranged. This obviously causes that signature byte range is invalid.
original.pdf
%PDF-1.3
%âãÏÓ
11 0 obj
<</Linearized 1/L 48987/O 13/E 37674/N 3/T 48682/H [ 480 178]>>
endobj
25 0 obj
<</DecodeParms<</Columns 4/Predictor 12>>/Filter/FlateDecode/ID[<321A6D6DCD0785E8E35BD4B13115140A><59793561FB914D408936FC170763541A>]/Index[11 22]/Info 10 0 R/Length 77/Prev 48683/Root 12 0 R/Size 33/Type/XRef/W[1 2 1]>>stream
hÞbbd``b`jŒ â`–,õ#‚µÄb‰í±#Ä"Q{$¬rÄ‚MLŒ³€,F¬ÄÆK¿ Mi
endstream
endobj
startxref
0
%%EOF
32 0 obj
[.........]
endobj
8 0 obj
<</DecodeParms<</Columns 3/Predictor 12>>/Filter/FlateDecode/ID[<321A6D6DCD0785E8E35BD4B13115140A><59793561FB914D408936FC170763541A>]/Info 10 0 R/Length 50/Root 12 0 R/Size 11/Type/XRef/W[1 2 0]>>stream
hÞbb```bœ¬ÅÄÀ°“‰A\š‰H³Îbbà)²'ñ5&F§Û#yF€ xi
endstream
endobj
startxref
116
%%EOF
original[with-attachment].pdf
%PDF-1.3
%öäüß
1 0 obj
[.........]
endobj
xref
0 33
0000000000 65535 f
0000000015 00000 n
[...]
0000049667 00000 n
0000049737 00000 n
trailer
<<
/DecodeParms <<
/Columns 4
/Predictor 12
>>
/Filter /FlateDecode
/ID [<321A6D6DCD0785E8E35BD4B13115140A> <59793561FB914D408936FC170763541A>]
/Info 5 0 R
/Length 77
/Root 1 0 R
/Size 33
/Type /XRef
/W [1 2 1]
/Index [11 22]
>>
startxref
49755
%%EOF
original[with-attachment][signed].pdf
%PDF-1.3
%öäüß
1 0 obj
[....as above....]
startxref
49755
%%EOF
1 0 obj
[.........]
endobj
37 0 obj
<<
/ID [<DC60F4419C05967B81D7F64090027D7F> <DC60F4419C05967B81D7F64090027D7F>]
/Info 5 0 R
/Root 1 0 R
/Prev 49755
/Type /XRef
/Size 38
/Filter /FlateDecode
/Index [1 1 6 1 33 4]
/W [1 3 0]
/Length 31
>>
stream
xœcd8ú‘1&ˆ‘áØ.F†ã¾ŒŒ±ù#| VÚ
endstream
endobj
startxref
89569
%%EOF
A side remark
ID management: Your process adding attachments keeps the whole ID. Your signing process drops the whole original ID of the PDF and replaces it with a new one:
original.pdf
/ID[<321A6D6DCD0785E8E35BD4B13115140A><59793561FB914D408936FC170763541A>]
original[with-attachment].pdf
/ID [<321A6D6DCD0785E8E35BD4B13115140A> <59793561FB914D408936FC170763541A>]
original[signed].pdf
/ID [<A9F7159B1E5D8285A68475689B750214> <A9F7159B1E5D8285A68475689B750214>]
original[with-attachment][signed].pdf
/ID [<DC60F4419C05967B81D7F64090027D7F> <DC60F4419C05967B81D7F64090027D7F>]
Both approaches are wrong, processes manipulating a PDF and, therefore, creating a new version of it, shall keep the first ID entry and replace only the second one with a unique new one.

pdf, appearance streams, checkbox not shown correctly after focus lost

I'm working on a program generating interactive forms into PDF files.
The generated file is here (source is readable). The checkbox is on the bottom of the page. After it gets focus it is rendered correctly (white square with red/blue border), after it lose the focus the square disappears and the default appereance is shown (thats incorrect for me).
in Acrobat 9, X, XI
in build-in chrome pdf viewer it works fine
Adobe XI Pro - preflight - shows warning "Form field has multiple appearances"
I can not find the mistake.
Thanks for your help.
the same (similar) problem described there:
http://forums.adobe.com/message/5144579#5144579
---- here is a part of a pdf file I expect the mistake
2 0 obj
<<
/Type /Catalog
/Pages 1 0 R
/OutputIntents [7 0 R]
/Metadata 8 0 R
/PageLabels 10 0 R
/AcroForm 14 0 R
>>
endobj
14 0 obj
<<
/Fields [13 0 R]
>>
endobj
13 0 obj
<<
/Type /Annot
/Subtype /Widget
/Rect [20.0 20.0 120.0 120.0]
/FT /Btn
/F 4
/T (name)
/AS /Yes
/V /Yes
/AP <<
/N <<
/Yes 11 0 R
/Off 12 0 R >>
>>
>>
endobj
11 0 obj
<<
/Type /XObject
/SubType /Form
/BBox [20.0 20.0 120.0 120.0]
/Length 19 0 R
>>
stream
....
endstream
endobj
12 0 obj
<<
/Type /XObject
/SubType /Form
/BBox [20.0 20.0 120.0 120.0]
/Length 20 0 R
>>
stream
....
endstream
endobj
My observations with your PDF are somewhat different but interesting nonetheless:
Adobe Acrobat 9 Pro v9.5.4 (with PDF/A r/o view disabled) here does exactly what you originally seem to have expected: It only uses the red or blue framed box. If one toggled the check box, though, even if toggling back on again, it wants to save a new revision with some changes to your field.
Adobe Reader X! v11.0.2 starts in PDF/A read-only mode and displays the red frame. After leaving that r/o mode, though, it shows the default cross appearance. When it gets the focus it again uses the red and blued frames. When it loses focus, it goes back to the default appearances.
The behavior I observed in Adobe Reader X! seems to be what you observed in more cases.
Thus in essence the issue is that under certain circumstances (for me: not in PDF/A r/o mode, focus not on form field) some PDF vewers (for me: Adobe Reader XI) don't use your custom check box appearances but some standard ones, and you think that this is incorrect.
Unfortunately there is a hint in the PDF specification ISO 32000-1:2008 according to which viewers may (perhaps even shall) act just so. Table 189 in section 12.5.6.19 Widget Annotations explains the entries in an appearance characteristics dictionary (value of /MK in the widget dictionary; you do not provide one, thus defaults apply), among them /CA:
text string (Optional; button fields only) The widget annotation’s normal caption,
which shall be displayed when it is not interacting with the user.
Unlike the remaining entries listed in this Table, which apply only to
widget annotations associated with pushbutton fields (see Pushbuttons in
12.7.4.2, “Button Fields”), the CA entry may be used with any type of
button field, including check boxes (see Check Boxes in 12.7.4.2, “Button
Fields”) and radio buttons (Radio Buttons in 12.7.4.2, “Button Fields”).
In particular check boxes, therefore, whenever not interacting with the user, shall be displayed using their normal captions, not their appearances.
When there is no focus on a form field, Adobe Reader seems to think that the form is not interacting with the user, and therefore switches to display of caption instead of appearance.
Unfortunately the normal caption you can define for a button is but a text string which by default seems to be interpreted in the context of the Zapf Dingbats font (try /MK<</CA(1)>> for example). This is, though, where you should continue looking, maybe you can make it use some Type 3 font of your design containing a blue and a red square frame.

Explanation of part of a PDF file

This segment of a PDF file seems to cause Poppler to crash. Xpdf doesn't seem to choke on it. If I remove the /I1 Do and /I2 Do lines, the PDF file works fine. Can someone give me a quick explanation of those might be doing? Let me know if you need to see other parts of the PDF file.
1289 0 obj
<<
/Length 72
>>
stream
q
360.00 0 0 583.20 0 0 cm
/I1 Do
Q
q
360.00 0 0 583.20 0 0 cm
/I2 Do
Q
endstream
endobj
I1 and I2 are either images or form XObjects. Probably for some reason Poppler cannot decode their content and crashes. Even if I see the file I do not know the internals of Poppler so it is difficult to guess what it is causing the problem, unless it is an obvious error in the PDF structure.

Set PDF to print with no scaling

I am generating a PDF (using fpdf) and I am wondering if there is a way to set the document's properties to to default to print with no scaling.
So when you select print from the print dialogue menu, scaling is set to none. I'm trying to determine if this is a user setting or something I can control in the creation of the PDF.
Thanks in advance.
I've done it adding to the method _putcatalog() the following:
$this->_out('/ViewerPreferences [/PrintScaling/None]');
After the line:
$this->_out('/Type /Catalog');
Implementing a method is just fast and easy...
Print-scaling can be turned off for invividual PDF files using Adobe Acrobat, by going to File -> Preferences -> Advanced -> Page scaling. (You can try this using the trial version of Acrobat.)
As for achieving this in code, I've tried and failed to make it work, but the critical difference in the files seems to be:
10 0 obj
<</Metadata 2 0 R/Outlines 6 0 R/Pages 7 0 R/Type/Catalog/ViewerPreferences<</PrintScaling/None>>>>
endobj
for non-scaling PDFs, compared to
10 0 obj
<</Metadata 2 0 R/Outlines 6 0 R/Pages 7 0 R/Type/Catalog>>
endobj
for those that use the default shrink-to-fit option.
For me changing the FPDF Catalog method _putcatalog() and adding
$this->_out('/ViewerPreferences [/PrintScaling/None]');
wasn't accomplishing the goal so I looked at the code produced by a Acrobate XI PDF and found some more verbage. Adding the following code
$this->_out('/ViewerPreferences<</Duplex/Simplex/Enforce[/PrintScaling]/PrintScaling/None>>');
created a PDF that no longer defaulted to scaling and instead only gave the option to print Actual Size which was what was desired.
Scaling is controlled by the PDF application - it is not set in the file.
well i'm not sure if you mean somethink like this:
http://www.fpdf.org/en/doc/setdisplaymode.htm
or no "scaling" for images?
$im2 = pdf_open_image_file($dokument, 'jpeg', 'example.jpg');
pdf_place_image($dokument, $im2, 395, 655, 1.0); /* 1.0 = qualiti/scaling - 1.0 is original .../*
pdf_close_image($dokument, $im2);
I ran into the same problem.
I have several PDFs where the content of the PDF, that is text and images, go very near the PDFs border but still the print dialogue Preview/Acrobat suggests printing it in 100% scaling, thus cutting off the contents which aren't printable because of the printers natural margins.
Creating any PDF in Pages for example results in a PDF which is printed in 100% scaling by default.
However if I create a PDF using TCPDF which is related to FPDF than the printer dialog suggests to scale it in order to fit the page.
My suspicion is that the way the PDF is created is different. I suspect that Pages and other tools create separate layers and they are then handeled differently, possibly by a flag or something.
I compared the readable parts of my two PDF-Files and did come accross some differences, especially on how the documents begin. My knowledge of the PDF-Sources is, however very limited, so I can only guess what needs to change.
Is there a PDF-Reference where it is stated how to control the printable objects/areas?
Here the content of a minimal PDF which will be printed without scaling:
%PDF-1.4
1 0 obj
<< /Type /Catalog
/Outlines 2 0 R
/Pages 3 0 R
>>
endobj
2 0 obj
<< /Type /Outlines
/Count 0
>>
endobj
3 0 obj
<< /Type /Pages
/Kids [4 0 R]
/Count 1
>>
endobj
4 0 obj
<< /Type /Page
/Parent 3 0 R
/MediaBox [0 0 595 842]
/Contents 5 0 R
/Resources << /ProcSet 6 0 R
/Font << /F1 7 0 R >>
>>
>>
endobj
5 0 obj
<< /Length 73 >>
stream
BT
/F1 24 Tf
100 100 Td
(Hello World) Tj
ET
endstream
endobj
6 0 obj
[ /PDF /Text ]
endobj
7 0 obj
<< /Type /Font
/Subtype /Type1
/Name /F1
/BaseFont /Helvetica
/Encoding /MacRomanEncoding
>>
endobj
xref
0 8
0000000000 65535 f
0000000009 00000 n
0000000074 00000 n
0000000120 00000 n
0000000179 00000 n
0000000364 00000 n
0000000466 00000 n
0000000496 00000 n
trailer
<< /Size 8
/Root 1 0 R
>>
startxref
625
%%EOF
Ok, I think I got it.
Try this: open your TCPDF-created PDF and remove all occurenecs of viewerpreferences and any box-statements other than the MediaBox... doing so finally resulted in a print-default-scaling-free PDF :)
seams like those additional infos -intended for professional printing- only confuse the common pdf-viewer instead of helping with anything :)
Goto tcpdf.php and change line 8529 in method _putpages as follows
change
$boxes = array('MediaBox', 'CropBox', 'BleedBox', 'TrimBox', 'ArtBox');
into
$boxes = array('MediaBox');
In my PDF-output this instantly removed the scaling problem :)