I'm trying to manually write a simple PDF file that contains a title, some text, and an image. I found one example of a manually written "Hello world" and managed to change some things, but I can't get it working for another text object. I have looked for help on the internet but with no luck; I guess not many people write their own PDF files.
This is what I have so far:
%PDF-1.7
1 0 obj % entry point
<<
/Type /Catalog
/Pages 2 0 R
>>
endobj
2 0 obj
<<
/Type /Pages
/MediaBox [ 0 0 200 200 ]
/Count 1
/Kids [ 3 0 R ]
>>
endobj
3 0 obj
<<
/Type /Page
/Parent 2 0 R
/Resources <<
/Font <<
/F1 4 0 R
>>
>>
/Contents 4 0 R
>>
endobj
4 0 obj % page content
<<
/Length 20
>>
stream
BT
80 180 TD
/F1 14 Tf
(PDF) Tj
ET
endstream
endobj
5 0 obj % page content
<<
/Length 20
>>
stream
BT
50 70 TD
/F1 14 Tf
(this is a pdf) Tj
ET
endstream
endobj
trailer
<<
/Size 6
/Root 1 0 R
>>
startxref
492
%%EOF
I have tried adding another text object with the text "this is a pdf", but it won't show up. I don't know what could be wrong; I tried changing a few things, but with no luck. I don't have the image part working either, so some help with that would be nice.
This is a wiki about the "hello world" pdf I found:
http://www.gnupdf.org/Introduction_to_PDF
Adobe offers some explanation of how PDF works, but I can't find anything that would fix my problem:
http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_reference_1-7.pdf
This is not a valid PDF. If Acrobat opens it at all, it's because it has given up on the xref table and done a full scan of the file. 4 0 obj is not a font, although /F1 4 0 R in your resources says it is, and 5 0 obj is not referenced from anywhere.
The PDF specification requires an xref table that records the exact byte offset of each object in the file. You can't realistically write this by hand unless you intend to manually update the entire xref table every time you add or remove even 1 byte from the file.
You can write a PDF from scratch like this from code easily enough, but it will not work to just open a PDF in Notepad and start changing things, because the index (xref) immediately becomes corrupt.
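To illustrate what "from code" could look like, here is a minimal Python sketch that computes the xref offsets while serializing the objects. The helper names (build_pdf, stream, OBJECTS) are made up for this example, and the font object (standard Helvetica) is an assumption, since the question's file does not define a font. Both text objects live in one content stream, which the page references through /Contents.

```python
def stream(data):
    """Wrap raw content bytes in a stream object body with a correct /Length."""
    return b"<< /Length %d >>\nstream\n" % len(data) + data + b"\nendstream"


def build_pdf(objects):
    """Serialize object bodies (numbered from 1) and append a matching xref table."""
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # every entry is exactly 20 bytes
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


# Two text objects in a single content stream.
CONTENT = (
    b"BT\n/F1 14 Tf\n80 180 Td\n(PDF) Tj\nET\n"
    b"BT\n/F1 14 Tf\n50 70 Td\n(this is a pdf) Tj\nET"
)

OBJECTS = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Count 1 /Kids [ 3 0 R ] >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [ 0 0 200 200 ]"
    b" /Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",  # assumed font
    stream(CONTENT),
]
```

Writing build_pdf(OBJECTS) to a file opened in "wb" mode should give a file whose offsets and /Length always match the content, because they are recomputed on every run instead of being edited by hand.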
I'd also advise against putting comments throughout the file unless the comments start on new lines; otherwise some PDF parsers will get confused, as this is generally not expected. PDF files usually do not contain comments at all (with the exception of the second line, which Adobe recommends to be a comment of some non-ASCII characters so that FTP recognizes the file as binary), seeing as they are virtually impossible to write manually anyway.
http://www.adobe.com/devnet/pdf/pdf_reference.html
A few years ago, I wrote a book which covers exactly this sort of thing:
http://www.amazon.com/PDF-Explained-John-Whitington/dp/1449310028/
No free online version, I'm afraid. You can get all the same information from Adobe's own documentation, which is free, but it's a rather long document!
%PDF-1.7
4 0 obj
<</Type/ObjStm/N 3/First 14/Length 139>>
stream
1 0 2 41 3 76 <</Type/Catalog/Version/1.7/Pages 2 0 R>><</Type/Pages/Kids[3 0 R]/Count 1>><</Type/Page/MediaBox[0 0 200 200]/Parent 2 0 R>>
endstream
endobj
5 0 obj
<<
/Root 1 0 R
/ID[<7F1FE2C507E6DB4CB0787E660F2B0C65><2450E4E8FF5FC84380428886C0DD4C2F>]
/Size 6
/Index[1 5]
/W[1 4 1]
/Type/XRef
/Length 68
/Filter[/ASCIIHexDecode]
>>
stream
020000000400
020000000401
020000000402
010000000A00
01000000E500
endstream
endobj
startxref
229
%%EOF
The PDF above opens in Chrome (or Edge), but in Adobe Acrobat (Reader) it crashes. Ghostscript regards it as fine too. Note that it assumes CRLF for line breaks.
I read the parts of the PDF spec that are relevant for a basic PDF, and it seems that the above syntax follows it. Why doesn't Adobe like it?
Here is a link to the PDF. Notice how it opens in Chrome, but crashes in Adobe Acrobat. (This PDF uses LF for line breaks, and has a Resources dictionary on the page, based on the comments.)
Acrobat has the following two quirks, neither of which follows the spec:
If the XRef stream has a single filter, an array must not be used. So /Filter[/FlateDecode] won't work, and /Filter/FlateDecode will. This may apply to any stream object; I'm not sure.
An XRef stream must use the FlateDecode filter. ASCIIHexDecode won't work. A predictor is not required.
Here is a link to the above PDF, fixed up for Acrobat.
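As a rough illustration of the fixed-up form, the following Python sketch packs the same five entries as the hex lines in the question (with /W[1 4 1]) and compresses them with zlib, emitting /Filter as a plain name rather than an array. The function names and the exact dictionary layout are choices made for this example, not something taken from the linked file.

```python
import zlib


def pack_xref_entries(entries, widths=(1, 4, 1)):
    """Pack (type, field2, field3) tuples as big-endian fields of the given widths."""
    data = bytearray()
    for entry in entries:
        for value, width in zip(entry, widths):
            data += value.to_bytes(width, "big")
    return bytes(data)


def xref_stream_object(obj_num, entries, first, size, widths=(1, 4, 1)):
    """Build an XRef stream object using FlateDecode and a single-name /Filter."""
    payload = zlib.compress(pack_xref_entries(entries, widths))
    header = (
        "%d 0 obj\n<</Type/XRef/Root 1 0 R/Size %d/Index[%d %d]/W[%d %d %d]"
        "/Filter/FlateDecode/Length %d>>\nstream\n"
        % (obj_num, size, first, len(entries), *widths, len(payload))
    ).encode("ascii")
    return header + payload + b"\nendstream\nendobj\n"


# Same entries as the ASCIIHex lines above: objects 1-3 live in object
# stream 4, object 4 is at offset 0x0A, object 5 (the XRef stream) at 0xE5.
ENTRIES = [(2, 4, 0), (2, 4, 1), (2, 4, 2), (1, 0x0A, 0), (1, 0xE5, 0)]
```

Note that the offsets in ENTRIES and the startxref value would have to be recomputed for the rewritten file, since the compressed stream has a different length than the hex-encoded one.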
I'm writing a PDF generator for an embedded system and I can't seem to get its output to render. Here's what I have so far: http://pastebin.com/20DQiGfb.
Neither Ghostscript nor RUPS shows any syntax errors, and every PDF viewer I checked this on displays blank pages. I've verified the syntax, offsets, and structure to be fine. I just don't get it.
Maybe someone can suggest another tool I can try? Maybe one that would let me inspect the visual elements?
EDIT: adding the full file for download https://drive.google.com/file/d/0B8ljtiq-2gIJcTBFR2tjRGxVOFk/view?usp=sharing
EDIT2: Acrobat/Preflight also found no errors
These are your Page dictionaries:
7 0 obj
<< /Type /Page
/Parent 9 0 R
/Resources 2 0 R
/Content 3 0 R
/MediaBox [0 0 612 792]
>>
endobj
8 0 obj
<< /Type /Page
/Parent 9 0 R
/Resources 2 0 R
/Content 5 0 R
/MediaBox [0 0 612 792]
>>
endobj
But the correct key for the content stream is Contents!
Contents stream or array (Optional) A content stream (see 7.8.2, "Content Streams") that shall describe the contents of this page. If this entry is absent, the page shall be empty.
(Table 30 – Entries in a page object - ISO 32000-1)
After adding the missing s characters, the content of both pages is displayed.
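Because an absent /Contents entry is legal (the page is simply empty), validators have no reason to complain about it. A small Python sketch like the one below could flag such pages anyway. It is a crude byte-level scan that assumes uncompressed, non-encrypted page objects, and the function name is invented for this example.

```python
import re


def pages_without_contents(pdf_bytes):
    """Return object numbers of /Type /Page dictionaries that lack a /Contents key."""
    missing = []
    for chunk in pdf_bytes.split(b"endobj"):
        header = re.search(rb"(\d+)\s+\d+\s+obj", chunk)
        if not header:
            continue
        body = chunk[header.end():]
        # The lookahead keeps /Page from matching /Pages and
        # /Contents from matching a longer key name.
        is_page = re.search(rb"/Type\s*/Page(?![A-Za-z])", body)
        has_contents = re.search(rb"/Contents(?![A-Za-z])", body)
        if is_page and not has_contents:
            missing.append(int(header.group(1)))
    return missing
```

Run against the two page dictionaries shown above, it reports objects 7 and 8; once /Content is renamed to /Contents the list is empty.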
I have a PDF document.
1) Adobe Reader reads the document fine.
2) I sign the document (using pdfbox) and everything is fine.
3) I attach a file to the original PDF (using the code from the cookbook on the pdfbox web page).
4) Adobe Reader reads the document with the attachment fine.
5) Now I have a document with an attachment.
6) I try to sign that document (I mean the document with the attachment), and I get two problems:
First:
when I open the document, Adobe Reader tells me that the signature byte range is invalid.
Second:
when I try to close the document (I mean close Adobe Reader), Adobe Reader asks me:
Do you want to save changes to "original[with-attachment][signed]" before closing? I couldn't find out why this happens.
Here are the test files, uploaded to Google Docs.
The cause of your issue is that the process of signing original[with-attachment].pdf creates an incremental update with a cross reference stream while the source file has a cross reference table. When adding incremental updates, the new cross references must be of the same type as the old ones.
It is quite possible that this error is due to the process attaching attach.txt misbehaving a bit, too: it stores the file as a PDF with a cross reference table even though the original was a file with a cross reference stream, but at the same time leaves some elements from the former cross reference dictionary in the trailer of the new file. These left-over elements (which do not belong in a trailer dictionary) probably make your signing process think the source already uses a cross reference stream.
As this change of cross reference style between incremental updates is forbidden, the Adobe Reader tries to fix the document in memory. Such attempts to fix often give rise to unexpected Do you want to save changes to "original[with-attachment][signed]" before closing? warnings.
In the course of fixing the PDF, the whole PDF is rearranged. This shifts the signed bytes, which is why the signature byte range is reported as invalid.
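To check a file for this kind of mix, one can follow each startxref offset and look at what it points to. The Python sketch below does that in a simplified way; the function names are invented here, and it ignores hybrid-reference files (a table whose trailer carries /XRefStm) as well as the startxref 0 of the first section of a linearized file, which it reports as "unknown".

```python
import re


def xref_kinds(pdf_bytes):
    """List (offset, kind) for every startxref in the file, in file order."""
    kinds = []
    for match in re.finditer(rb"startxref\s+(\d+)\s+%%EOF", pdf_bytes):
        offset = int(match.group(1))
        head = pdf_bytes[offset:offset + 32]
        if head.startswith(b"xref"):
            kinds.append((offset, "table"))
        elif re.match(rb"\d+\s+\d+\s+obj", head):
            kinds.append((offset, "stream"))
        else:
            kinds.append((offset, "unknown"))
    return kinds


def mixes_xref_kinds(pdf_bytes):
    """True if the revisions use both cross reference tables and streams."""
    found = {kind for _, kind in xref_kinds(pdf_bytes) if kind != "unknown"}
    return len(found) > 1
```

For original[with-attachment][signed].pdf as listed below, this would be expected to report a table at 49755 followed by a stream at 89569.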
original.pdf
%PDF-1.3
%âãÏÓ
11 0 obj
<</Linearized 1/L 48987/O 13/E 37674/N 3/T 48682/H [ 480 178]>>
endobj
25 0 obj
<</DecodeParms<</Columns 4/Predictor 12>>/Filter/FlateDecode/ID[<321A6D6DCD0785E8E35BD4B13115140A><59793561FB914D408936FC170763541A>]/Index[11 22]/Info 10 0 R/Length 77/Prev 48683/Root 12 0 R/Size 33/Type/XRef/W[1 2 1]>>stream
hÞbbd``b`jŒ â`–,õ#‚µÄb‰í±#Ä"Q{$¬rÄ‚MLŒ³€,F¬ÄÆK¿ Mi
endstream
endobj
startxref
0
%%EOF
32 0 obj
[.........]
endobj
8 0 obj
<</DecodeParms<</Columns 3/Predictor 12>>/Filter/FlateDecode/ID[<321A6D6DCD0785E8E35BD4B13115140A><59793561FB914D408936FC170763541A>]/Info 10 0 R/Length 50/Root 12 0 R/Size 11/Type/XRef/W[1 2 0]>>stream
hÞbb```bœ¬ÅÄÀ°“‰A\š‰H³Îbbà)²'ñ5&F§Û#yF€ xi
endstream
endobj
startxref
116
%%EOF
original[with-attachment].pdf
%PDF-1.3
%öäüß
1 0 obj
[.........]
endobj
xref
0 33
0000000000 65535 f
0000000015 00000 n
[...]
0000049667 00000 n
0000049737 00000 n
trailer
<<
/DecodeParms <<
/Columns 4
/Predictor 12
>>
/Filter /FlateDecode
/ID [<321A6D6DCD0785E8E35BD4B13115140A> <59793561FB914D408936FC170763541A>]
/Info 5 0 R
/Length 77
/Root 1 0 R
/Size 33
/Type /XRef
/W [1 2 1]
/Index [11 22]
>>
startxref
49755
%%EOF
original[with-attachment][signed].pdf
%PDF-1.3
%öäüß
1 0 obj
[....as above....]
startxref
49755
%%EOF
1 0 obj
[.........]
endobj
37 0 obj
<<
/ID [<DC60F4419C05967B81D7F64090027D7F> <DC60F4419C05967B81D7F64090027D7F>]
/Info 5 0 R
/Root 1 0 R
/Prev 49755
/Type /XRef
/Size 38
/Filter /FlateDecode
/Index [1 1 6 1 33 4]
/W [1 3 0]
/Length 31
>>
stream
xœcd8ú‘1&ˆ‘áØ.F†ã¾ŒŒ±ù#| VÚ
endstream
endobj
startxref
89569
%%EOF
A side remark
ID management: Your process adding attachments keeps the whole ID. Your signing process drops the whole original ID of the PDF and replaces it with a new one:
original.pdf
/ID[<321A6D6DCD0785E8E35BD4B13115140A><59793561FB914D408936FC170763541A>]
original[with-attachment].pdf
/ID [<321A6D6DCD0785E8E35BD4B13115140A> <59793561FB914D408936FC170763541A>]
original[signed].pdf
/ID [<A9F7159B1E5D8285A68475689B750214> <A9F7159B1E5D8285A68475689B750214>]
original[with-attachment][signed].pdf
/ID [<DC60F4419C05967B81D7F64090027D7F> <DC60F4419C05967B81D7F64090027D7F>]
Both approaches are wrong: a process that manipulates a PDF, and therefore creates a new version of it, shall keep the first ID entry and replace only the second one with a unique new one.
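Expressed as a Python sketch, the rule could look like this. The function names are hypothetical, and using 16 random bytes for the new entry is an assumption of this example; any sufficiently unique value would serve.

```python
import os


def next_file_id(current_id=None):
    """Return the /ID pair for a new version of a document.

    current_id is a (first, second) tuple of bytes, or None for a new file.
    """
    fresh = os.urandom(16)  # assumption: random bytes as the unique part
    if current_id is None:
        return (fresh, fresh)  # a brand-new document starts with two equal entries
    return (current_id[0], fresh)  # keep the first entry, replace the second


def format_id(file_id):
    """Render the pair the way it appears in a trailer dictionary."""
    return "/ID [<%s> <%s>]" % (file_id[0].hex().upper(), file_id[1].hex().upper())
```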
I've built a PDF with pdfbox and by hand. It also has a visible signature. Everything works, but no image and no text is shown in the PDF (there is a visible rectangle, but without image or text). What do you think is happening?
Can you have a look at the sample?
Here is the sample.
Thank you.
I've built a PDF with pdfbox and by hand. [...] no image and no text is shown in the PDF (there is a visible rectangle, but without image or text).
That is exactly what you built your document and especially the signature related data to do:
3 0 obj
<<
/FT /Sig
/F 132
/T (Signature1)
/Type /Annot
/Subtype /Widget
/V 5 0 R
/P 4 0 R
/Rect [100 574 310 625]
/AP << /N 6 0 R >>
/DR << /XObject << /FRM0 7 0 R >> >>
>>
endobj
6 0 obj
<<
/Type /XObject
/Subtype /Form
/Resources << /XObject << /FRM0 7 0 R >> >>
/BBox [0 0 100 100]
/FormType 1
/Length 8 0 R
>>
stream
endstream
endobj
There is a visible rectangle (actually after selecting the signature in question) because /Rect [100 574 310 625] in your signature field dictionary indicates the rectangular area where you have your signature.
There is no image and text shown in PDF because the normal appearance stream (which according to /AP << /N 6 0 R >> in your signature field dictionary is defined in object 6) is defined as an empty stream (there is nothing but white space between stream and endstream).
Most likely you wanted to place the xobject /FRM0 defined in the resources of the appearance stream. In that case you have the same problem in that xobject:
7 0 obj
<<
/Type /XObject
/Subtype /Form
/Resources << /XObject << /n0 9 0 R /n1 10 0 R >> >>
/BBox [0 0 100 100]
/FormType 1
/Length 11 0 R
>>
stream
endstream
endobj
This stream is empty as well; you forgot to place the xobjects /n0 and /n1.
Those xobjects look correctly defined but seem to be copied from samples from the early age of integrated PDF signatures.
Concerning the Adobe Acrobat error message observed by #stanlyF:
Error during signature verification.
Signature contains incorrect, unrecognized, corrupted or suspicious data.
Support Information: SigDict /SubFilter value
The signature value dictionary also is incomplete:
5 0 obj
<<
/Type /Sig
/Name (sig1)
/ByteRange [0 0 0 0]
/Contents <0000...0000>
>>
endobj
The dictionary has neither a /Filter nor a /SubFilter entry. While nominally the filter is required and the subfilter is optional, interoperable signing mostly depends on the subfilter, and the filter is ignored. Hence the Support Information.
The /Name entry is weird because it is specified to contain the name of the person or authority signing the document (if present).
The signed byte range is empty: it consists of two segments, both of them starting at offset 0 and being 0 bytes long.
The contained signature container itself consists only of 0x00 bytes.
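For the byte range in particular, a quick plausibility check can be sketched in Python. The rules below are the ones usually expected of a signature that covers the whole file except the /Contents value; they are assumptions of this example, not a complete validation, and the function name is made up.

```python
def check_byte_range(byte_range, file_size):
    """Return a list of problems found in a four-element /ByteRange array."""
    if len(byte_range) != 4:
        return ["expected exactly four integers"]
    start1, len1, start2, len2 = byte_range
    problems = []
    if start1 != 0:
        problems.append("first segment does not start at offset 0")
    if len1 <= 0 or len2 <= 0:
        problems.append("a signed segment is empty")
    if start2 <= start1 + len1:
        problems.append("no gap left for the /Contents value")
    if start2 + len2 != file_size:
        problems.append("second segment does not end at the end of the file")
    return problems
```

For /ByteRange [0 0 0 0] and any non-empty file, this reports three problems: empty segments, no gap, and a range that stops short of the end of the file.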
Acrobat said:
"Error during signature verification.
Signature contains incorrect, unrecognized, corrupted or suspicious data.
Support Information: SigDict /SubFilter value"
The signature has an incorrect/incomplete content-closing marker.
Also, the /n0 and /n1 XObjects in the resources contain no PDF instructions.
I am generating a PDF (using fpdf) and I am wondering if there is a way to set the document's properties so that it defaults to printing with no scaling.
That is, when you open the print dialog, scaling is already set to none. I'm trying to determine whether this is a user setting or something I can control when creating the PDF.
Thanks in advance.
I did it by adding the following to the method _putcatalog():
$this->_out('/ViewerPreferences [/PrintScaling/None]');
After the line:
$this->_out('/Type /Catalog');
Implementing a method is just fast and easy...
Print scaling can be turned off for individual PDF files using Adobe Acrobat, by going to File -> Preferences -> Advanced -> Page scaling. (You can try this using the trial version of Acrobat.)
As for achieving this in code, I've tried and failed to make it work, but the critical difference in the files seems to be:
10 0 obj
<</Metadata 2 0 R/Outlines 6 0 R/Pages 7 0 R/Type/Catalog/ViewerPreferences<</PrintScaling/None>>>>
endobj
for non-scaling PDFs, compared to
10 0 obj
<</Metadata 2 0 R/Outlines 6 0 R/Pages 7 0 R/Type/Catalog>>
endobj
for those that use the default shrink-to-fit option.
For me, changing the FPDF catalog method _putcatalog() and adding
$this->_out('/ViewerPreferences [/PrintScaling/None]');
did not accomplish the goal (note that the value there is an array, whereas /ViewerPreferences has to be a dictionary), so I looked at a PDF produced by Acrobat XI and found some more verbiage. Adding the following code
$this->_out('/ViewerPreferences<</Duplex/Simplex/Enforce[/PrintScaling]/PrintScaling/None>>');
created a PDF that no longer defaulted to scaling and instead only offered the option to print at Actual Size, which is what I wanted.
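The shape of that catalog entry can be sketched in a few lines of Python. This only builds the string; the function name is hypothetical, and the optional /Enforce array is taken from the Acrobat XI output quoted above, so older viewers may ignore it.

```python
def viewer_preferences(print_scaling="None", enforce=False):
    """Build a /ViewerPreferences catalog entry as a dictionary, not an array."""
    if print_scaling not in ("None", "AppDefault"):
        raise ValueError("PrintScaling must be 'None' or 'AppDefault'")
    parts = ["/PrintScaling/%s" % print_scaling]
    if enforce:
        parts.append("/Enforce[/PrintScaling]")
    return "/ViewerPreferences<<%s>>" % "".join(parts)
```

The returned string is what would be written into the catalog next to /Type /Catalog, for example through FPDF's _out() in the snippets above.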
Scaling is controlled by the PDF application - it is not set in the file.
Well, I'm not sure if you mean something like this:
http://www.fpdf.org/en/doc/setdisplaymode.htm
or no "scaling" for images?
$im2 = pdf_open_image_file($dokument, 'jpeg', 'example.jpg');
pdf_place_image($dokument, $im2, 395, 655, 1.0); /* last argument is the scaling factor; 1.0 keeps the original size */
pdf_close_image($dokument, $im2);
I ran into the same problem.
I have several PDFs where the content, that is text and images, goes very near the page border, yet the print dialog in Preview/Acrobat still suggests printing at 100% scaling, which cuts off the content that falls into the printer's unprintable margins.
Creating any PDF in Pages, for example, results in a PDF which is printed at 100% scaling by default.
However, if I create a PDF using TCPDF, which is related to FPDF, then the print dialog suggests scaling it to fit the page.
My suspicion is that the way the PDF is created is different. I suspect that Pages and other tools create separate layers which are then handled differently, possibly by a flag or something.
I compared the readable parts of my two PDF files and did come across some differences, especially in how the documents begin. My knowledge of PDF source is, however, very limited, so I can only guess what needs to change.
Is there a PDF reference which states how to control the printable objects/areas?
Here is the content of a minimal PDF which will be printed without scaling:
%PDF-1.4
1 0 obj
<< /Type /Catalog
/Outlines 2 0 R
/Pages 3 0 R
>>
endobj
2 0 obj
<< /Type /Outlines
/Count 0
>>
endobj
3 0 obj
<< /Type /Pages
/Kids [4 0 R]
/Count 1
>>
endobj
4 0 obj
<< /Type /Page
/Parent 3 0 R
/MediaBox [0 0 595 842]
/Contents 5 0 R
/Resources << /ProcSet 6 0 R
/Font << /F1 7 0 R >>
>>
>>
endobj
5 0 obj
<< /Length 73 >>
stream
BT
/F1 24 Tf
100 100 Td
(Hello World) Tj
ET
endstream
endobj
6 0 obj
[ /PDF /Text ]
endobj
7 0 obj
<< /Type /Font
/Subtype /Type1
/Name /F1
/BaseFont /Helvetica
/Encoding /MacRomanEncoding
>>
endobj
xref
0 8
0000000000 65535 f
0000000009 00000 n
0000000074 00000 n
0000000120 00000 n
0000000179 00000 n
0000000364 00000 n
0000000466 00000 n
0000000496 00000 n
trailer
<< /Size 8
/Root 1 0 R
>>
startxref
625
%%EOF
OK, I think I got it.
Try this: open your TCPDF-created PDF and remove all occurrences of ViewerPreferences and any box statements other than the MediaBox. Doing so finally resulted in a PDF that prints without default scaling :)
It seems that this additional information, intended for professional printing, only confuses the common PDF viewer instead of helping with anything :)
Go to tcpdf.php and change line 8529 in method _putpages as follows
change
$boxes = array('MediaBox', 'CropBox', 'BleedBox', 'TrimBox', 'ArtBox');
into
$boxes = array('MediaBox');
In my PDF-output this instantly removed the scaling problem :)