PDF signing and re-signing is not recognized when the signatures are visible

The idea is to be able to sign a PDF file multiple times with my own PDF parser.
Reference files: here.
When the signature is invisible, everything works. I sign 1.pdf once (2.pdf) and then a second time (3.pdf), and Adobe Acrobat recognizes the signatures.
The problem arises when the signature should be visible. The first signing works correctly (2.pdf). However, the second (3.pdf) fails: Acrobat says the first signature has been invalidated and does not recognize the second.
As far as I can tell, the only difference between the visible and invisible cases is the added text object. Why does Adobe invalidate the first signature, and why isn't the second recognized?
28 0 obj
<</BaseFont/Helvetica/Type/Font/Subtype/Type1/Encoding/WinAnsiEncoding/Name/Helv>>
endobj
29 0 obj
<</BaseFont/ZapfDingbats/Type/Font/Subtype/Type1/Name/ZaDb>>
endobj
31 0 obj<</Font 32 0 R>>
endobj
32 0 obj<</FAdESFont2 33 0 R>>
endobj
33 0 obj<</Type /Font /Subtype /Type1 /BaseFont /Helvetica>>
endobj
34 0 obj
<</Length 90>>stream
BT
6 760 TD
/FAdESFont2 6 Tf
(m#turboirc.com MICHAIL CHOURDAKIS 1/23/2023 17:24:10) Tj
ET
endstream
endobj
26 0 obj
<</Type/XObject/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]>>/Subtype/Form/BBox[0 0 0 0]/Matrix [1 0 0 1 0 0]/Length 8/FormType 1/Filter/FlateDecode>>stream
xœ
endstream
endobj
3 0 obj
<</Contents[34 0 R 24 0 R 12 0 R]/CropBox[0.0 0.0 612.0 792.0]/MediaBox[0.0 0.0 612.0 792.0]/Parent 2 0 R/Resources 13 0 R/Rotate 0/Type/Page/Annots[17 0 R 27 0 R]>>
endobj
2 0 obj
<</Count 1/Kids[3 0 R]/Type/Pages>>
endobj
1 0 obj
<</AcroForm<</Fields[17 0 R 27 0 R]/DR<</Font<</Helv 28 0 R/ZaDb 29 0 R>>>>/DA(/Helv 0 Tf 0 g )/SigFlags 3>>/AcroForm<</Fields[17 0 R]/DR<</Font<</Helv 18 0 R/ZaDb 19 0 R>>>>/DA(/Helv 0 Tf 0 g )/SigFlags 3>>/Pages 2 0 R/Type/Catalog>>
endobj
14 0 obj
<</Producer(AdES Tools https://www.turboirc.com)/ModDate(D:20230123152410+00'00')>>
endobj
xref

Why does Adobe invalidate the first signature, and why isn't the second recognized?
Because you add the visualizations of the signatures in an inappropriate way.
You add the visualizations by adding to the static page content (the page content streams). That is the wrong approach if you want to be able to sign already signed PDFs, because changing the static page content after signing is a forbidden change; see this answer.
The appropriate way to add a visualization of a PDF signature is to add an appearance stream to the respective signature field widget.
For details, you may want to study the PDF specification, ISO 32000.

Related

Visible signature in PDF file (2)

Continuing from this question, the PDF is now constructed as such:
8 0 obj
<</F 132/Type/Annot/Subtype/Widget/Rect[2 198 100 190]/FT/Sig/DR<<>>/T(Signature1)/V 6 0 R/P 3 0 R/AP<</N 7 0 R>>>>
endobj
6 0 obj
<</Contents <...>/Type/Sig/SubFilter/ETSI.CAdES.detached/M(D:20230128131946+00'00')/ByteRange [0 830 60832 1714]/Filter/Adobe.PPKLite>>
endobj
9 0 obj
<</BaseFont/Helvetica/Type/Font/Subtype/Type1/Encoding/WinAnsiEncoding/Name/Helv>>
endobj
10 0 obj
<</BaseFont/ZapfDingbats/Type/Font/Subtype/Type1/Name/ZaDb>>
endobj
12 0 obj<</Font 13 0 R>>
endobj
13 0 obj<</FAdESFont1 14 0 R>>
endobj
14 0 obj<</Type /Font /Subtype /Type1 /BaseFont /Helvetica>>
endobj
15 0 obj
<</Length 90>>stream
BT
2 194 TD
/FAdESFont1 5 Tf
(m#turboirc.com MICHAIL CHOURDAKIS 1/28/2023 15:19:46) Tj
ET
endstream
endobj
7 0 obj
<</Type/XObject/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]>>/Subtype/Form/BBox[2 198 100 190]/Length 90/FormType 1/Filter/FlateDecode>>stream
BT
2 194 TD
/FAdESFont1 5 Tf
(m#turboirc.com MICHAIL CHOURDAKIS 1/28/2023 15:19:46) Tj
ET
endstream
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Resources<</Font<</F1 4 0 R>>>>/Contents 5 0 R/Annots[8 0 R]>>
endobj
2 0 obj
<</Type/Pages/MediaBox[0 0 200 200]/Count 1/Kids[3 0 R]>>
endobj
1 0 obj
<</AcroForm<</Fields[8 0 R]/DR<</Font<</Helv 9 0 R/ZaDb 10 0 R>>>>/DA(/Helv 0 Tf 0 g )/SigFlags 3>>/Type/Catalog/Pages 2 0 R>>
endobj
11 0 obj
<</Producer(AdES Tools https://www.turboirc.com)/ModDate(D:20230128131946+00'00')>>
endobj
xref
0 4
0000000000 65535 f
0000061862 00000 n
0000061787 00000 n
0000061681 00000 n
6 10
0000000810 00000 n
0000061409 00000 n
0000000679 00000 n
0000060958 00000 n
0000061056 00000 n
0000062004 00000 n
0000061133 00000 n
0000061165 00000 n
0000061203 00000 n
0000061271 00000 n
trailer
<</Root 1 0 R/Prev 492/Info 11 0 R/Size 20/ID[<6BD3BF95416A5C19FFBC464EC610875C><54ACC00AA74869363131BCC04E65417F>]>>
startxref
62104
%%EOF
The idea is:
Create the annotation object (ID 8), which refers to the signature via /V (6) and to something to show via /N (7).
The appearance object (ID 7) is a stream containing the text:
7 0 obj <</Type/XObject/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]>>/Subtype/Form/BBox[2 198 100 190]/Length 90/FormType 1/Filter/FlateDecode>>stream
BT
2 194 TD
/FAdESFont1 5 Tf
(m#turboirc.com MICHAIL CHOURDAKIS 1/28/2023 15:19:46) Tj
ET
endstream
endobj
This time Adobe accepts the signature and shows a "box" I can click to display the signature information, but the text (mail, name, date) is not displayed.
What am I missing?
In the previous approach I was changing the page content of the original document, but I learned from this question that this is an incorrect way of adding a visible signature and will not work for re-signing.
Your appearance stream in object 7 has some errors, in particular:
Its resources dictionary does not contain a font section, so how should the text in it be rendered?
It claims to be Flate-encoded but obviously is not.
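A minimal Python sketch of how both points could be addressed. The helper name build_appearance_xobject, the resource name /F1, and the choice of a BBox that starts at the origin of the form's own coordinate space are assumptions for illustration and are not taken from the original files. It only shows that the resources dictionary names a font and that the data is really deflated whenever /Filter/FlateDecode is declared.

```python
import zlib


def build_appearance_xobject(obj_num, text, font_obj_num, width, height,
                             font_size=5, compress=True):
    """Serialize a form XObject usable as a widget's /AP /N appearance stream."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    content = (
        f"BT\n/F1 {font_size} Tf\n2 {height - font_size - 1} Td\n"
        f"({escaped}) Tj\nET\n"
    ).encode("latin-1")  # assumption: the text fits in Latin-1
    filter_entry = b""
    if compress:
        # Only declare FlateDecode when the data is actually deflated.
        content = zlib.compress(content)
        filter_entry = b"/Filter/FlateDecode"
    header = (
        b"%d 0 obj\n<</Type/XObject/Subtype/Form/FormType 1"
        b"/BBox[0 0 %d %d]"
        b"/Resources<</Font<</F1 %d 0 R>>>>"  # the font the Tf operator refers to
        b"%s/Length %d>>stream\n"
        % (obj_num, width, height, font_obj_num, filter_entry, len(content))
    )
    return header + content + b"\nendstream\nendobj\n"
```

For the widget in the question, whose /Rect spans 98 x 8 units and whose Helvetica font is object 14, a call could look like build_appearance_xobject(7, "some text", 14, 98, 8). Passing compress=False leaves the stream readable and omits the /Filter entry.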

How are PDFs built?

Where do I find information about how a PDF is made up?
For example, a PDF I created named Dokname, containing the string TEST, looks like this when opened in a text editor:
(I replaced the parts the text editor couldn't decode with [...])
%PDF-1.4
%Óëéá
1 0 obj
<</Title (Dokname)
/Producer (Skia/PDF m102 Google Docs Renderer)>>
endobj
3 0 obj
<</ca 1
/BM /Normal>>
endobj
5 0 obj
<</Filter /FlateDecode
/Length 160>> stream
[...]
endstream
endobj
2 0 obj
<</Type /Page
/Resources <</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]
/ExtGState <</G3 3 0 R>>
/Font <</F4 4 0 R>>>>
/MediaBox [0 0 596 842]
/Contents 5 0 R
/StructParents 0
/Parent 6 0 R>>
endobj
6 0 obj
<</Type /Pages
/Count 1
/Kids [2 0 R]>>
endobj
7 0 obj
<</Type /Catalog
/Pages 6 0 R>>
endobj
8 0 obj
<</Length1 14972
/Filter /FlateDecode
/Length 7164>> stream
[...]
endstream
endobj
9 0 obj
<</Type /FontDescriptor
/FontName /AAAAAA+ArialMT
/Flags 4
/Ascent 905.27344
/Descent -211.91406
/StemV 45.898438
/CapHeight 715.82031
/ItalicAngle 0
/FontBBox [-664.55078 -324.70703 2000 1005.85938]
/FontFile2 8 0 R>>
endobj
10 0 obj
<</Type /Font
/FontDescriptor 9 0 R
/BaseFont /AAAAAA+ArialMT
/Subtype /CIDFontType2
/CIDToGIDMap /Identity
/CIDSystemInfo <</Registry (Adobe)
/Ordering (Identity)
/Supplement 0>>
/W [0 [750] 40 54 666.99219 55 [610.83984]]
/DW 0>>
endobj
11 0 obj
<</Filter /FlateDecode
/Length 243>> stream
[...]
endstream
endobj
4 0 obj
<</Type /Font
/Subtype /Type0
/BaseFont /AAAAAA+ArialMT
/Encoding /Identity-H
/DescendantFonts [10 0 R]
/ToUnicode 11 0 R>>
endobj
xref
0 12
0000000000 65535 f
0000000015 00000 n
0000000365 00000 n
0000000098 00000 n
0000008721 00000 n
0000000135 00000 n
0000000573 00000 n
0000000628 00000 n
0000000675 00000 n
0000007925 00000 n
0000008159 00000 n
0000008407 00000 n
trailer
<</Size 12
/Root 7 0 R
/Info 1 0 R>>
startxref
8860
%%EOF
What do these obj elements represent? Where is my TEST? Why did it get scrambled?
What I am searching for can probably all be found in Adobe's documentation, but that runs to hundreds of pages, which is very overwhelming. I get that this is a very complex topic and I am not trying to understand it completely; I'm just looking for an introduction or an overview. Unfortunately, I didn't find anything like that on YouTube or elsewhere.
This is too complex for comments, and yes, you will only find snippets here and there, including this one and bits in my and others' answers.
Here is a quick overview of the code sample you provided.
A PDF is a collection of objects that may be placed in any order. So you start at the end, just before the last %%EOF (potentially one of many!), with startxref 8860, where 8860 is the decimal byte offset of the cross-reference (xref) table, i.e. the file's index.
There are many abbreviations (too many to list) and, as in a stack language, most things may appear (literally) backwards. The xref table points to each object's position in the file.
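The lookup described above can be sketched in Python. This is only an illustration under stated assumptions: the function name read_xref_table is made up, it handles only the classic text xref table shown in the question (not the cross-reference streams that newer PDFs may use), and it reads only the last section without following /Prev.

```python
import re


def read_xref_table(pdf: bytes) -> dict:
    """Return {object number: byte offset} from the last classic xref section."""
    # The offset of the xref section follows the final "startxref" keyword.
    tail = pdf.rfind(b"startxref")
    if tail < 0:
        raise ValueError("no startxref keyword found")
    xref_pos = int(pdf[tail + len(b"startxref"):].split()[0])
    lines = pdf[xref_pos:].splitlines()
    if not lines or lines[0].strip() != b"xref":
        raise ValueError("offset does not point at a classic xref table")
    offsets, i = {}, 1
    while i < len(lines):
        # Each subsection starts with "first_object_number entry_count".
        header = re.fullmatch(rb"(\d+) (\d+)\s*", lines[i])
        if not header:
            break  # reached "trailer"
        first, count = int(header.group(1)), int(header.group(2))
        for n in range(count):
            entry = lines[i + 1 + n].split()
            if entry[2] == b"n":  # "n" = in use, "f" = free
                offsets[first + n] = int(entry[0])
        i += 1 + count
    return offsets
```

Run against the file in the question, such a function would be expected to map object 7 (the catalog) to the offset listed in its xref entry, which is where a reader jumps next.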
The prime target in this case is 7 0 obj <</Type /Catalog /Pages 6 0 R>> endobj, since the catalog tells us where the page tree is found: object 6, with /Pages /Count 1 /Kids [2 0 R]. So there is one page, which is further defined in 2 0 obj.
We now see resources (image ProcSet entries and font(s)) placed within /MediaBox [0 0 596 842], which is roughly (a tad wider than) a standard A4 page, since 595/72" is closer to 210 mm.
There is too much to describe about that one item alone, so skipping to "Where is your text?": we see /Contents 5 0 R, so that compressed stream of data, which you need to decode, most likely holds your text. Its 160 bytes (/Length 160) are the binary Flate-encoded stream, which contains positioning operators as well, not just your raw plain text.
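To look inside such a stream, the bytes between stream and endstream can be inflated with zlib. The following is a rough Python sketch, not a real parser: the name inflate_streams is hypothetical, it finds streams with a regular expression instead of honouring /Length, and it silently skips anything that is not Flate data. It also needs the original binary file, since a copy pasted from a text editor will have damaged streams.

```python
import re
import zlib


def inflate_streams(pdf: bytes):
    """Yield the decompressed bytes of every stream that inflates as Flate data."""
    # The lookbehind keeps the "stream" inside "endstream" from matching.
    pattern = rb"(?<!end)stream\r?\n(.*?)endstream"
    for match in re.finditer(pattern, pdf, re.DOTALL):
        inflater = zlib.decompressobj()
        try:
            # decompressobj ignores the end-of-line bytes left before "endstream".
            yield inflater.decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-encoded, or damaged
```

Even after inflating, the word TEST need not appear literally: with a Type0 font using /Encoding /Identity-H, as in object 4 of the question, the strings in the content stream are glyph codes, and the /ToUnicode stream is what maps them back to characters.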
The quantity of data for subsetting the font seems odd and excessive for just 4 letters (if it were the similar Helvetica it would not need to be included, nor would the font need breaking up as CID ArialMT), and without the full file it's hard to say why the /Image* entries are there, but it is the Google Docs Renderer!
My suspicion is that we may see characteristics of OCR in that stream.

Set User Units when generating a PDF

I need to change the default user unit in a generated pdf file. Here is a minimal example which displays, but without the correct document size.
%PDF-1.7
1 0 obj
<< /Type /Catalog
/Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page
/Parent 2 0 R
/UserUnit 2.83
/MediaBox [0 0 2440 1220]
/Contents 4 0 R >>
endobj
4 0 obj
<< /Length 44 >>
stream
0.3 0.5 0.2 0.1 k
100 100 400 400 re
f
endstream
endobj
xref
0 5
0000000000 65535 f
0000000009 00000 n
0000000058 00000 n
0000000117 00000 n
0000000221 00000 n
trailer
<< /Size 5
/Root 1 0 R >>
startxref
309
%%EOF
If you open this file in a PDF viewer, it's as if the UserUnit default has not been changed.
I need to get the user units as close to millimetres as possible. The graphics in this file are to be printed onto board then cut out with a CNC machine so there needs to be some level of accuracy with the printing.
How do you set the UserUnit value correctly?
Never assume Apple Preview does the correct thing with PDF files.
If you open this in Adobe Acrobat, the reported page size is 2436 x 1218mm, which I believe is correct for your UserUnit value.
The box looks the same size proportionally as what is shown in Preview, so I'm going to assume that one is drawn correctly as well.
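The arithmetic behind that page size can be checked with a few lines of Python. The function name page_size_mm is made up for this sketch; the only facts used are that a default user unit is 1/72 inch and that an inch is 25.4 mm.

```python
def page_size_mm(media_box, user_unit=1.0):
    """Convert a MediaBox given in user units to millimetres."""
    x0, y0, x1, y1 = media_box
    unit_mm = user_unit / 72 * 25.4  # size of one user unit in millimetres
    return ((x1 - x0) * unit_mm, (y1 - y0) * unit_mm)


width, height = page_size_mm([0, 0, 2440, 1220], 2.83)
print(round(width), round(height))  # 2436 1218

# A user unit of exactly one millimetre would be 72 / 25.4, about 2.8346.
MM_USER_UNIT = 72 / 25.4
```

So with /UserUnit 2.83 each unit is slightly under a millimetre, which accounts for 2436 x 1218 rather than 2440 x 1220; a value closer to 2.8346 should bring the units nearer to true millimetres in viewers that honour UserUnit.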

What is the smallest possible valid PDF?

Out of simple curiosity, having seen the smallest GIF, what is the smallest possible valid PDF file?
This is an interesting problem. Taking it by the book, you can start off with this:
%PDF-1.0
1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj 2 0 obj<</Type/Pages/Kids[3 0 R]/Count 1>>endobj 3 0 obj<</Type/Page/MediaBox[0 0 3 3]>>endobj
xref
0 4
0000000000 65535 f
0000000010 00000 n
0000000053 00000 n
0000000102 00000 n
trailer<</Size 4/Root 1 0 R>>
startxref
149
%EOF
which is 291 bytes of PDF joy. Acrobat opens it, but it complains somewhat. There is one page in it and it is 3/72" square, the minimum allowed by the spec.
However, Acrobat X doesn't even bother with the cross reference table anymore, so we can take that out:
%PDF-1.0
1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj 2 0 obj<</Type/Pages/Kids[3 0 R]/Count 1>>endobj 3 0 obj<</Type/Page/MediaBox[0 0 3 3]>>endobj
trailer<</Size 4/Root 1 0 R>>
Acrobat complains, but opens it. Now we're at 178 bytes.
Turns out that you don't need that /Size in the trailer. Now we're at 172:
%PDF-1.0
1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj 2 0 obj<</Type/Pages/Kids[3 0 R]/Count 1>>endobj 3 0 obj<</Type/Page/MediaBox[0 0 3 3]>>endobj
trailer<</Root 1 0 R>>
Turns out you don't need all those pesky /Type elements in your dictionaries:
%PDF-1.0
1 0 obj<</Pages 2 0 R>>endobj 2 0 obj<</Kids[3 0 R]/Count 1>>endobj 3 0 obj<</MediaBox[0 0 3 3]>>endobj
trailer<</Root 1 0 R>>
Now we're at 138 bytes.
It also turns out that when the spec says "shall be an indirect reference" and /Count is required, and the header "must" be %PDF-1.0, they're making loose suggestions. This is the smallest I could make it and have it openable in Acrobat X:
%PDF-1.
trailer<</Root<</Pages<</Kids[<</MediaBox[0 0 3 3]>>]>>>>>>
70 bytes.
Now, my editor uses Windows newline discipline, but Acrobat accepts Windows, Mac, or Unix conventions, so by using a hex editor, I replaced the \r\n with \r and removed the last newline altogether, which leaves me with 67 bytes
25 50 44 46 2D 31 2E 0D 74 72 61 69 6C 65 72 3C
3C 2F 52 6F 6F 74 3C 3C 2F 50 61 67 65 73 3C 3C
2F 4B 69 64 73 5B 3C 3C 2F 4D 65 64 69 61 42 6F
78 5B 30 20 30 20 33 20 33 5D 3E 3E 5D 3E 3E 3E
3E 3E 3E
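For anyone who wants to recreate that file without a hex editor, here is a small Python sketch that turns the dump above back into bytes. The output file name is an arbitrary choice.

```python
HEX_DUMP = (
    "25 50 44 46 2D 31 2E 0D 74 72 61 69 6C 65 72 3C "
    "3C 2F 52 6F 6F 74 3C 3C 2F 50 61 67 65 73 3C 3C "
    "2F 4B 69 64 73 5B 3C 3C 2F 4D 65 64 69 61 42 6F "
    "78 5B 30 20 30 20 33 20 33 5D 3E 3E 5D 3E 3E 3E "
    "3E 3E 3E"
)

# bytes.fromhex skips the spaces between the byte pairs.
pdf = bytes.fromhex(HEX_DUMP)
print(len(pdf))  # 67

# Write in binary mode so that no newline translation happens.
with open("smallest.pdf", "wb") as handle:
    handle.write(pdf)
```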
I tried taking off the last end dictionary (>>), but Acrobat wouldn't have that. The PDF reader built into Google Chrome (Foxit) won't open it.
As a PostScript (HA! See what I did there?), if you consent to Acrobat "repairing" the file, it bumps up to 3550 bytes, most of it optional metadata, but it leaves behind a number of clear spec violations.
I could not get the hello world example to open.
For a small-ish file with text content :
%PDF-1.2
9 0 obj
<<
>>
stream
BT/ 9 Tf(Test)' ET
endstream
endobj
4 0 obj
<<
/Type /Page
/Parent 5 0 R
/Contents 9 0 R
>>
endobj
5 0 obj
<<
/Kids [4 0 R ]
/Count 1
/Type /Pages
/MediaBox [ 0 0 99 9 ]
>>
endobj
3 0 obj
<<
/Pages 5 0 R
/Type /Catalog
>>
endobj
trailer
<<
/Root 3 0 R
>>
%%EOF
Based on all the answers here, here's the smallest PDF with text:
SMALL_PDF = (
b"%PDF-1.2 \n"
b"9 0 obj\n<<\n>>\nstream\nBT/ 32 Tf( YOUR TEXT HERE )' ET\nendstream\nendobj\n"
b"4 0 obj\n<<\n/Type /Page\n/Parent 5 0 R\n/Contents 9 0 R\n>>\nendobj\n"
b"5 0 obj\n<<\n/Kids [4 0 R ]\n/Count 1\n/Type /Pages\n/MediaBox [ 0 0 250 50 ]\n>>\nendobj\n"
b"3 0 obj\n<<\n/Pages 5 0 R\n/Type /Catalog\n>>\nendobj\n"
b"trailer\n<<\n/Root 3 0 R\n>>\n"
b"%%EOF"
)
As base64. Copy this and test in Chrome:
data:application/pdf;base64,JVBERi0xLjIgCjkgMCBvYmoKPDwKPj4Kc3RyZWFtCkJULyAzMiBUZiggIFlPVVIgVEVYVCBIRVJFICAgKScgRVQKZW5kc3RyZWFtCmVuZG9iago0IDAgb2JqCjw8Ci9UeXBlIC9QYWdlCi9QYXJlbnQgNSAwIFIKL0NvbnRlbnRzIDkgMCBSCj4+CmVuZG9iago1IDAgb2JqCjw8Ci9LaWRzIFs0IDAgUiBdCi9Db3VudCAxCi9UeXBlIC9QYWdlcwovTWVkaWFCb3ggWyAwIDAgMjUwIDUwIF0KPj4KZW5kb2JqCjMgMCBvYmoKPDwKL1BhZ2VzIDUgMCBSCi9UeXBlIC9DYXRhbG9nCj4+CmVuZG9iagp0cmFpbGVyCjw8Ci9Sb290IDMgMCBSCj4+CiUlRU9G
To make the page bigger, adjust the MediaBox dimensions :)
/MediaBox [ 0 0 250 50 ]
I thought I'd make a smallest pdf that displays "Hello World". The text is in the lower left corner. Sorry about the 9-point font, any larger would cost an extra byte :)
172 bytes for Adobe Reader X (if saved with linefeed-only newlines and no trailing newline or null-byte):
%PDF-1.
1 0 obj<</Kids[<</Parent 1 0 R/Resources<<>>/Contents 2 0 R>>]>>endobj 2 0 obj<<>>stream
BT/ 9 Tf(Hello World)' ET
endstream
endobj trailer<</Root<</Pages 1 0 R>>>>
120 bytes for Chrome's builtin PDF viewer:
%PDF 1 0 obj<</Pages<</Kids[<</Contents<<>>stream
BT 9 Tf(Hello World)' ET endstream>>]>>>>endobj trailer<</Root 1 0 R>>
To easily see this in Chrome, paste this URI in the address bar (SO won't let me link to it, and it won't work at all in other browsers):
data:application/pdf,%25PDF%201%200%20obj%3C%3C%2FPages%3C%3C%2FKids%5B%3C%3C%2FContents%3C%3C%3E%3Estream%0ABT%209%20Tf(Hello%20World)'%20ET%20endstream%3E%3E%5D%3E%3E%3E%3Eendobj%20trailer%3C%3C%2FRoot%201%200%20R%3E%3E
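A URI like the one above can be generated instead of escaped by hand. This Python sketch assumes the helper name pdf_data_uri and keeps the characters ( ) ' unescaped only to mirror the hand-written URI.

```python
from urllib.parse import quote, unquote_to_bytes

HELLO_PDF = (
    b"%PDF 1 0 obj<</Pages<</Kids[<</Contents<<>>stream\n"
    b"BT 9 Tf(Hello World)' ET endstream>>]>>>>endobj trailer<</Root 1 0 R>>"
)


def pdf_data_uri(pdf: bytes) -> str:
    """Percent-encode raw PDF bytes into a data: URI."""
    # quote() accepts bytes; everything outside the safe set becomes %XX.
    return "data:application/pdf," + quote(pdf, safe="()'")


uri = pdf_data_uri(HELLO_PDF)
print(uri)
```

Decoding the part after the comma with unquote_to_bytes gives the original bytes back, which is a quick way to confirm nothing was lost in the encoding.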
I was going to give an example of what I thought was the minimal valid "universal" PDF, until I remembered that the whole ethos of using a PDF is to ensure it renders exactly the same on all devices and in all their PDF readers. However, on cross-checking my "perfectly small, well-formed PDF" I spotted this. TL;DR: this is fixed in my personal minimal text template (at the end).
So the ground rule was "smallest possible valid PDF", but I consider that this shortcoming should make the PDF count as invalid, since it does not adhere to the concept of "fit for purpose". Thus the minimal PDF must, at a minimum, contain one means of fixing a working font.
To explain my proposed solution and why it's less than perfect, here it is in rough form (rough because of cut and paste).
%PDF-1.0
%µ¶
1 0 obj
<</Type/Catalog/Pages 2 0 R>>
endobj
2 0 obj
<</Kids[3 0 R]/Count 1/Type/Pages/MediaBox[0 0 595 792]>>
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Contents 4 0 R/Resources<<>>>>
endobj
4 0 obj
<</Length 58>>
stream
q
BT
/ 96 Tf
1 0 0 1 36 684 Tm
(Hello World!) Tj
ET
Q
endstream
endobj
xref
0 5
0000000000 65535 f
0000000016 00000 n
0000000062 00000 n
0000000136 00000 n
0000000209 00000 n
trailer
<</Size 5/Root 1 0 R>>
startxref
316
%%EOF
Whilst not defined by the rules of the question, I have included some past experience of user problems.
The first difference you might note is that the media box in the 2nd object is a hybrid, MediaBox[0 0 595 792], which combines the smaller A4 width with the smaller US Letter height, since otherwise the "universal page" would, in most countries, force a second sheet at 100% scale printing, because the page definition is either too wide or too high for the locale defaults.
And the current problem is evident in the 3rd object: no fonts have been set in its resources. Thus, in aiming for minimal, I contend that a PDF without a font defined is invalid.
Thus none of the answers so far, including my own, appears to produce a PDF that will "WORK" as a "VALID" means of producing the same printout regardless of platform or viewer.
Turning to libraries, I found a 3 MB zip with an exceptionally versatile Windows .exe (a single file that can do most PDF functions, like split, merge, import, stamp, export attachments, etc.) which can take "Hello World!" on a command line and produce a good working file; this is the page centre, zoomed in:
It uses a stream for the text and its positioning, and has other conforming data like the producer, so I offer this as a potentially good minimal file to pare down. Note that, as presented, this file will appear blank due to stream corruption from binary to text.
%PDF-1.7
%µ¶
1 0 obj
<</Pages 2 0 R/Type/Catalog>>
endobj
2 0 obj
<</Count 1/Kids[5 0 R]/MediaBox[0 0 595 792]/Type/Pages>>
endobj
3 0 obj
<</BaseFont/Helvetica/Encoding/WinAnsiEncoding/Subtype/Type1/Type/Font>>
endobj
4 0 obj
<</Filter/FlateDecode/Length 101>>
stream
xœ*Tp
QÐw3P04Ò30PISp
Q01
à˜kdf¢ga¬`bhâ%ç‚ô(„”#©Aîè"EéÚlA
HW‘‚†GjNN¾Bx~QNŠ¢¦BHÈÞ## ÿÿFå
endstream
endobj
5 0 obj
<</Contents 4 0 R/CropBox[0 0 595 792]/MediaBox[0 0 595 792]/Parent 2 0 R/Resources<</Font<</F0 3 0 R>>>>/Type/Page>>
endobj
6 0 obj
<</CreationDate(D:20220600600709+01'00')/ModDate(D:20220600600709+01'00')/Producer(me 2)>>
endobj
xref
0 7
0000000000 65536 f
0000000016 00000 n
0000000062 00000 n
0000000136 00000 n
0000000225 00000 n
0000000395 00000 n
0000000529 00000 n
trailer
<</Size 7/Info 6 0 R/Root 1 0 R/ID[<A2A0CE5CCD9D0DABD5845AD574BF0A5C><09BF9D281BE12CB5B5933BB2B62B0D4D>]>>
startxref
636
%%EOF
P.S. I deliberately added an invalid item, so this is intentionally not the minimum working answer; see if you can work out what's clearly wrong :-)
My personal offering
I am often asked how to write plain-text templated PDFs, which need a static font (Helvetica or Courier should do) and a structure that is easy to modify from the Windows CMD line. This suits my purpose: it is now 698 bytes as shown, with two placeholders to show multi-line text, and if needed you can find and replace Helvetica with Courier (note: intentionally add 2 spaces after it to keep the byte count).
%PDF-1.1
%âã
1 0 obj
<</Type/Catalog/Pages<</Type/Pages/Count 1/Kids[2 0 R]>>>>
endobj
2 0 obj
<</Type/Page/Parent 1 0 R/MediaBox[0 0 594 792]/Resources<</Font<</F1 3 0 R>>/ProcSet[/PDF/Text]>>/Contents 4 0 R>>
endobj
3 0 obj
<</Type/Font/Subtype/Type1/Name/F1/BaseFont/Helvetica>>
endobj
4 0 obj
<</Length 5 0 R>>
stream
BT
/F1 36 Tf
1 0 0 1 255 752 Tm
48 TL
( Hello)'
(World!)'
ET
endstream
endobj
5 0 obj
78
endobj
xref
0 6
0000000000 65535 f
0000000017 00000 n
0000000094 00000 n
0000000228 00000 n
0000000302 00000 n
0000000425 00000 n
trailer
<</Size 6/Info <</CreationDate(D:2023)/Producer(cmd2pdf)/Title(mini.pdf)>>/Root 1 0 R>>
startxref
446
%%EOF
To see how this approach works on the Windows command line, RIGHT CLICK and download as text https://github.com/GitHubRulesOK/MyNotes/raw/master/MAKE-PDF.cmd (now 200 lines long!). NOTE: browser security may ask you to trust a .cmd download, so use a .txt extension; you will still need to change its properties to UNBLOCK it once you are happy it should do no harm to run it!
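The same templating idea can be sketched in Python, where the xref offsets are computed from the bytes written and so do not have to be kept in step by hand. This is not a port of the CMD script: the function name build_pdf, the page layout, and the object numbering are assumptions made for the illustration.

```python
def build_pdf(text_lines, font=b"Helvetica"):
    """Assemble a one-page PDF and compute every xref offset from the output."""
    def esc(s):
        # Escape the characters that are special inside a PDF literal string.
        s = s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return s.encode("latin-1")  # assumption: the text fits in Latin-1

    content = b"BT\n/F1 36 Tf\n48 TL\n1 0 0 1 72 760 Tm\n"
    content += b"".join(b"(%s)'\n" % esc(line) for line in text_lines) + b"ET\n"
    objects = [
        b"<</Type/Catalog/Pages 2 0 R>>",
        b"<</Type/Pages/Count 1/Kids[3 0 R]>>",
        b"<</Type/Page/Parent 2 0 R/MediaBox[0 0 595 792]"
        b"/Resources<</Font<</F1 4 0 R>>>>/Contents 5 0 R>>",
        b"<</Type/Font/Subtype/Type1/BaseFont/%s>>" % font,
        b"<</Length %d>>\nstream\n%s\nendstream" % (len(content), content),
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj"
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # every entry is exactly 20 bytes long
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<</Size %d/Root 1 0 R>>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Something like open("mini.pdf", "wb").write(build_pdf(["Hello", "World!"])) would write the file, and passing font=b"Courier" swaps the typeface without any byte-count padding.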
@mkl are you up for producing your best shot?
According to this Ange Albertini lecture, the smallest possible valid PDF is 36 bytes:
%PDF-(NULL)trailer<</Root<</Pages<<>>>>>>
Where (NULL) is the unprintable ASCII 0 character.
However, as Ange notes, while this PDF is technically valid, most PDF reader apps will regard it as invalid based on the size alone, thus failing to open it.
I needed a PDF that is usable by a PDF converter (an A4 format issue: all the above constructs worked with Adobe Reader and Chrome, but not with the PDF converter, which required DIN A4).
I found this site and this PDF worked fine with the PDF converter I'm using: https://help.callassoftware.com/m/73261/l/798383-how-to-create-a-simple-pdf-file
Working for a PDF-related company, I know that the following content works pretty well. This is a valid empty page (note that its MediaBox of [0 0 612 792] is US Letter; A4 would be about [0 0 595 842]):
%PDF-1.4
%âãÏÓ
5 0 obj
<<
/Length 1
>>
stream
endstream
endobj
4 0 obj
<<
/Type /Page
/MediaBox [0 0 612 792]
/Resources <<
>>
/Contents 5 0 R
/Parent 2 0 R
>>
endobj
2 0 obj
<<
/Type /Pages
/Kids [4 0 R]
/Count 1
>>
endobj
1 0 obj
<<
/Type /Catalog
/Pages 2 0 R
>>
endobj
3 0 obj
<<
/Creator (PDF Creator http://www.pdf-tools.com)
/CreationDate (D:20150701112447+02'00')
/ModDate (D:20220607183602+02'00')
/Producer (3-Heights\222 PDF Optimization Shell 6.0.0.0 \(http://www.pdf-tools.com\))
>>
endobj
xref
0 6
0000000000 65535 f
0000000226 00000 n
0000000169 00000 n
0000000275 00000 n
0000000065 00000 n
0000000015 00000 n
trailer
<<
/Size 6
/Root 1 0 R
/Info 3 0 R
/ID [<1C3500CA9F7232B97E0EF3F789E8B7F2> <254C8D153F655D49945EAD68D801E011>]
>>
startxref
505
%%EOF
Now, using JavaScript, you can embed this in your JS bundle. First Base64-encode the content above, then use the encoded string to create a Blob:
const str = 'JVBERi0xLjQKJcOiw6PDj8OTCjUgMCBvYmoKPDwKL0xlbmd0aCAxCj4+CnN0cmVhbQogCmVuZHN0cmVhbQplbmRvYmoKNCAwIG9iago8PAovVHlwZSAvUGFnZQovTWVkaWFCb3ggWzAgMCA2MTIgNzkyXQovUmVzb3VyY2VzIDw8Cj4+Ci9Db250ZW50cyA1IDAgUgovUGFyZW50IDIgMCBSCj4+CmVuZG9iagoyIDAgb2JqCjw8Ci9UeXBlIC9QYWdlcwovS2lkcyBbNCAwIFJdCi9Db3VudCAxCj4+CmVuZG9iagoxIDAgb2JqCjw8Ci9UeXBlIC9DYXRhbG9nCi9QYWdlcyAyIDAgUgo+PgplbmRvYmoKMyAwIG9iago8PAovQ3JlYXRvciAoUERGIENyZWF0b3IgaHR0cDovL3d3dy5wZGYtdG9vbHMuY29tKQovQ3JlYXRpb25EYXRlIChEOjIwMTUwNzAxMTEyNDQ3KzAyJzAwJykKL01vZERhdGUgKEQ6MjAyMjA2MDcxODM2MDIrMDInMDAnKQovUHJvZHVjZXIgKDMtSGVpZ2h0c1wyMjIgUERGIE9wdGltaXphdGlvbiBTaGVsbCA2LjAuMC4wIFwoaHR0cDovL3d3dy5wZGYtdG9vbHMuY29tXCkpCj4+CmVuZG9iagp4cmVmCjAgNgowMDAwMDAwMDAwIDY1NTM1IGYKMDAwMDAwMDIyNiAwMDAwMCBuCjAwMDAwMDAxNjkgMDAwMDAgbgowMDAwMDAwMjc1IDAwMDAwIG4KMDAwMDAwMDA2NSAwMDAwMCBuCjAwMDAwMDAwMTUgMDAwMDAgbgp0cmFpbGVyCjw8Ci9TaXplIDYKL1Jvb3QgMSAwIFIKL0luZm8gMyAwIFIKL0lEIFs8MUMzNTAwQ0E5RjcyMzJCOTdFMEVGM0Y3ODlFOEI3RjI+IDwyNTRDOEQxNTNGNjU1RDQ5OTQ1RUFENjhEODAxRTAxMT5dCj4+CnN0YXJ0eHJlZgo1MDUKJSVFT0Y=';
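// Decode to raw bytes first; passing the atob() string directly would re-encode bytes above 0x7F as UTF-8.
const bytes = Uint8Array.from(atob(str), (c) => c.charCodeAt(0));
const blob = new Blob([bytes], { type: 'application/pdf' });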
In Java, use this:
private static String samplepdf = "255044462D312E0D747261696C65723C3C2F526F6F743C3C2F50616765733C3C2F4B6964735B3C3C2F4D65646961426F785B302030203320335D3E3E5D3E3E3E3E3E3E";
and then
byte[] bytes = hexStringToByteArray(samplepdf);
...
public byte[] hexStringToByteArray(String s) {
    int len = s.length();
    byte[] data = new byte[len / 2];
    for (int i = 0; i < len; i += 2) {
        data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                + Character.digit(s.charAt(i + 1), 16));
    }
    return data;
}

Writing multiline text in pdf page

I want to write multiline text. I've tried this:
6 0 obj
<</Length 59>>
stream
BT /F1 24 Tf 100 520 Td (This is test\n This is test)Tj ET
endstream
endobj
But I am not getting a new line. Is there a simple way to achieve that, or must I create another stream with the position of the next line?
This is the full code:
%PDF-1.5
1 0 obj <</Type /Catalog /Pages 2 0 R>>
endobj
2 0 obj <</Type /Pages /Kids [3 0 R] /Count 1>>
endobj
3 0 obj<</Type /Page /Parent 2 0 R /Resources 4 0 R /MediaBox [0 0 500 700] /Contents 6 0 R>>
endobj
4 0 obj<</Font <</F1 5 0 R>>>>
endobj
5 0 obj<</Type /Font /Subtype /Type1 /BaseFont /Helvetica>>
endobj
6 0 obj
<</Length 75>>
stream
BT
/F1 24 Tf
100 520 Td
(This is test) Tj
T*
(This is test) Tj
ET
endstream
endobj
xref
0 7
0000000000 65535 f
0000000009 00000 n
0000000059 00000 n
0000000116 00000 n
0000000219 00000 n
0000000259 00000 n
0000000328 00000 n
trailer <</Size 7/Root 1 0 R>>
startxref
454
%%EOF
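T* and ' move to the next line by the current text leading, which defaults to 0, so the leading has to be set with TL first. You may want to do something like this: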
BT
/F1 24 Tf
30 TL
100 520 Td
(This is test) Tj
T*
(This is test) Tj
ET
or the shorter form:
BT
/F1 24 Tf
30 TL
100 520 Td
(This is test) Tj
(This is test) '
ET
You might want to read up on section 9.4.3, Text-Showing Operators, in the PDF specification ISO 32000-1.
P.S.: I added the text leading (TL) operators.
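If the content stream is generated by a program, the shorter form can be produced with a small helper. This Python sketch assumes the name multiline_text_stream and the default values taken from the example above. A \n inside a literal string only puts a line-feed character into the shown text, which is why each line is emitted as its own string here.

```python
def multiline_text_stream(lines, font="/F1", size=24, leading=30, x=100, y=520):
    """Build a content stream that shows each item of `lines` on its own line."""
    def escape(s):
        # Escape the characters that are special inside a PDF literal string.
        return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    ops = ["BT", f"{font} {size} Tf", f"{leading} TL", f"{x} {y} Td"]
    for index, line in enumerate(lines):
        # Tj shows the first line in place; ' moves down by the leading first.
        ops.append(f"({escape(line)}) " + ("Tj" if index == 0 else "'"))
    ops.append("ET")
    return ("\n".join(ops) + "\n").encode("latin-1")


stream = multiline_text_stream(["This is test", "This is test"])
print(f"<</Length {len(stream)}>>")
```

Taking /Length from len(stream) keeps the stream dictionary in step with the content when lines are added or removed.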