How can I fix the damage in PDF files requested using MSXML2.XMLHTTP? - pdf

Friends, this is my first question here. I've been facing some problems when downloading a PDF buffer using MSXML2.XMLHTTP. I've been using Genexus to do so, but I also tried it directly in pure Visual FoxPro. The problem is that when I assign the ResponseText to a string variable, some characters are replaced by question marks; the same happens when I write the ResponseText to a PDF or TXT file. The object created with MSXML2.XMLHTTP.6.0 does not allow using the ResponseBody property. Any thoughts on how I could solve this using MSXML2.XMLHTTP? Thanks.
oHTTP = CreateObject("MSXML2.XMLHTTP.6.0")
oHTTP.Open("GET", 'https://homologacao.plugboleto.com.br/api/v1/boletos/impressa /lote/NIKLfYBWz',.F.)
oHTTP.setRequestHeader("content-type", "application/pdf")
oHTTP.Send()
? oHTTP.responseText
I've received something like the following (full of question marks):
%PDF-1.4 %??2 0 obj <</ColorSpace/DeviceRGB/Subtype/Image/Height 38/Filter/DCTDecode/Type/XObject/Width 149/BitsPerComponent 8/Length 2619>>stream ???JFIF H H ??C  !"$"$??C?? & ?" ?? ??? } !1AQa"q2???#B??R?$3br? %&'()456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz???????????????????????????????????????????????????? ??? w !1AQaq"2?B????#3R??$4?&'()56789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz??????????????????????????????????????????????????? ? ?h???8OO????a&??G?3? ?p1???|b?o?? ??x?_???%??E?en9??T???T>.????JG??rx??????????h?w????????:?!?????????jlm?Tn???????u??? ??Ey?PA?? (?? (?? (?? (??>B???;.??3?e??J??~?F??? y,s??i???#?m=kw???? ?[?K#????vR#G??^$?????k?[??BSu??#???M??????? _??Z?Fo??????/??*x?¾ mn??{)???80??s]W?x? ??+??k??=????????8 ?|D?c?j???h???$?8???:c??(???M/?Ze??;O?[?J????? '?~/j~!???n?urm???1^ITl;?3?%[?b??~?&<=u?Y\x??W6?¬$2?q?1?;??qc??_??qk>?&?v?????,??F?{??x???s??????{?k????r8.??<P?|,????q?]I?]?e???p;??/?W?x??)???A?????&)??dc,?d7?J?s??m?>???????!??9Va?? c???Zv??x+?b?wd??f?8a????????,6????????x?? ?-????<9F???????[~$?{??o???X??????y?ZgQ?#8??ox;? ???|??mZ?? I~a?k ~P ?? j??? '?c??4?F??l??$?8???(?'?"?.?????,????9V?????d???????UU)??? ???o?&???4?7?Z?? g???y?
W[?????d?Q$?#??^mZ???B Z(??QE ?Q#Q#Q#Q#Q#??endstream endobj 6 0 obj <</Filter/FlateDecode/Length 846>>stream x??V???+?q??z???'U./h???{???d?U??xf??PQ?O???unA???x1?0 50]#?\?T?y?B?s?9B? ? 2|????C???2t????k?U??]??]{? ?????s?AH??????h?"w????? f?????i??? ??>?9?8?#??"??G?$???<??0???S?2??sn?n??^?5?\FN?o1?4?~4~??Qe=&?T[???????Z??x??????k?????0z'#?;?'a??a?f~?q?~8ZH~?m???????Mm?p?#hh{????W7??????
?8?Olk'?A|?[???P?5?????uGxRr#?pw<$y?n??kD ???0??ih??9?5v??0?_}iG?Dq?8??_U??5a?????k????d???M??2???C(??;t2uA]z ??6A??o?t?}d????[?<;??R?iO8n??f???40???S?aVX????Y?p2N?eq]N?VeE?>??/V0?]MV?&???.aZ-???z2???????????8o??3?S????????gf??B?'6??]?J endstream endobj 8 0 obj<</Contents 6 0 R/Type/Page/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]/Font<</F1 3 0 R/F2 4 0 R>>/XObject<</Xf1 1 0 R/Xf2 5 0 R/img0 2 0 R>>>>/Parent 7 0 R/MediaBox[0 0 595 842]>>endobj 3 0 obj<</Subtype/Type1/Type/Font/BaseFont/Helvetica-Bold/Encoding/WinAnsiEncoding>>endobj 4 0 obj<</Subtype/Type1/Type/Font/BaseFont/Helvetica/Encoding/WinAnsiEncoding>>endobj 5 0 obj <</Subtype/Form/Filter/FlateDecode/Type/XObject/Matrix [1 0 0 1 0 0]/FormType 1/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]>>/BBox[0 0 292.41 39]/Length 474>>stream x?m??e1C?#???1Ly}??Ua??>????r?R?????r?7gr??a???\??PTj??p???s????~m"???:K??T???1????Gw({c???? !???p?rB g M?QG*?PC ?o??v?????'n[!n2??}*?g}r?G??J?R"aI?S??q ???d;??-??m?????y?lCp??[B(=?L??G[]2??)???
?8???9L????]y)?B??t<??E??????I????????#1?]$? ??h??6?Q[A)?8????<???????z??8c??????s??R????%6? endstream endobj 1 0 obj <</Subtype/Form/Filter/FlateDecode/Type/XObject/Matrix [1 0 0 1 0 0]/FormType 1/Resources<</ProcSet[/PDF/Text]/ExtGState 9 0 R/Font 10 0 R>>/Length 1818/BBox[0 0 595.28 419.53]>>stream x???Ko???t??V|???4-$n??{Pm%]6????P????:?$E?6i ?4??????d??m.U?7L????E??"???e?^r??c????????S#'?????????X?bz?k.J?3!?)??{?V ??'VS1?????8??L???? fU"&Fx?v?Q?9G??EL]?iLIN}?C?i~??4???J<??P?4Ec??F??P%c=!
?=?!U??P?T?b]???k>+¹&?5?9A5ai?"???G????H???J??J?#N??#?3dP??#O=A6%??&dO?eU&5;?Q?#M?'??.??8????P???z!
'??j??O?8??7?
?f????????u???^???:N#?q?Y?xN6Kjv B??Z?????<?? Dx^?J??;A1?3s /?S?k?8??'?9?n??.w?s????g????? M<0????????<?,p???xG!pv?v??O??,?!pv?v?P??l?O??3?M)[????????x??D?h????Z??&i)??,????k???k????j*???-?#?'?x9D)]?J:?=?G??1r? ???!???X?I???|n?q}?=?6?:ðl??????_T??[??_?AC???YI??????+??]??}f}S?P<{??EY??#??q?pah???,Pj?????v~??a?c???{R?7????? ?E~?mv??v?6??t ?? ??Y?????&???F?7P'?e?????R&??(?#????????)?2???P??j?.I??s4?|???s???$z????????E?P??x?{??tU?????????|??b?'?jH????f6 .?g? ?"?????iVR";;?P?'????F?????*??^?b?Nu6rO6? ?Xn[~>t???x2????n?[?D^????6C4O??vx??p?#???$?ru??Yj??55,?Z???u?&?yy????%????+????aMk?3 ???v?1M\A&?q???? '?Sf?,??ce)? ??x?????P?#?Ea&y????/n??~8j???????Co????????????%?? ????????5C???(?<??}???OA???a$?)J?`?!vd????T????D{,?}^?e?]]#?'#T?v??J??;??4?G?e???&b?Bl???K????.?t=s?i?;6.> ?????:?H??Z}:.V? ??) endstream endobj 9 0 obj<</R7 11 0 R/R9 12 0 R>>endobj 10 0 obj<</R8 13 0 R>>endobj 11 0 obj<</TK true/Type/ExtGState/BM/Normal/OPM 1>>endobj 12 0 obj<</Type/ExtGState/SA true>>endobj 13 0 obj<</Subtype/Type1/Type/Font/BaseFont/Helvetica/Encoding 14 0 R>>endobj 14 0 obj<</Type/Encoding/Differences[225/aacute/acircumflex/atilde 231/ccedilla 233/eacute/ecircumflex 243/oacute 245/otilde 250/uacute]>>endobj 7 0 obj<</Kids[8 0 R]/Type/Pages/Count 1>>endobj 15 0 obj<</Type/Catalog/Pages 7 0 R>>endobj 16 0 obj<<>>endobj xref 0 17 0000000000 65535 f 0000004765 00000 n 0000000015 00000 n 0000003908 00000 n 0000004000 00000 n 0000004087 00000 n 0000002787 00000 n 0000007190 00000 n 0000003700 00000 n 0000006794 00000 n 0000006833 00000 n 0000006863 00000 n 0000006922 00000 n 0000006965 00000 n 0000007044 00000 n 0000007240 00000 n 0000007285 00000 n trailer<</Info 16 0 R/ID []/Root 15 0 R/Size 17>>startxref 7305 %%EOF

Since a PDF is a binary file and not a text file, it is quite normal to see ? and all sorts of other non-printable characters when it is displayed as text. Instead, save it to a file on disk and open it with something like ShellExecute, for example:
oHTTP = CreateObject("MSXML2.XMLHTTP.6.0")
oHTTP.Open("GET", 'https://homologacao.plugboleto.com.br/api/v1/boletos/impressa /lote/NIKLfYBWz',.F.)
oHTTP.setRequestHeader("content-type", "application/pdf")
oHTTP.Send()
Local lcFileName
lcFileName = Forcepath(Sys(2015)+'.pdf', Sys(2023))
Strtofile(oHttp.responseText, m.lcFileName)
Declare Long ShellExecute In "shell32.dll" ;
long HWnd, String lpszOp, ;
string lpszFile, String lpszParams, ;
string lpszDir, Long nShowCmd
ShellExecute(_vfp.HWnd,'',m.lcFileName,'','',1)
EDIT: This was not a job for MSXML2.XMLHTTP. Simply download the file as a PDF and open it:
Local lcFileName, lcRemote
lcRemote = 'https://homologacao.plugboleto.com.br/api/v1/boletos/impressao/lote/NIKLfYBWz'
lcFileName = Forcepath(Sys(2015)+'.pdf', Sys(2023))
If (getFileFromURL(m.lcRemote, m.lcFileName) = 0)
Declare Long ShellExecute In "shell32.dll" ;
long HWnd, String lpszOp, ;
string lpszFile, String lpszParams, ;
string lpszDir, Long nShowCmd
ShellExecute(_vfp.HWnd,'',m.lcFileName,'','',1)
Endif
Procedure getFileFromURL
Lparameters tcRemoteFile,tcLocalFile
Declare Integer URLDownloadToFile In urlmon.Dll;
INTEGER pCaller, String szURL, String szFileName,;
INTEGER dwReserved, Integer lpfnCB
Return URLDownloadToFile(0, m.tcRemoteFile, m.tcLocalFile, 0, 0)
endproc
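To illustrate why routing the response through a text property loses data, here is a minimal Python sketch (not Visual FoxPro, and not part of the original answer). It assumes the text conversion behaves like a decode/encode round trip that substitutes a placeholder for every byte it cannot map; the exact conversion MSXML applies may differ. The function names and the `download_binary` helper are hypothetical.

```python
import urllib.request


def roundtrip_as_text(data: bytes, encoding: str = "ascii") -> bytes:
    """Simulate treating binary data as text and turning it back into bytes.

    Bytes that are invalid in `encoding` become U+FFFD on decode, and
    U+FFFD becomes "?" on encode, so the original values are lost.
    """
    text = data.decode(encoding, errors="replace")
    return text.encode(encoding, errors="replace")


def download_binary(url: str, local_path: str) -> int:
    """Fetch a URL and write the raw bytes to disk without any text step.

    Returns the number of bytes written. Not exercised here because it
    needs network access.
    """
    with urllib.request.urlopen(url) as response:
        payload = response.read()  # bytes, never decoded
    with open(local_path, "wb") as handle:  # "wb": binary mode
        handle.write(payload)
    return len(payload)


# A PDF header followed by the usual high-bit "binary marker" comment line.
sample = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"
damaged = roundtrip_as_text(sample)
```

In this sketch `damaged` keeps the ASCII header but every high-bit byte has turned into a question mark, which matches the symptom in the question. Keeping the payload as bytes from download to disk avoids the problem.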

Related

Coordinates extracted from PDF are not exact

I'm working on rendering a georeferenced PDF within a map. I was able to retrieve the geolocation information from the PDF, but the coordinates I get are not correct: they are a few meters away from where they should be.
Opening the same PDF in Avenza Maps, it indicates this list of coordinates, and these are correct:
[-26.413082, -51.561534, -26.435838, -51.561643, -26.435909, -51.543773,-26.413152, -51.543667]
With my approach (reading the PDF as a string and applying a regex), I get these values:
[-26.43302 -51.56133 -26.41418 -51.56124 -26.41424 -51.54409 -26.43309 -51.54418]
[-26.45579 -51.59842 -26.41777 -51.59822 -26.41811 -51.51036 -26.45613 -51.51053]
Unfortunately, neither of the two sets maps to the correct place (as shown in Avenza).
That said, I opened the PDF in Notepad and found other values (related to conversion and projection information), and I believe there may be a way to use them to convert the coordinates I extracted into the correct ones.
Here is that information:
<?xpacket end="w"?>
endstream
endobj
294 0 obj
3495
endobj
295 0 obj
/DeviceRGB
endobj
296 0 obj
<</Length 297 0 R>>stream
/GS_init gs
/Group_6 Do
endstream
endobj
297 0 obj
24
endobj
298 0 obj
<</ExtGState 2 0 R/ColorSpace << /CS_P 295 0 R >>/XObject << /Group_6 6 0 R >>>>endobj
299 0 obj
<</Type /Group/S /Transparency/CS 295 0 R/I false/K false>>endobj
300 0 obj
<</Type /Page/Parent 301 0 R/Contents 296 0 R/Resources 298 0 R/MediaBox [0 0 841.88808 1190.5488]/ArtBox [0 0 841.88808 1190.5488]/UserUnit 1/Group 299 0 R/VP[<</Type /Viewport/BBox [14.1732 147.400915455 822.0456 1133.350548016]/Name (þÿ T S B I I)/Measure<</Type /Measure/Subtype /GEO/Bounds [0 0 0 1 1 1 1 0 0 0]/GPTS [ -26.43302 -51.56133 -26.41418 -51.56124 -26.41424 -51.54409 -26.43309 -51.54418]/LPTS [ 0 0 0 1 1 1 1 0]/GCS<</Type /PROJCS/WKT (PROJCS["SIRGAS_2000_UTM_Zone_22S",GEOGCS["GCS_SIRGAS_2000",DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",10000000.0],PARAMETER["Central_Meridian",-51.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]])>>>>>><</Type /Viewport/BBox [14.1732 14.1732 239.961243463 122.688692878]/Name (þÿ R e f e r e n c i a _ M a p a)/Measure<</Type /Measure/Subtype /GEO/Bounds [0 0 0 1 1 1 1 0 0 0]/GPTS [ -26.45579 -51.59842 -26.41777 -51.59822 -26.41811 -51.51036 -26.45613 -51.51053]/LPTS [ 0 0 0 1 1 1 1 0]/GCS<</Type /PROJCS/WKT (PROJCS["SIRGAS_2000_UTM_Zone_22S",GEOGCS["GCS_SIRGAS_2000",DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",10000000.0],PARAMETER["Central_Meridian",-51.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]])>>>>>>]>>endobj
301 0 obj
<</Type /Pages/Kids [ 300 0 R ]/Count 1>>endobj
302 0 obj
<<>>endobj
303 0 obj
<</Type /Catalog/Pages 301 0 R/PageMode /UseNone/PageLayout /SinglePage/ViewerPreferences <</PrintScaling /None /FitWindow true /DisplayDocTitle true>>/OpenAction [300 0 R /Fit]/OCProperties<</OCGs [ 10 0 R 11 0 R 12 0 R 13 0 R 14 0 R 15 0 R 16 0 R 17 0 R 18 0 R 19 0 R 20 0 R 21 0 R 22 0 R 35 0 R 36 0 R 43 0 R 44 0 R 47 0 R 50 0 R 53 0 R 56 0 R 59 0 R 62 0 R 63 0 R 64 0 R 65 0 R 66 0 R 67 0 R 68 0 R 69 0 R 76 0 R 77 0 R 80 0 R 83 0 R 90 0 R 93 0 R 96 0 R 99 0 R 102 0 R 105 0 R 108 0 R 111 0 R 114 0 R 117 0 R 120 0 R 123 0 R 126 0 R 129 0 R 132 0 R 135 0 R 138 0 R 141 0 R 148 0 R 149 0 R 152 0 R 155 0 R 158 0 R 161 0 R 176 0 R ]/D<</Name (Layers Tree)/Order [ 176 0 R 161 0 R 158 0 R 148 0 R [ 155 0 R 152 0 R 149 0 R ] 141 0 R 138 0 R 135 0 R 132 0 R 129 0 R 126 0 R 123 0 R 120 0 R 117 0 R 114 0 R 111 0 R 108 0 R 105 0 R 102 0 R 99 0 R 96 0 R 93 0 R 90 0 R 83 0 R 80 0 R 62 0 R [ 76 0 R [ 77 0 R ] 63 0 R [ 69 0 R 68 0 R 67 0 R 64 0 R [ 66 0 R 65 0 R ] ] ] 10 0 R [ 59 0 R 43 0 R [ 56 0 R 53 0 R 50 0 R 47 0 R 44 0 R ] 11 0 R [ 36 0 R 35 0 R 22 0 R 21 0 R 12 0 R [ 20 0 R 19 0 R 18 0 R 17 0 R 16 0 R 15 0 R 14 0 R 13 0 R ] ] ] ]/ListMode /VisiblePages>>>>/Metadata 293 0 R>>endobj
304 0 obj
<</Type/XRef/Size 305/W[1 4 2]/Filter/FlateDecode/Info 292 0 R/Root 303 0 R/ID [<c9167b70223726438d277b1b4409c053> <c9167b70223726438d277b1b4409c053>]/Length 923>>stream
I need someone to point me to a way of getting the correct coordinates. I hope this information helps.
The PDF content in your question includes two ViewPort dictionaries.
These dictionaries map a location on the page ("BBox")
onto the GPTS referencing the specified WKT.
This is covered in the PDF 2.0 reference ISO-32000-2 section 12.9 & 12.10.
Unfortunately, this spec is not freely available, and it's not cheap.
Here are some definitions from the spec:
BBox:
A rectangle in default user space coordinates specifying the location of the viewport on the page.
The two coordinate pairs of the rectangle shall be specified in normalised form; that is, lower-left followed by upper-right, relative to the measuring coordinate system. This ordering shall determine the orientation of the measuring coordinate system (that is, the direction of the positive x and y axes) in this viewport, which may have a different rotation from the page.
GPTS:
(Required; PDF 2.0) An array of numbers that shall be taken pairwise, defining points in geographic space as degrees of latitude and longitude, respectively when defining a geographic coordinate system. These values shall be based on the geographic coordinate system described in the GCS dictionary. When defining a projected coordinate system, this array contains values in a planar projected coordinate space as eastings and northings. For Geospatial3D, when Geospatial feature information is present (requirement type Geospatial3D) in a 3D annotation, the GPTS array is required to hold 3D point coordinates as triples rather than pairwise where the third value of each triple is an elevation value.
NOTE 2 Any projected coordinate system includes an underlying geographic coordinate system.
WKT:
A string of Well Known Text describing the geographic coordinate system.
The assumption is, if you're interested in Geospatial coordinates,
then you know what a WKT is, and what the projection means.
This may be enough information for you to map the geo coordinates for the
separate viewports to their locations on the page.
Here are the PDF Viewports in more readable form:
/VP [
<<
/Type
/Viewport
/BBox [14.1732 147.400915455 822.0456 1133.350548016]
/Name (TSBII)
/Measure <<
/Type
/Measure
/Subtype
/GEO
/Bounds [0 0 0 1 1 1 1 0 0 0]
/GPTS [ -26.43302 -51.56133 -26.41418 -51.56124
-26.41424 -51.54409 -26.43309 -51.54418]
/LPTS [ 0 0 0 1 1 1 1 0]
/GCS<<
/Type
/PROJCS
/WKT (
PROJCS["SIRGAS_2000_UTM_Zone_22S",
GEOGCS["GCS_SIRGAS_2000",
DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],
PRIMEM["Greenwich",0.0],
UNIT["Degree",0.0174532925199433]
],
PROJECTION["Transverse_Mercator"],
PARAMETER["False_Easting",500000.0],
PARAMETER["False_Northing",10000000.0],
PARAMETER["Central_Meridian",-51.0],
PARAMETER["Scale_Factor",0.9996],
PARAMETER["Latitude_Of_Origin",0.0],
UNIT["Meter",1.0]
]
)
>>
>>
>>
<<
/Type
/Viewport
/BBox [14.1732 14.1732 239.961243463 122.688692878]
/Name (Referencia_Mapa)
/Measure <<
/Type
/Measure
/Subtype
/GEO
/Bounds [0 0 0 1 1 1 1 0 0 0]
/GPTS [ -26.45579 -51.59842 -26.41777 -51.59822
-26.41811 -51.51036 -26.45613 -51.51053]
/LPTS [ 0 0 0 1 1 1 1 0]
/GCS<<
/Type
/PROJCS
/WKT (
PROJCS["SIRGAS_2000_UTM_Zone_22S",
GEOGCS["GCS_SIRGAS_2000",
DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],
PRIMEM["Greenwich",0.0],
UNIT["Degree",0.0174532925199433]
],
PROJECTION["Transverse_Mercator"],
PARAMETER["False_Easting",500000.0],
PARAMETER["False_Northing",10000000.0],
PARAMETER["Central_Meridian",-51.0],
PARAMETER["Scale_Factor",0.9996],
PARAMETER["Latitude_Of_Origin",0.0],
UNIT["Meter",1.0]])
>>
>>
>>
]
>>
Note that a PDF file is a structured document and not parsable as a string. These specific elements could be compressed, or might occur multiple times for different pages. You'll need a toolkit that can access Pages and Resources and Dictionaries in order to locate the ViewPorts.
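As a rough illustration of mapping page locations to geo coordinates, here is a minimal Python sketch using the first viewport's values from the question. It rests on several assumptions that are mine, not the answer's: the LPTS pairs are the four corners of a unit square laid over the BBox with (0, 0) at the lower-left, each GPTS pair belongs to the LPTS pair at the same position, and bilinear interpolation directly in degrees is accurate enough over such a small area. A more rigorous version would project into the UTM system named in the WKT first. The values are hard-coded here; in practice they should come from a PDF toolkit, as noted above.

```python
from typing import Dict, List, Sequence, Tuple


def pairs(values: Sequence[float]) -> List[Tuple[float, float]]:
    """Split a flat array [a0, b0, a1, b1, ...] into (a, b) tuples."""
    return list(zip(values[0::2], values[1::2]))


def viewport_to_geo(x: float, y: float,
                    bbox: Sequence[float],
                    lpts: Sequence[float],
                    gpts: Sequence[float]) -> Tuple[float, float]:
    """Map a page point (default user space) to (lat, lon).

    Points outside the BBox are extrapolated with the same formula.
    """
    x0, y0, x1, y1 = bbox
    u = (x - x0) / (x1 - x0)  # 0 at the left edge of the BBox, 1 at the right
    v = (y - y0) / (y1 - y0)  # 0 at the bottom edge, 1 at the top

    # Associate each unit-square corner with its geographic point.
    corner: Dict[Tuple[int, int], Tuple[float, float]] = {}
    for (lx, ly), geo in zip(pairs(lpts), pairs(gpts)):
        corner[(round(lx), round(ly))] = geo

    weights = {
        (0, 0): (1 - u) * (1 - v),
        (0, 1): (1 - u) * v,
        (1, 1): u * v,
        (1, 0): u * (1 - v),
    }
    lat = sum(w * corner[c][0] for c, w in weights.items())
    lon = sum(w * corner[c][1] for c, w in weights.items())
    return lat, lon


# Values copied from the first viewport (TSBII) in the question.
BBOX = [14.1732, 147.400915455, 822.0456, 1133.350548016]
LPTS = [0, 0, 0, 1, 1, 1, 1, 0]
GPTS = [-26.43302, -51.56133, -26.41418, -51.56124,
        -26.41424, -51.54409, -26.43309, -51.54418]

# Extrapolate to the lower-left corner of the page (MediaBox origin).
page_corner = viewport_to_geo(0.0, 0.0, BBOX, LPTS, GPTS)
```

Under these assumptions, extrapolating to the page corner (0, 0) lands very close to the south-west pair in the Avenza list from the question. That suggests, but does not prove, that Avenza reports the corners of the whole page while GPTS describes only the corners of the viewport's BBox.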

Add a text on an existing PDF document by appending something after the PDF content

I would like to "overlay" a text onto an existing PDF document, by appending something at the end of the PDF file (after %%EOF). It is very important that nothing before the %%EOF is modified.
Is it even possible to do this ?
How can I "generate" what to append after %%EOF to do this, for a given text ? The technology doesn't really matter, once I have my "blob" I will just append it myself.
Thanks a lot!
How can I "generate" what to append after %%EOF to do this, for a given text ? The technology doesn't really matter, once I have my "blob" I will just append it myself.
That "blob" to append depends on the PDF to append it to. Essentially you'll have to parse the original PDF and find the page object for the page to overlay. Then you can append a new annotation or content stream with the overlay text, a copy of the page object with a reference to that new annotation or content stream, and a new cross reference section. In general you do that using a PDF library for your preferred programming language.
In a comment to your question you asked for example code to run and see the before/after and reverse-engineer it.
In the following example I use Java and the iText 7 PDF library (current development head but any 7.1.x version should do):
try ( PdfReader pdfReader = new PdfReader(SOURCE_PDF);
PdfWriter pdfWriter = new PdfWriter(TARGET_PDF);
PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter, new StampingProperties().useAppendMode());
Document document = new Document(pdfDocument)
) {
pdfWriter.setCompressionLevel(0);
Paragraph paragraph = new Paragraph("Hello! This text is added for Fratt");
paragraph
.setWidth(100)
.setBorder(new SolidBorder(new DeviceRgb(0f, 0f, 0.6f), 3))
.setRotationAngle(Math.PI / 4);
Rectangle box = pdfDocument.getFirstPage().getCropBox();
document.showTextAligned(paragraph,
(box.getLeft() + box.getRight()) / 2,
(box.getTop() + box.getBottom()) / 2,
1,
TextAlignment.CENTER,
VerticalAlignment.MIDDLE,
0);
}
(ShowTextAtPosition test testAddCenteredBorderedParagraph)
This adds a rotated, framed text to the first page of the source document.
In case of my example document the following "blob" is added after the original %%EOF:
16 0 obj
<</CreationDate(D:20060808104513+02'00')/Creator(TeX)/ModDate(D:20201221183247+01'00')/PTEX.Fullbanner(This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) kpathsea version 3.5.4)/Producer(pdfeTeX-1.21a; modified using iText® 7.1.14-SNAPSHOT ©2000-2020 iText Group NV \(AGPL-version\))>>
endobj
19 0 obj
<</BaseFont/Helvetica/Encoding/WinAnsiEncoding/Subtype/Type1/Type/Font>>
endobj
1 0 obj
<</Font<</F1 19 0 R/F73 6 0 R/F8 9 0 R>>/ProcSet[/PDF /Text]>>
endobj
2 0 obj
<</Contents[18 0 R 3 0 R 17 0 R]/MediaBox[0 0 595.2756 841.8898]/Parent 10 0 R/Resources 1 0 R/Type/Page>>
endobj
17 0 obj
<</Length 568>>stream
Q
q
0.70711 0.70711 -0.70711 0.70711 358.89 -66.87 cm
q
0 0 0.6 rg
251.62 401.57 m
351.62 401.57 l
354.62 404.57 l
248.62 404.57 l
251.62 401.57 l
f
Q
q
0 0 0.6 rg
351.62 401.57 m
351.62 374.93 l
354.62 371.93 l
354.62 404.57 l
351.62 401.57 l
f
Q
q
0 0 0.6 rg
351.62 374.93 m
251.62 374.93 l
248.62 371.93 l
354.62 371.93 l
351.62 374.93 l
f
Q
q
0 0 0.6 rg
251.62 374.93 m
251.62 401.57 l
248.62 404.57 l
248.62 371.93 l
251.62 374.93 l
f
Q
q
BT
/F1 12 Tf
255.94 391.23 Td
(Hello! This text is)Tj
( )Tj
ET
Q
q
BT
/F1 12 Tf
262.27 377.91 Td
(added for Fratt)Tj
ET
Q
Q
endstream
endobj
18 0 obj
<</Length 2>>stream
q
endstream
endobj
xref
1 2
0000009898 00000 n
0000009976 00000 n
16 4
0000009500 00000 n
0000010098 00000 n
0000010715 00000 n
0000009809 00000 n
trailer
<</ID [<98bc0d0e9347d0a066ab140ebd9ce62c><fa0dda3a13b826a6ecbd129bb048a3d0>]/Info 16 0 R/Prev 9003/Root 15 0 R/Size 20>>
%iText-7.1.14-SNAPSHOT
startxref
10764
%%EOF
Because of the pdfWriter.setCompressionLevel(0) in the code, the content stream is not compressed and you can read and understand it easily.
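For readers who want to reverse-engineer the bookkeeping in such an appended section, here is a minimal Python sketch of two of its pieces: finding the original file's last startxref value (which the appended trailer repeats as /Prev) and formatting a classic cross-reference subsection. The function names are hypothetical, the sketch assumes a classic xref table rather than a cross-reference stream, and it does not build the objects themselves; a PDF library remains the practical route.

```python
import re
from typing import List


def last_startxref(pdf_bytes: bytes) -> int:
    """Return the offset announced by the final startxref ... %%EOF pair."""
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", pdf_bytes)
    if not matches:
        raise ValueError("no startxref/%%EOF pair found")
    return int(matches[-1])


def xref_subsection(first_obj: int, offsets: List[int]) -> bytes:
    """Format one xref subsection for consecutive in-use objects.

    Each entry is exactly 20 bytes: a 10-digit offset, a 5-digit
    generation number, the keyword n, and a two-byte end-of-line
    (here a space followed by a line feed).
    """
    lines = [b"%d %d\n" % (first_obj, len(offsets))]
    for offset in offsets:
        lines.append(b"%010d %05d n \n" % (offset, 0))
    return b"".join(lines)


# Tail of a hypothetical original file whose xref section starts at 9003.
original_tail = b"trailer\n<</Root 15 0 R/Size 16>>\nstartxref\n9003\n%%EOF\n"
prev = last_startxref(original_tail)

# The "1 2" subsection from the example above (objects 1 and 2).
subsection = xref_subsection(1, [9898, 9976])
```

The offsets passed to `xref_subsection` must be the byte positions of the appended objects counted from the start of the whole file, so they depend on the length of the original PDF. This is why the "blob" cannot be generated independently of the document it is appended to.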

How to add multiple signature fields that share the same /AP and /V with iText 7 (Java)

I'm new to iText 7, and I have already spent two days on this.
I generated a PDF with 3 pages, like this:
...
4 0 obj
<</Contents 5 0 R/MediaBox[0 0 595 842]/Parent 2 0 R/Resources<</Font<</F1 6 0 R>>>>/TrimBox[0 0 595 842]/Type/Page>>
endobj
7 0 obj
<</Contents 8 0 R/MediaBox[0 0 595 842]/Parent 2 0 R/Resources<</Font<</F1 6 0 R>>>>/TrimBox[0 0 595 842]/Type/Page>>
endobj
9 0 obj
<</Contents 10 0 R/MediaBox[0 0 595 842]/Parent 2 0 R/Resources<</Font<</F1 6 0 R>>>>/TrimBox[0 0 595 842]/Type/Page>>
endobj
...
Then I signed it, and the PDF gained some objects like these:
...
1 0 obj
<</AcroForm 11 0 R/Pages 2 0 R/Type/Catalog>>
endobj
9 0 obj
<</Annots[13 0 R]/Contents 10 0 R/MediaBox[0 0 595 842]/Parent 2 0 R/Resources<</Font<</F1 6 0 R>>>>/TrimBox[0 0 595 842]/Type/Page>>
endobj
11 0 obj
<</Fields[13 0 R]/SigFlags 3>>
endobj
13 0 obj
<</AP<</N 18 0 R>>/F 132/FT/Sig/P 9 0 R/Rect[280.5 810 314.5 842]/Subtype/Widget/T(sig)/V 12 0 R>>
endobj
...
I noticed that the third page changed: it now has an /Annots entry (object 13).
Now I want to modify the iText code to add a /Widget to each page, all using the same /AP and /V. Each page would then show the same signature graphic, but the signature would be applied only once, like this:
...
4 0 obj
<</Annots[13 0 R]/Contents 5 0 R/MediaBox[0 0 595 842]/Parent 2 0 R/Resources<</Font<</F1 6 0 R>>>>/TrimBox[0 0 595 842]/Type/Page>>
endobj
7 0 obj
<</Annots[14 0 R]/Contents 8 0 R/MediaBox[0 0 595 842]/Parent 2 0 R/Resources<</Font<</F1 6 0 R>>>>/TrimBox[0 0 595 842]/Type/Page>>
endobj
9 0 obj
<</Annots[15 0 R]/Contents 10 0 R/MediaBox[0 0 595 842]/Parent 2 0 R/Resources<</Font<</F1 6 0 R>>>>/TrimBox[0 0 595 842]/Type/Page>>
endobj
13 0 obj
<</AP<</N 18 0 R>>/F 132/FT/Sig/P 4 0 R/Rect[280.5 810 314.5 842]/Subtype/Widget/T(sig)/V 12 0 R>>
endobj
14 0 obj
<</AP<</N 18 0 R>>/F 132/FT/Sig/P 7 0 R/Rect[200 400 300 420]/Subtype/Widget/T(sig)/V 12 0 R>>
endobj
15 0 obj
<</AP<</N 18 0 R>>/F 132/FT/Sig/P 9 0 R/Rect[100 200 150 480]/Subtype/Widget/T(sig)/V 12 0 R>>
endobj
11 0 obj
<</Fields[13 0 R 14 0 R 15 0 R]/SigFlags 3>>
endobj
...
I don't know if the example fits. What I want is a single signature whose graphic appears on each page with a different /Rect.
How can I do that using iText 7 for Java?
Can such a signature be added multiple times to a PDF and still pass signature verification?
After doing so, can the signature AcroForm and field be deleted and the widgets cleared?
Can I crop the signature graphic to show different parts of it on different pages?
Here's some code:
PdfDocument pdfDocument = new PdfDocument(new PdfReader(FILE));
final int pageCount = pdfDocument.getNumberOfPages();
pdfDocument.close();
PdfReader pdfReader = new PdfReader(FILE);
PdfSigner pdfSigner = new PdfSigner(pdfReader, new FileOutputStream(SIGN), new StampingProperties().useAppendMode());
File imageFile = new File(IMAGE);
java.awt.Image image = ImageIO.read(imageFile);
ImageData imageData = ImageDataFactory.create(image, null);
Rectangle rect = new Rectangle(
(pdfDocument.getDefaultPageSize().getRight() / 2) - (imageData.getWidth() / 2),
pdfDocument.getDefaultPageSize().getTop() - imageData.getHeight(),
imageData.getWidth(),
imageData.getHeight()
);
PdfSignatureAppearance appearance = pdfSigner.getSignatureAppearance();
appearance.setPageNumber(pageCount);
appearance.setSignatureGraphic(imageData);
appearance.setRenderingMode(PdfSignatureAppearance.RenderingMode.GRAPHIC);
appearance.setPageRect(rect);
appearance.setReason("reason");
appearance.setLocation("location");
appearance.setReuseAppearance(false);
pdfSigner.setFieldName("sig");
KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
ks.load(new FileInputStream(KEYSTORE), PASSWORD);
String alias = ks.aliases().nextElement();
PrivateKey pk = (PrivateKey) ks.getKey(alias, PASSWORD);
Certificate[] chain = ks.getCertificateChain(alias);
BouncyCastleProvider provider = new BouncyCastleProvider();
Security.addProvider(provider);
IExternalSignature pks = new PrivateKeySignature(pk, DigestAlgorithms.SHA256, provider.getName());
IExternalDigest digest = new BouncyCastleDigest();
pdfSigner.signDetached(digest, pks, chain, null, null, null, 0, PdfSigner.CryptoStandard.CMS);
Addendum:
In the com.itextpdf.signatures.PdfSigner.preClose() method I tried the following code. It successfully adds the desired objects, but only the widget annotation that is added first is shown in Adobe Reader. What should I do?
if (fieldExist) ... else {
PdfDictionary ap = new PdfDictionary();
for (int i = 1; i <= document.getNumberOfPages(); i++) {
PdfWidgetAnnotation widget = new PdfWidgetAnnotation(appearance.getPageRect());
widget.setFlags(PdfAnnotation.PRINT | PdfAnnotation.LOCKED);
PdfSignatureFormField sigField = PdfFormField.createSignature(document);
sigField.setFieldName(name);
sigField.put(PdfName.V, cryptoDictionary.getPdfObject());
sigField.addKid(widget);
if (this.fieldLock != null) {
this.fieldLock.getPdfObject().makeIndirect(document);
sigField.put(PdfName.Lock, this.fieldLock.getPdfObject());
fieldLock = this.fieldLock;
}
widget.setPage(document.getPage(i));
widget.put(PdfName.AP, ap);
if (1 == i)
ap.put(PdfName.N, appearance.getAppearance().getPdfObject());
acroForm.addField(sigField, document.getPage(i));
}
...
}

Visible Signature in a PDF file

I'm trying to create a visible signature in a PDF file.
Taking a simple PDF "hello world" file:
%PDF-1.7
1 0 obj % entry point
<<
/Type /Catalog
/Pages 2 0 R
>>
endobj
2 0 obj
<<
/Type /Pages
/MediaBox [ 0 0 200 200 ]
/Count 1
/Kids [ 3 0 R ]
>>
endobj
3 0 obj
<<
/Type /Page
/Parent 2 0 R
/Resources <<
/Font <<
/F1 4 0 R
>>
>>
/Contents 5 0 R
>>
endobj
4 0 obj
<<
/Type /Font
/Subtype /Type1
/BaseFont /Times-Roman
>>
endobj
5 0 obj % page content
<<
/Length 44
>>
stream
BT
10 05 TD
/F1 12 Tf
(Hello, world!) Tj
ET
endstream
endobj
xref
0 6
0000000000 65535 f
0000000010 00000 n
0000000079 00000 n
0000000173 00000 n
0000000301 00000 n
0000000380 00000 n
trailer
<<
/Size 6
/Root 1 0 R
>>
startxref
492
%%EOF
Signing it so that the text "Yolo" appears at some position on the first page produces this:
%PDF-1.7
1 0 obj % entry point
<<
/Type /Catalog
/Pages 2 0 R
>>
endobj
2 0 obj
<<
/Type /Pages
/MediaBox [ 0 0 200 200 ]
/Count 1
/Kids [ 3 0 R ]
>>
endobj
3 0 obj
<<
/Type /Page
/Parent 2 0 R
/Resources <<
/Font <<
/F1 4 0 R
>>
>>
/Contents 5 0 R
>>
endobj
4 0 obj
<<
/Type /Font
/Subtype /Type1
/BaseFont /Times-Roman
>>
endobj
5 0 obj % page content
<<
/Length 44
>>
stream
BT
10 05 TD
/F1 12 Tf
(Hello, world!) Tj
ET
endstream
endobj
xref
0 6
0000000000 65535 f
0000000010 00000 n
0000000079 00000 n
0000000173 00000 n
0000000301 00000 n
0000000380 00000 n
trailer
<<
/Size 6
/Root 1 0 R
>>
startxref
492
%%EOF
8 0 obj
<</F 132/Type/Annot/Subtype/Widget/Rect[0 0 0 0]/FT/Sig/DR<<>>/T(Signature1)/V 6 0 R/P 3 0 R/AP<</N 7 0 R>>>>
endobj
6 0 obj
<</Contents <...>/Type/Sig/SubFilter/ETSI.CAdES.detached/M(D:20190626125540+00'00')/ByteRange [0 824 60826 1401]/Filter/Adobe.PPKLite>>
endobj
9 0 obj
<</BaseFont/Helvetica/Type/Font/Subtype/Type1/Encoding/WinAnsiEncoding/Name/Helv>>
endobj
10 0 obj
<</BaseFont/ZapfDingbats/Type/Font/Subtype/Type1/Name/ZaDb>>
endobj
12 0 obj
<</Length 35>>stream
BT
1 15 TD
/Helv 6 Tf
(Yolo) Tj
ET
endstream
endobj
7 0 obj
<</Type/XObject/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]>>/Subtype/Form/BBox[0 0 0 0]/Matrix [1 0 0 1 0 0]/Length 8/FormType 1/Filter/FlateDecode>>stream
xœ
endstream
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Resources<</Font<</F1 4 0 R>>>>/Contents [12 0 R 5 0 R]/Annots[8 0 R]>>
endobj
2 0 obj
<</Type/Pages/MediaBox[0 0 200 200]/Count 1/Kids[3 0 R]>>
endobj
1 0 obj
<</AcroForm<</Fields[8 0 R]/DR<</Font<</Helv 9 0 R/ZaDb 10 0 R>>>>/DA(/Helv 0 Tf 0 g )/SigFlags 3>>/Type/Catalog/Pages 2 0 R>>
endobj
11 0 obj
<</Producer(AdES Tools https://www.turboirc.com)/ModDate(D:20190626125540+00'00')>>
endobj
xref
0 4
0000000000 65535 f
0000061604 00000 n
0000061529 00000 n
0000061414 00000 n
6 7
0000000804 00000 n
0000000000 65535 f
0000000679 00000 n
0000060952 00000 n
0000061050 00000 n
0000061746 00000 n
0000061127 00000 n
trailer
<</Root 1 0 R/Prev 492/Info 11 0 R/Size 17/ID[<4BB225C2F629BB21464F66FBF2FED264><8E3C9AD8354C66931EAAC282088455EA>]>>
startxref
61846
%%EOF
So there is an object in the PDF that shows some text in the first page:
12 0 obj
<</Length 35>>stream
BT
1 15 TD
/Helv 6 Tf
(Yolo) Tj
ET
endstream
endobj
My problem is that this object is treated like ordinary text in Adobe Reader. When it is clicked, I want it to lead to the digital signature, the way documents signed by Adobe Acrobat behave.
What am I missing? Is there a parameter in the digital signature (object 6 or 8) or in any of the other objects my app puts in the new PDF that links the text object with the signature?
Thanks a lot.
Your object 8
8 0 obj
<</F 132/Type/Annot/Subtype/Widget/Rect[0 0 0 0]/FT/Sig/DR<<>>/T(Signature1)/V 6 0 R/P 3 0 R/AP<</N 7 0 R>>>>
endobj
is an AcroForm form field for signatures (as the FT entry with value Sig tells us). At the same time, though, this object also is a form field widget annotation (as can be seen in the Type and Subtype entries). Form field widget annotations are the visual representations of form fields, and if a form field has only one representation, the widget can be merged with the form field as in your object.
In your case the annotation has a 0x0 size (/Rect[0 0 0 0]), i.e. invisible. To have a visible representation, you need an annotation rectangle that does not vanish.
The content that is displayed is defined in the normal appearance /AP<</N 7 0 R>> which points to object 7.
7 0 obj
<</Type/XObject/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]>>/Subtype/Form/BBox[0 0 0 0]/Matrix [1 0 0 1 0 0]/Length 8/FormType 1/Filter/FlateDecode>>stream
xœ
endstream
endobj
At first glance this looks pretty empty, even after decompression.
Thus, what you have to do is
choose a non-vanishing rectangle for your signature form field annotation,
adapt the BBox of the normal appearance stream to that annotation rectangle, and
create a non-empty content in the normal appearance stream of that annotation instead of adding page content.
Furthermore you should fix obvious errors in your PDF, e.g.
object 7, your signature field normal appearance, is marked as free in your cross references
your trailer claims a size of 17
For details please study the PDF specification ISO 32000. Part 1 is published for download by Adobe at https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
In particular sections
12.5 "Annotations"
12.7 "Interactive Forms"
12.8 "Digital Signatures"
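The two cross-reference errors mentioned above can be checked mechanically. Here is a minimal Python sketch that parses a classic xref section given as text, lists the entries marked free, and computes the highest object number plus one for comparison with the trailer's /Size. The function name is hypothetical and the parser handles only the classic table layout shown in the question, not cross-reference streams. Note that an incremental update's table lists only the changed objects, so the comparison is meaningful only when the update contains the file's highest-numbered object, as it does here.

```python
import re
from typing import Dict, Tuple

# object number -> (offset or next-free number, generation, in_use)
XrefEntries = Dict[int, Tuple[int, int, bool]]


def parse_xref_table(text: str) -> XrefEntries:
    """Parse the subsections of one classic xref section."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "xref":
        raise ValueError("text does not start with the xref keyword")
    entries: XrefEntries = {}
    index = 1
    while index < len(lines):
        header = re.fullmatch(r"(\d+) (\d+)", lines[index])
        if header is None:  # reached "trailer" or other content
            break
        first, count = int(header.group(1)), int(header.group(2))
        for k in range(count):
            entry = re.fullmatch(r"(\d{10}) (\d{5}) ([nf])", lines[index + 1 + k])
            if entry is None:
                raise ValueError("malformed xref entry")
            entries[first + k] = (int(entry.group(1)),
                                  int(entry.group(2)),
                                  entry.group(3) == "n")
        index += 1 + count
    return entries


# The xref section of the incremental update from the question.
QUESTION_XREF = """xref
0 4
0000000000 65535 f
0000061604 00000 n
0000061529 00000 n
0000061414 00000 n
6 7
0000000804 00000 n
0000000000 65535 f
0000000679 00000 n
0000060952 00000 n
0000061050 00000 n
0000061746 00000 n
0000061127 00000 n
trailer
"""

entries = parse_xref_table(QUESTION_XREF)
free_objects = sorted(n for n, (_, _, in_use) in entries.items() if not in_use)
expected_size = max(entries) + 1
```

For the table in the question, `free_objects` contains object 7 in addition to the mandatory free object 0, and `expected_size` differs from the 17 claimed in the trailer.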

Open PDF, save DOCX bugs out after a few dozen documents and outputs garbled/corrupted files

I have a few thousand PDF files that I need to convert to DOCX. I wrote the following macro:
Sub convertPDFtoDOCX()
'
' convertPDFtoDOCX Macro
'
'
Dim docDirectory As String
Dim pdfDirectory As String
Dim docPath As String
Dim doc As Document
docDirectory = "C:\Users\<USER>\DOCX\"
pdfDirectory = "C:\Users\<USER>\PDF\"
pdfFile = Dir(pdfDirectory & "*.*")
Do While pdfFile <> ""
docPath = docDirectory & pdfFile & ".docx"
Set doc = Documents.Open(FileName:=pdfDirectory & pdfFile)
ActiveDocument.SaveAs2 FileName:=docPath, FileFormat:=wdFormatXMLDocument
Documents.Close
pdfFile = Dir
Loop
End Sub
It works fine for the first few dozen documents, but then starts outputting "corrupted" files that aren't DOCX and can't be opened with a PDF viewer either. There is no error message when it starts failing. The problem doesn't come from the PDF files: if I stop the macro and start it again on the same documents, they are converted correctly the second time.
The "corrupted" files look like this:
%PDF-1.5
%µµµµ
1 0 obj
<</Type/Catalog/Pages 2 0 R/Lang(fr-FR) /StructTreeRoot 91 0 R/MarkInfo<</Marked true>>>>
endobj
2 0 obj
<</Type/Pages/Count 21/Kids[ 3 0 R 27 0 R 31 0 R 42 0 R 44 0 R 46 0 R 48 0 R 55 0 R 59 0 R 61 0 R 63 0 R 65 0 R 67 0 R 69 0 R 71 0 R 73 0 R 75 0 R 77 0 R 79 0 R 81 0 R 88 0 R] >>
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Resources<</Font<</F1 5 0 R/F2 9 0 R/F3 11 0 R/F4 16 0 R/F5 18 0 R/F6 20 0 R/F7 25 0 R>>/ExtGState<</GS7 7 0 R/GS8 8 0 R>>/ProcSet[/PDF/Text/ImageB/ImageC/ImageI] >>/MediaBox[ 0 0 595.2 841.8] /Contents 4 0 R/Group<</Type/Group/S/Transparency/CS/DeviceRGB>>/Tabs/S/StructParents 0>>
endobj
4 0 obj
<</Filter/FlateDecode/Length 4428>>
stream
xœ­\Ën7Ýð?Ô.Ý ¨Ä7«‚ ¹%e4ð+²’Y$Yt¤¶£A,9RÛÈüÕ|Æ|ÆìÙäæ^²ÈzðQ-¦ È]U¼$//:<yØÞ¾__o«££Ã“ív}ýóæ¦úþðÅýv{ÿñÇë}Ú¾]¸½[ooïï
What causes the issue and how can I fix it?
I use Word 2016 on Windows 10.
I don't think you can fix the issue without a patch from Microsoft. Meanwhile, you can move your code to run outside Word and create a new Word.Application object for each iteration.
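A minimal sketch of that workaround in Python, assuming the pywin32 package on Windows and that Word's COM automation behaves as it does in the macro. The function names and the `make_app` parameter are hypothetical; `make_app` exists so the loop can be tried with a stand-in object. Any conversion prompt Word may show when opening a PDF is not handled here. The value 12 is Word's wdFormatXMLDocument constant. The sketch also checks each output for the ZIP signature that a real .docx file starts with, since the failure in the question produces no error message.

```python
from pathlib import Path
from typing import Callable, List

WD_FORMAT_XML_DOCUMENT = 12  # numeric value of wdFormatXMLDocument


def new_word_app():
    """Start a separate Word instance (requires pywin32 on Windows)."""
    import win32com.client  # imported lazily so the rest works without it
    app = win32com.client.DispatchEx("Word.Application")
    app.Visible = False
    return app


def is_probably_docx(path) -> bool:
    """A .docx file is a ZIP archive, so it starts with the bytes PK."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"PK"


def convert_all(pdf_dir, docx_dir,
                make_app: Callable[[], object] = new_word_app) -> List[Path]:
    """Convert every *.pdf in pdf_dir, using a fresh Word instance per file."""
    converted: List[Path] = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        target = Path(docx_dir) / (pdf.stem + ".docx")
        app = make_app()  # new Word.Application for this iteration
        try:
            doc = app.Documents.Open(str(pdf))
            doc.SaveAs2(str(target), FileFormat=WD_FORMAT_XML_DOCUMENT)
            doc.Close(False)  # close this document only, without saving again
        finally:
            app.Quit()  # always release the instance, even on errors
        if not is_probably_docx(target):
            raise RuntimeError(f"output does not look like DOCX: {target}")
        converted.append(target)
    return converted
```

Starting Word for every file is slow, so a middle ground would be to restart it every few dozen files; whether that is frequent enough to avoid the corruption is something to verify on the actual document set.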