After content stream recreation (using PDFbox) getting error in axesPDF (insert spaces)?

When I recreate a PDF after making some changes and then open the output PDF in axesPDF to fix the spaces issue, I get "ERROR". The only difference I have observed is that my input PDF has an Array of tokens, whereas my recreated PDF has a Dictionary of elements, as shown below. Does that cause the problem? How can I recreate a similar structure? (The left one is the input PDF, the right one is the output PDF after editing.)
[Screenshot: input PDF]
[Screenshot: output PDF with changes]
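The code I am using to save the PDF is:
// write the edited tokens into a new Flate-compressed content stream
PDStream newContents = new PDStream(document);
OutputStream out = newContents.createOutputStream(COSName.FLATE_DECODE);
ContentStreamWriter writer = new ContentStreamWriter(out);
writer.writeTokens(tokens);
out.close();
// replace the page's content and copy the page into the new document
document.getPage(pg_ind).setContents(newContents);
newPDF.addPage(document.getPage(pg_ind));
newPDF.save(outputPath); // outputPath is a placeholder; the original post does not show the save target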
Please help me on this. Thanks in advance.
Updating question along with error.
[Screenshot: another input file]
[Screenshot: the error]
[Screenshot: the button I used]
What puzzles me is that this time the content stream of the input is in COSDictionary format as well, and it still gives the error. Something else must be causing this.
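On the Array-versus-Dictionary point: as far as the PDF specification (ISO 32000-1, section 7.8.2) goes, a page's Contents entry may be either a single stream or an array of streams, and a reader is expected to treat the array as if its streams were concatenated in order. That matches the later observation that the error also appears with a dictionary-style input. The following is only a toy model of that rule, not PDFBox code; the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Toy model, NOT PDFBox: shows how a reader is expected to treat a /Contents array.
final class ContentsModel {
    // Joins the decoded parts of a /Contents array into one logical content stream.
    // A newline is written between parts so tokens at a boundary stay separate.
    static byte[] effectiveContent(List<byte[]> decodedParts) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (int i = 0; i < decodedParts.size(); i++) {
            if (i > 0) {
                joined.write('\n');
            }
            byte[] part = decodedParts.get(i);
            joined.write(part, 0, part.length);
        }
        return joined.toByteArray();
    }

    public static void main(String[] args) {
        byte[] first = "BT /F1 12 Tf".getBytes(StandardCharsets.US_ASCII);
        byte[] second = "(Hello) Tj ET".getBytes(StandardCharsets.US_ASCII);
        byte[] all = effectiveContent(List.of(first, second));
        System.out.println(new String(all, StandardCharsets.US_ASCII));
    }
}
```

If an array structure is wanted anyway, PDFBox 2.x should also accept a list of streams via PDPage.setContents(List<PDStream>), which stores them as an array; whether axesPDF cares about that difference is not something this sketch can confirm.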

Related

Filter out anything but interactive form fields in PDF's

I'm looking for a way to filter out all objects apart from interactive form fields in PDF files.
The programming language isn't too important, but I would love to be able to do it from the Linux command line; I'm pretty much open to anything.
E.g. choose a PDF input file and output a new PDF file containing only the interactive form fields from the first.
The ultimate goal is to take an already printed but unfilled form and print only the content of the filled-in form fields onto it.
The closest I've gotten is by using ghostscript:
gs -o outfile.pdf -sDEVICE=pdfwrite -dFILTERTEXT -dFILTERIMAGE infile.pdf
But that still leaves a lot of lines in my case, as well as an image, despite -dFILTERIMAGE.
There's also a -dFILTERVECTOR option, but sadly it removes the form fields as well.

"I'm looking for a way to filter out all objects apart from interactive form fields in PDF files."
First and foremost, you have to get rid of the static page content. Using an arbitrary general-purpose PDF library, you can do that by removing the Contents entry of every page.
E.g. using the Java version of iText7 this can be done as follows:
try (
    PdfReader pdfReader = new PdfReader(SOURCE);
    PdfWriter pdfWriter = new PdfWriter(RESULT);
    PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter)
) {
    for (int pageNr = 1; pageNr <= pdfDocument.getNumberOfPages(); pageNr++) {
        PdfPage pdfPage = pdfDocument.getPage(pageNr);
        pdfPage.getPdfObject().remove(PdfName.Contents);
        pdfPage.getPdfObject().setModified();
    }
}
(RemoveContent test testRemoveAllPageContentStreams)

PDFBox: Fill out a PDF with adding repeatively a one-page template containing a form

Following the SO question "Java pdfBox: Fill out pdf form, append it to pddocument, and repeat", I had trouble appending a cloned page to a new PDF.
The code from that page seemed really interesting, but it didn't work for me.
Actually, the answer doesn't work because it always modifies the same PDField and adds it to the list. So the next time you call 'getField' with the initial name, it won't find the field and you get an NPE. I tried the same PDFBox version (1.8.12) that the nice GitHub project uses, but I can't understand how its author gets this working.
I had the same issue today when trying to append a form to pages with different values in it. I wondered whether the solution was to duplicate the field, but I couldn't manage to do it properly: I always end up with a PDF containing the same values in every form.
(I provided a link to the template document for Mkl, but I have since removed it because the document doesn't belong to me.)
Edit: Following Mkl's advice, I figured out what I was missing, but performance is really bad when duplicating every page, and the file size isn't satisfying. Maybe there's a way to optimize this by reusing similar parts of the PDF.
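The NPE described above comes down to aliasing: the template's one field object is renamed in place, so a later lookup by the original name finds nothing. Here is a minimal sketch of that effect using made-up stand-in classes (Field and Form below are hypothetical and are not the PDFBox types):

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins for a form and its fields; these are NOT PDFBox classes.
final class FieldAliasingDemo {
    static final class Field {
        String name;
        String value;

        Field(String name) {
            this.name = name;
        }
    }

    static final class Form {
        final List<Field> fields = new ArrayList<>();

        // Looks a field up by its current name; returns null when nothing matches.
        Field getField(String name) {
            for (Field f : fields) {
                if (f.name.equals(name)) {
                    return f;
                }
            }
            return null;
        }
    }

    // Renames the template's own field object in place, as the failing approach does.
    static Field renameInPlace(Form template, String name, int page) {
        Field f = template.getField(name);
        f.name = name + page;
        return f;
    }

    public static void main(String[] args) {
        Form template = new Form();
        template.fields.add(new Field("customer"));
        renameInPlace(template, "customer", 0);
        // The original name no longer exists, so this lookup returns null;
        // dereferencing that result on the next page is what throws the NPE.
        System.out.println(template.getField("customer") == null);
    }
}
```

Creating a fresh field object per page, rather than renaming the shared one, avoids the problem in this model.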

Finally, I got it working without reloading the template each time. The resulting file is as I wanted: not too big (4 MB for 164 pages).
I think I made two mistakes before: one in page creation and probably one in field duplication.
So here is the working code, in case someone is stuck on the same problem.
Form creation:
PDAcroForm finalForm = new PDAcroForm(finalDoc, new COSDictionary());
finalForm.setDefaultResources(originForm.getDefaultResources());
Page creation:
PDPage clonedPage = templateDocument.getPage(0);
COSDictionary clonedDict = new COSDictionary(clonedPage.getCOSObject());
clonedDict.removeItem(COSName.ANNOTS);
clonedPage = new PDPage(clonedDict);
finalDoc.addPage(clonedPage);
Field duplication: (rename field to become unique and set value)
PDTextField field = (PDTextField) originForm.getField(fieldName);
PDPage page = finalDoc.getPages().get(nPage);
PDTextField clonedField = new PDTextField(finalForm);
List<PDAnnotationWidget> widgetList = new ArrayList<>();
for (PDAnnotationWidget paw : field.getWidgets()) {
    PDAnnotationWidget newWidget = new PDAnnotationWidget();
    newWidget.getCOSObject().setString(COSName.DA, paw.getCOSObject().getString(COSName.DA));
    newWidget.setRectangle(paw.getRectangle());
    widgetList.add(newWidget);
}
clonedField.setQ(field.getQ()); // To get text centered
clonedField.setWidgets(widgetList);
clonedField.setValue(value);
clonedField.setPartialName(fieldName + cnt++);
fields.add(clonedField);
page.getAnnotations().addAll(clonedField.getWidgets());
And at the end of the process:
finalDoc.getDocumentCatalog().setAcroForm(finalForm);
finalForm.setFields(fields);
finalForm.flatten();

Docx4J: Vertical text frame not exported to PDF

I'm using Docx4J to build an invoice template.
On the left side of the page, it's usual to show a legal sentence such as: Registered company in ... Book ... Page ...
I have inserted this in my template with a Word text frame.
My issue is this: when exporting to .docx, the legal text is shown perfectly, but when exporting to .pdf, it's shown as a horizontal table under the other data.
The code to export to PDF is:
FOSettings foSettings = Docx4J.createFOSettings();
foSettings.setFoDumpFile(foDumpFile);
foSettings.setWmlPackage(template);
fos = new FileOutputStream(new File("/C:/mypath/prueba_OUT.pdf"));
Docx4J.toFO(foSettings, fos, Docx4J.FLAG_EXPORT_PREFER_XSL);
Any help would be very appreciated.
Thanks.

You'd need to extend the PDF via FO code; see further "How to correctly position a header image with docx4j?"
Float left may or may not be easy; similarly the rotated text.
In general, the way to work on this is to take the FO generated by docx4j, then hand-edit it into something which FOP can convert to a PDF you are happy with. If you can do that, then it's a matter of modifying docx4j to generate that FO.

Table of Contents (TOC) missing after using CGContextDrawPDFPage

I am a Cocoa programmer using Quartz to draw PDF files. The original PDF has a table of contents (TOC), but the resulting PDF loses the TOC after using the following functions.
for (int i = 1; i <= pageCount; i++)
{
    page = CGPDFDocumentGetPage(document, i);
    CGContextDrawPDFPage(myContext, page);
}
Am I doing something wrong, or how can I keep the TOC with Quartz? Any help would be appreciated. (English is not my native language; I hope you can understand what I am asking.)

Your code takes the page content from the source file and draws it onto a new document. This is the only content you can transfer from one document to another this way. The bookmarks (TOC), form fields, annotations, and links in the source file cannot be copied to the new document. It is a limitation of the CoreGraphics API.
So if you need to modify an existing PDF file you're out of luck.

Can you insert blank lines in an already transformed PDF?

I have a situation where I need to increase the space between a table and the header on a PDF that has already been transformed from an XSL template.
I need to insert an address in the newly created space. This part is easy enough and I can do that using a stamper and a new table.
However, I am struggling to find a solution to move the grid down to make the space.
Basically I am using FOP to create the PDF from an XSL template using code similar to the following:
OutputStream out = new java.io.FileOutputStream(pdf);
Driver driver = new Driver();
driver.setRenderer(Driver.RENDER_PDF);
driver.setOutputStream(out);
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(new StreamSource(xsl));
StringReader xmlStream = new StringReader(xmlData);
Source xmlSource = new StreamSource(xmlStream);
Result res = new SAXResult(driver.getContentHandler());
transformer.transform(xmlSource, res);
Is it even possible to access the PDF in a way to add the new space? If so, what are my options? I should mention that I don’t know at the time the transformation is happening that I will need the extra space. I only know I need it once I get a page count of the PDF.
Any help is greatly appreciated!

It's not possible to add "new space" per se, but it is possible to get the co-ordinates of an object on the page and then re-draw that object somewhere else. Unfortunately there's no quick and easy solution and you will need a third-party SDK to do this.
PDF isn't a word processor format, so it's not possible to simply add a couple of carriage returns, as you might in MS Word.
Try iText, it's written in Java and has a decent amount of functionality for manipulating PDFs.
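To make "re-draw that object somewhere else" a little more concrete: at the content-stream level, a shift is a translation of the coordinate system. The following is a minimal sketch that only builds the operator text, under the assumption that the table's drawing operators have already been isolated from the rest of the page (which is the hard part and is not shown). The class and method names are hypothetical and no PDF library is involved.

```java
import java.util.Locale;

// Builds raw PDF content-stream operators as text; no PDF library involved.
final class ShiftContent {
    // Wraps content operators so they are drawn offset by (dx, dy) in user space units.
    // q/Q save and restore the graphics state; "1 0 0 1 dx dy cm" is a pure translation.
    static String shifted(String operators, double dx, double dy) {
        return String.format(Locale.ROOT, "q\n1 0 0 1 %.2f %.2f cm\n%s\nQ\n", dx, dy, operators);
    }

    public static void main(String[] args) {
        // PDF's default origin is the lower-left corner, so a negative dy moves content down.
        System.out.print(shifted("0 0 100 20 re S", 0, -50));
    }
}
```

Locale.ROOT is used so the decimal separator is always a period, which is what PDF number syntax requires.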

It seems it is not possible (at least from everything I have tried and read) to move transformed objects around once the PDF has already been generated.
Since I was already using iText and the PdfStamper class I was able to insert a new page and insert a new table with the current address info. I did this with the following code:
// add new page
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(file));
stamper.insertPage(pageNumber, reader.getPageSizeWithRotation(1));
// 'over' was not declared in the original snippet; assumed to be the layer above the inserted page
PdfContentByte over = stamper.getOverContent(pageNumber);
// add new table with data
BaseFont base = BaseFont.createFont(BaseFont.HELVETICA, "", BaseFont.NOT_EMBEDDED);
over.setFontAndSize(base, fontSize);
PdfPTable table = new PdfPTable(1);
table.getDefaultCell().setBorder(Rectangle.NO_BORDER);
table.addCell(data);
table.setTotalWidth(150f);
table.writeSelectedRows(0, -1, 73, 650, over);
This is not the answer to my question, but it is a viable solution that I thought I would share in case others get hung up on the same issue.
