I need to be able to create PDFs with printable diplomas for sports events (10K runs etc).
A graphic designer creates a beautiful diploma with placeholder texts (name of participant, finish time) - and I need to get from that to a finished PDF (on the fly), which the participant can download.
What output should I get from the designer (file format, prepared in any special way)?
How do I take that file, fill in data and generate PDF?
How can this be accomplished using iText?
I have done a lot of generating PDFs from HTML and Word docs, but this is something new to me, so I can't figure out where to start.
My best idea right now is to have the designer export as PDF without placeholder text, but with x/y coordinates and font on where to input name, time etc... But I would prefer to not have to store the x/y coordinates, font etc - and just be able to fill in a "template"...
There are several possibilities, e.g.:
1a) Let your designer create a diploma PDF.
1b) Add form fields at the places where you want to add the name, event etc. This can be done by you, by the designer, or with a PDF library like OpenPDF / PDFBox / iText.
1c) Fill in the data using OpenPDF / PDFBox / iText and afterwards make the fields read-only.
1d) You could even sign the PDF afterwards (and thus "protect" it from changes).
OR
2a) You could also add text to an existing PDF, but this is a bit trickier since you need to know the coordinates and need to care about length issues etc.
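To illustrate the length issues mentioned in 2a): a long participant name may not fit at the design font size, so the size has to be reduced until it does. The following is a minimal sketch, not a definitive implementation. The function `fit_text` and all of its parameters are hypothetical, and the width estimate (`avg_glyph_em`, an average glyph width of half the font size) is a crude assumption; in practice you would ask your PDF library for the exact string width in the chosen font.

```python
def fit_text(text, box_x, box_width, max_size=36.0, min_size=8.0,
             avg_glyph_em=0.5, step=0.5):
    """Return (font_size, x) so the text fits the box and is centered.

    ASSUMPTION: width is estimated as len(text) * avg_glyph_em * font_size.
    A real implementation would use the font's actual metrics.
    """
    size = max_size
    # Shrink the font until the estimated width fits or min_size is reached.
    while size > min_size and len(text) * avg_glyph_em * size > box_width:
        size -= step
    width = len(text) * avg_glyph_em * size
    # Center inside the box; if the text still overflows at min_size,
    # start at the left edge (the caller may want to truncate or wrap).
    x = box_x + max(0.0, (box_width - width) / 2)
    return size, x
```

For a placeholder box starting at x = 100 with a width of 400 points, `fit_text("Ann Lee", 100, 400)` keeps the full 36 pt size, while a 40-character name is reduced to 20 pt under the assumed width model.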
I want to create an editable PDF using AcroForm technology with the following features, so I would like to know whether it is possible before I start diving into learning it:
1 - I want the JavaScript to be able to read a small database (I mean, data hidden inside the PDF document). So, for example, if the user writes "Silvia" in a textbox, the JavaScript can look up Silvia's salary in a salary table hidden inside the PDF.
2 - At the push of a button, the JavaScript can export some data that is inside the PDF (from form elements and databases inside the PDF) to a text file.
Are both features possible using AcroForm?
Feature 1 is no problem; you may work with arrays, and then convert them to strings you can put in (hidden) fields for saving and reopening the document.
Feature 2 will most likely require Acrobat (Pro), as there are still some limitations in Reader.
I have a Visual Basic program that extracts text from a PDF and imports the text into Excel. It relies on reading the text like a human, left to right across the page. However, in some places in this particular PDF, if I select the text with my mouse and drag straight across, Adobe starts to select/highlight words on the lines above and below before continuing across the page. This gives me data that I do not want or need. The page has renderable text and is not from a scanned document.
Is there a way to "reset" the way Adobe interprets the text on the PDF? Since the information on the left is far from the information on the right, it treats them almost like separate columns.
I've tried saving the PDF in different formats such as a txt or postscript and distilling to another PDF but they all seem to result in the same outcome. This is weird to me because I have other similar PDFs where this isn't an issue.
Any help or thoughts would be greatly appreciated, thanks.
As PDF (in its basic form) essentially means placing strings on a canvas, the concept of "sentence" or "reading order" is not built in.
In order to extract text, you would have to read out the bounding box of the piece of text, and then use some logic and heuristics to assemble your text based on the coordinates of the bounding box.
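A minimal sketch of such a coordinate-based heuristic follows. It assumes the text fragments and their positions have already been obtained from some PDF library; the `Fragment` type, the `assemble_lines` function and the `y_tolerance` value are hypothetical names chosen for illustration. Fragments whose baselines are within the tolerance are treated as one line, and each line is then read left to right.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str
    x: float  # left edge of the bounding box
    y: float  # baseline; in PDF user space, y grows upward


def assemble_lines(fragments, y_tolerance=2.0):
    """Group fragments into lines by baseline, then order them by x."""
    lines = []  # list of (reference_y, [fragments on that line])
    # Visit fragments from the top of the page to the bottom.
    for frag in sorted(fragments, key=lambda f: -f.y):
        if lines and abs(lines[-1][0] - frag.y) <= y_tolerance:
            lines[-1][1].append(frag)  # same line as the previous fragment
        else:
            lines.append((frag.y, [frag]))  # start a new line
    # Within each line, read left to right.
    return [" ".join(f.text for f in sorted(frags, key=lambda f: f.x))
            for _, frags in lines]
```

With this approach, a label on the far left and a value on the far right end up on the same output line as long as their baselines agree, regardless of the order in which they are stored in the file. The tolerance usually needs tuning per document.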
Things can be easier if the PDF is a structured (tagged) PDF, where the text content is tied to a logical structure in the document. This is also the prime requirement for an accessible document. So, if your document is accessible, you can rely on the structure for the correct reading order.
I have a lengthy PDF time tracking document that was printed out and used in a paper process to schedule appointments. Now this paper process is being converted to an online application, and this application needs to generate reports in the same format as the PDF document (this time programmatically inserting values into rows instead of having someone write them on the piece of paper).
My question is this: is it possible to somehow import the layout of that PDF document into Telerik reporter's designer? Otherwise, is there some sort of intermediary tool that I can use to make the layout more exportable?
Just to clarify, I am not trying to save my reports as PDF but trying to use a given PDF's layout to create a similar looking report in Telerik.
Any tips would be very welcome.
Thank you very much!
There are numerous tools for extracting text or images from PDF files, but I am pretty sure nothing exists to extract the layout of a PDF. The PDF format is just text and symbols with coordinates. There is no layout to extract.
I need to programmatically embed an existing PDF (a small graphic) onto a specific page of an existing PDF. Using iTextSharp I've been able to add a new page containing this embedded PDF, but what I need is to modify an existing page by adding this graphic. Is this possible using iTextSharp or any other PDF-generation library?
I tend to do this sort of thing using ConTeXt, which is a TeX-based layout tool that is integrated into the pdftex TeX/MetaPost engine. There's a learning curve involved, and installing ConTeXt isn't entirely trivial, but it makes very general programmatic document processing involving PDFs easy once you get the hang of it.
For this problem, you'd define two overlays. The first overlay is the main PDF, which you set as a background. Then, on the page you want to change, you define a foreground overlay with a \setlayer command containing a single \framed box, which superimposes the second PDF using an \externalfigure command.
The nice thing about ConTeXt for this kind of task is that it works with PDF as its internal representation all the way through, so there is no unexpected blow-up in file size or deterioration in image quality, which you can get with other tools that convert between formats.
I'm trying to build a web app to programmatically fill out a PDF form. I am going to configure my form first in Adobe Acrobat, then write a Java app with iText to fill out all the form fields via user input from the web. The base form I need to fill out comes from the US government. They created form fields with extremely large kerning (character spacing) values I need to change. However, there appears to be no way to modify this value in the Acrobat UI.
Does anyone know how to manipulate character spacing on form fields in Acrobat 8.0 for Windows? I could try to use iText to programmatically manipulate the kerning of the original document, but this would be much more tedious.
I believe I figured this out: kerning is called "combing" in Acrobat, and each of the form fields has been "combed". The strange thing is that this option isn't checked when I view the properties of the form field, but "combing" is the behaviour I was attempting to replicate.