PDFBox inserting data into multi-page doc

PDFBox inserting data into multi-page doc - pdf

I'm trying to create a simple pdf multi-page document with fields in it. To do that I have a template pdf which in the code I clone this template as many times as needed in order to create the document itself.
The problem comes with inserting some data in it. The type of data I try to insert to the document is not supposed to change across the pages. Rather than that, it stays static in all pages, Like the "Pages" digit that represents the number of the pages that this document contains.
Now, Inside my template pdf I have some text fields like, for instance, "Shipper1" and "Pages". I want to be able to insert my data into this text fields so that all the pages in the document will have this values in their "Shipper1" and "Pages" fields.
My code currently does that only on the first page. It shows the data perfectly. On the other hand, when I go to another page, the data isn't shown there. It's just displays an empty field.
Here is the code where I initiate the pdf document:
static void initiatePdf() {
// Initiate a new PDF Box object and get the acro form from it
File file = new File(Constants.Paths.EMPTY_DOC)
PDDocument tempDoc
Evaluator evaluator = new Evaluator(metaHolder)
int numPages = evaluator.getNumOfPagesRequired(objects)
FieldRenamer renamer = new FieldRenamer()
PDResources res = new PDResources()
COSDictionary acroFormDict = new COSDictionary()
List<PDField> fields = []
Closure isFieldExist = {List<PDField> elements, String fieldName ->
elements.findAll{it.getFullyQualifiedName() == fieldName}.size() > 0
}
for(int i = 0; i < numPages; i++) {
tempDoc = new PDDocument().load(file)
PDDocumentCatalog docCatalog = tempDoc.getDocumentCatalog()
PDAcroForm acroForm = docCatalog.acroForm
PDPage page = (PDPage) docCatalog.getPages().get(0)
renamer.setCurrentForm(acroForm)
if(i == 0) {
res = acroForm.getDefaultResources()
acroFormDict.mergeInto(acroForm.getCOSObject())
renamer.renameFields(1)
} else
renamer.renameFields(i*10+1)
List<PDField> newFields = acroForm.fields.findAll { PDField newField ->
isFieldExist(fields, newField.getFullyQualifiedName()) == false
}
fields.addAll(newFields)
document.addPage(page)
}
PDAcroForm acroForm = new PDAcroForm(document, acroFormDict);
acroForm.setFields(fields)
acroForm.setDefaultResources(res);
document.documentCatalog.setAcroForm(acroForm)
}
A couple of things first:
metaHolder instance holds the information about all
the fields that reside inside the acro form. the info is: Field Name, Field Widget Width, Field Font and Font size
evaluator is just and instance of the Evaluator class. Its purpose is to analyze the dynamic data and decide how many pages will take to contain all that text data.
Here is where I try to populate the fields with text:
static void populateData() {
def properties = ["$Constants.Fields.SHIPPER" : "David"]
FieldPopulater populater = new FieldPopulater(document, metaHolder)
populater.populateStaticFields(properties)
}
FieldPopulater class:
package app.components
import app.StringUtils
import app.components.entities.DGObject
import app.components.entities.FieldMeta
import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm
import org.apache.pdfbox.pdmodel.interactive.form.PDField
/**
* Created by David on 18/10/2016.
*/
class FieldPopulater {
PDAcroForm acroForm
FormMetaHolder metaHolder
FieldPopulater(PDDocument document, FormMetaHolder metaHolder) {
this.acroForm = document.getDocumentCatalog().acroForm
this.metaHolder = metaHolder
}
void populateStaticFields(properties) {
List<PDField> fields = []
properties.each {fieldName, data ->
FieldMeta fieldMeta = metaHolder.getMetaData(fieldName)
fields = acroForm.fields.findAll { PDField field ->
String currentName = field.getFullyQualifiedName()
char lastChar = currentName[-1]
if(Character.isDigit(lastChar)) {
currentName = currentName.substring(0,currentName.size()-1)
}
currentName == fieldName
}
if(fields.size() > 1) {
int counter = 1
String tempData = data
String currentFitData
while(tempData.isEmpty() != true) {
int maxWords = Utils.getMaxWords(tempData, fieldMeta)
currentFitData = StringUtils.getTextByWords(tempData, maxWords)
tempData = StringUtils.chopTextByWords(tempData, maxWords)
PDField field = fields.find{it.getFullyQualifiedName()[-1] == "$counter"}
field?.setValue(currentFitData)
counter++
}
} else {
PDField tempField = fields[0]
tempField.setValue(data)
}
}
}
}
The result is, in the first page, the field "Shipper" has a value of "David"
In the second page, the field "Shipper" is empty.
Here is an image. First page:
Second page:
What is the problem here?
UPDATE: I tried to add the widgets of every new acro form to the current page so that every field will a few kids widgets that will represent the field, but it still doesn't work.
// All the widgets that are associated with the fields
List<PDAnnotationWidget> widgets = acroForm.fields.collect {PDField field -> field.getWidgets().get(0)}
page.annotations.addAll(widgets)
UPDATE: I also tried to add the current widget of a field to the parent field's collection of widgets. Here is the code:
List<PDAnnotationWidget> widgets = []
// All the widgets that are associated with the fields
acroForm.fields.each {PDField field ->
PDAnnotationWidget widget = field.widgets.get(0)
// Adding the following widget to the page and to the field's list of annotation widgets
widgets.add(widget)
fields.find {it.getFullyQualifiedName() == field.getFullyQualifiedName()}?.widgets.add(widget)
}
page.annotations.addAll(widgets)

What you want is to have sereval visual representations of the same field. This is done by having several annotation widgets for such a field.
In PDF, when a field has only one annotation widget, they share a common dictionary. When it has several, the annotation widgets are in a child list of the field.
When you want several annotation widgets for one field, you need to create the annotation widgets with new PDAnnotationWidget() instead of calling field.getWidgets().get(0) and using that one. These widgets must be added to a list, and this list must be assigned to the field with setWidgets(). And for each widget you must call setRectangle() and setPage() and setParent().
An example for this is in the new CreateMultiWidgetsForm.java example. The setParent() method is not yet available in 2.0.3 (but will be in 2.0.4). In this answer, it is replaced with a call that does the same thing in a less elegant way.
public final class CreateMultiWidgetsForm
{
private CreateMultiWidgetsForm()
{
}
public static void main(String[] args) throws IOException
{
// Create a new document with 2 empty pages.
PDDocument document = new PDDocument();
PDPage page1 = new PDPage(PDRectangle.A4);
document.addPage(page1);
PDPage page2 = new PDPage(PDRectangle.A4);
document.addPage(page2);
// Adobe Acrobat uses Helvetica as a default font and
// stores that under the name '/Helv' in the resources dictionary
PDFont font = PDType1Font.HELVETICA;
PDResources resources = new PDResources();
resources.put(COSName.getPDFName("Helv"), font);
// Add a new AcroForm and add that to the document
PDAcroForm acroForm = new PDAcroForm(document);
document.getDocumentCatalog().setAcroForm(acroForm);
// Add and set the resources and default appearance at the form level
acroForm.setDefaultResources(resources);
// Acrobat sets the font size on the form level to be
// auto sized as default. This is done by setting the font size to '0'
String defaultAppearanceString = "/Helv 0 Tf 0 g";
acroForm.setDefaultAppearance(defaultAppearanceString);
// Add a form field to the form.
PDTextField textBox = new PDTextField(acroForm);
textBox.setPartialName("SampleField");
// Acrobat sets the font size to 12 as default
// This is done by setting the font size to '12' on the
// field level.
// The text color is set to blue in this example.
// To use black, replace "0 0 1 rg" with "0 0 0 rg" or "0 g".
defaultAppearanceString = "/Helv 12 Tf 0 0 1 rg";
textBox.setDefaultAppearance(defaultAppearanceString);
// add the field to the AcroForm
acroForm.getFields().add(textBox);
// Specify 1st annotation associated with the field
PDAnnotationWidget widget1 = new PDAnnotationWidget();
PDRectangle rect = new PDRectangle(50, 750, 250, 50);
widget1.setRectangle(rect);
widget1.setPage(page1);
widget1.getCOSObject().setItem(COSName.PARENT, textBox);
// Specify 2nd annotation associated with the field
PDAnnotationWidget widget2 = new PDAnnotationWidget();
PDRectangle rect2 = new PDRectangle(200, 650, 100, 50);
widget2.setRectangle(rect2);
widget2.setPage(page2);
widget2.getCOSObject().setItem(COSName.PARENT, textBox);
// set green border and yellow background for 1st widget
// if you prefer defaults, just delete this code block
PDAppearanceCharacteristicsDictionary fieldAppearance1
= new PDAppearanceCharacteristicsDictionary(new COSDictionary());
fieldAppearance1.setBorderColour(new PDColor(new float[]{0,1,0}, PDDeviceRGB.INSTANCE));
fieldAppearance1.setBackground(new PDColor(new float[]{1,1,0}, PDDeviceRGB.INSTANCE));
widget1.setAppearanceCharacteristics(fieldAppearance1);
// set red border and green background for 2nd widget
// if you prefer defaults, just delete this code block
PDAppearanceCharacteristicsDictionary fieldAppearance2
= new PDAppearanceCharacteristicsDictionary(new COSDictionary());
fieldAppearance2.setBorderColour(new PDColor(new float[]{1,0,0}, PDDeviceRGB.INSTANCE));
fieldAppearance2.setBackground(new PDColor(new float[]{0,1,0}, PDDeviceRGB.INSTANCE));
widget2.setAppearanceCharacteristics(fieldAppearance2);
List <PDAnnotationWidget> widgets = new ArrayList<PDAnnotationWidget>();
widgets.add(widget1);
widgets.add(widget2);
textBox.setWidgets(widgets);
// make sure the annotations are visible on screen and paper
widget1.setPrinted(true);
widget2.setPrinted(true);
// Add the annotations to the pages
page1.getAnnotations().add(widget1);
page2.getAnnotations().add(widget2);
// set the field value
textBox.setValue("Sample field");
document.save("MultiWidgetsForm.pdf");
document.close();
}
}

Related

Printing to pdf from Google Apps Script HtmlOutput

For years, I have been using Google Cloud Print to print labels in our laboratories on campus (to standardize) using a Google Apps Script custom HtmlService form.
Now that GCP is becoming depreciated, I am in on a search for a solution. I have found a few options but am struggling to get the file to convert to a pdf as would be needed with these other vendors.
Currently, when you submit a text/html blob to the GCP servers in GAS, the backend converts the blob to application/pdf (as evidenced by looking at the job details in the GCP panel on Chrome under 'content type').
That said, because these other cloud print services require pdf printing, I have tried for some time now to have GAS change the file to pdf format before sending to GCP and I always get a strange result. Below, I'll show some of the strategies that I have used and include pictures of one of our simple labels generated with the different functions.
The following is the base code for the ticket and payload that has worked for years with GCP
//BUILD PRINT JOB FOR NARROW TAPES
var ticket = {
version: "1.0",
print: {
color: {
type: "STANDARD_COLOR",
vendor_id: "Color"
},
duplex: {
type: "NO_DUPLEX"
},
copies: {copies: parseFloat(quantity)},
media_size: {
width_microns: 27940,
height_microns:40960
},
page_orientation: {
type: "LANDSCAPE"
},
margins: {
top_microns:0,
bottom_microns:0,
left_microns:0,
right_microns:0
},
page_range: {
interval:
[{start:1,
end:1}]
},
}
};
var payload = {
"printerid" : QL710,
"title" : "Blank Template Label",
"content" : HtmlService.createHtmlOutput(html).getBlob(),
"contentType": 'text/html',
"ticket" : JSON.stringify(ticket)
};
This generates the expected following printout:
When trying to convert to pdf using the following code:
The following is the code used to transform to pdf:
var blob = HtmlService.createTemplate(html).evaluate().getContent();
var newBlob = Utilities.newBlob(html, "text/html", "text.html");
var pdf = newBlob.getAs("application/pdf").setName('tempfile');
var file = DriveApp.getFolderById("FOLDER ID").createFile(pdf);
var payload = {
"printerid" : QL710,
"title" : "Blank Template Label",
"content" : pdf,//HtmlService.createHtmlOutput(html).getBlob(),
"contentType": 'text/html',
"ticket" : JSON.stringify(ticket)
};
an unexpected result occurs:
This comes out the same way for direct coding in the 'content' field with and without .getBlob():
"content" : HtmlService.createHtmlOutput(html).getAs('application/pdf'),
note the createFile line in the code above used to test the pdf. This file is created as expected, of course with the wrong dimensions for label printing (not sure how to convert to pdf with the appropriate margins and page size?): see below
I have now tried to adopt Yuri's ideas; however, the conversion from html to document loses formatting.
var blob = HtmlService.createHtmlOutput(html).getBlob();
var docID = Drive.Files.insert({title: 'temp-label'}, blob, {convert: true}).id
var file = DocumentApp.openById(docID);
file.getBody().setMarginBottom(0).setMarginLeft(0).setMarginRight(0).setMarginTop(0).setPageHeight(79.2).setPageWidth(172.8);
This produces a document looks like this (picture also showing expected output in my hand).
Does anyone have insights into:
How to format the converted pdf to contain appropriate height, width
and margins.
How to convert to pdf in a way that would print correctly.
Here is a minimal code to get a better sense of context https://script.google.com/d/1yP3Jyr_r_FIlt6_aGj_zIf7HnVGEOPBKI0MpjEGHRFAWztGzcWKCJrD0/edit?usp=sharing

I've made the template (80 x 40 mm -- sorry, I don't know your size):
https://docs.google.com/document/d/1vA93FxGXcWLIEZBuQwec0n23cWGddyLoey-h0WR9weY/edit?usp=sharing
And there is the script:
function myFunction() {
// input data
var matName = '<b>testing this to <u>see</u></b> if it <i>actually</i> works <i>e.coli</i>'
var disposeWeek = 'end of semester'
var prepper = 'John Ruppert';
var className = 'Cell and <b>Molecular</b> Biology <u>Fall 2020</u> a few exercises a few exercises a few exercises a few exercises';
var hazards = 'Lots of hazards';
// make a temporary Doc from the template
var copyFile = DriveApp.getFileById('1vA93FxGXcWLIEZBuQwec0n23cWGddyLoey-h0WR9weY').makeCopy();
var doc = DocumentApp.openById(copyFile.getId());
var body = doc.getBody();
// replace placeholders with data
body.replaceText('{matName}', matName);
body.replaceText('{disposeWeek}', disposeWeek);
body.replaceText('{prepper}', prepper);
body.replaceText('{className}', className);
body.replaceText('{hazards}', hazards);
// make Italics, Bold and Underline
handle_tags(['<i>', '</i>'], body);
handle_tags(['<b>', '</b>'], body);
handle_tags(['<u>', '</u>'], body);
// save the temporary Doc
doc.saveAndClose();
// make a PDF
var docblob = doc.getBlob().setName('Label.pdf');
DriveApp.createFile(docblob);
// delete the temporary Doc
copyFile.setTrashed(true);
}
// this function applies formatting to text inside the tags
function handle_tags(tags, body) {
var start_tag = tags[0].toLowerCase();
var end_tag = tags[1].toLowerCase();
var found = body.findText(start_tag);
while (found) {
var elem = found.getElement();
var start = found.getEndOffsetInclusive();
var end = body.findText(end_tag, found).getStartOffset()-1;
switch (start_tag) {
case '<b>': elem.setBold(start, end, true); break;
case '<i>': elem.setItalic(start, end, true); break;
case '<u>': elem.setUnderline(start, end, true); break;
}
found = body.findText(start_tag, found);
}
body.replaceText(start_tag, ''); // remove tags
body.replaceText(end_tag, '');
}
The script just changes the {placeholders} with the data and saves the result as a PDF file (Label.pdf). The PDF looks like this:
There is one thing, I'm not sure if it's possible -- to change a size of the texts dynamically to fit them into the cells, like it's done in your 'autosize.html'. Roughly, you can take a length of the text in the cell and, in case it is bigger than some number, to make the font size a bit smaller. Probably you can use the jquery texfill function from the 'autosize.html' to get an optimal size and apply the size in the document.

I'm not sure if I got you right. Do you need make PDF and save it on Google Drive? You can do in Google Docs.
As example:
Make a new document with your table and text. Something like this
Add this script into your doc:
function myFunction() {
var copyFile = DriveApp.getFileById(ID).makeCopy();
var newFile = DriveApp.createFile(copyFile.getAs('application/pdf'));
newFile.setName('label');
copyFile.setTrashed(true);
}
Every time you run this script it makes the file 'label.pdf' on your Google Drive.
The size of this pdf will be the same as the page size of your Doc. You can make any size of page with add-on: Page Sizer https://webapps.stackexchange.com/questions/129617/how-to-change-the-size-of-paper-in-google-docs-to-custom-size
If you need to change the text in your label before generate pdf or/and you need change the name of generated file, you can do it via script as well.

Here is a variant of the script that changes a font size in one of the cells if the label doesn't fit into one page.
function main() {
// input texts
var text = {};
text.matName = '<b>testing this to <u>see</u></b> if it <i>actually</i> works <i>e.coli</i>';
text.disposeWeek = 'end of semester';
text.prepper = 'John Ruppert';
text.className = 'Cell and <b>Molecular</b> Biology <u>Fall 2020</u> a few exercises a few exercises a few exercises a few exercises';
text.hazards = 'Lots of hazards';
// initial max font size for the 'matName'
var size = 10;
var doc_blob = set_text(text, size);
// if we got more than 1 page, reduce the font size and repeat
while ((size > 4) && (getNumPages(doc_blob) > 1)) {
size = size-0.5;
doc_blob = set_text(text, size);
}
// save pdf
DriveApp.createFile(doc_blob);
}
// this function takes texts and a size and put the texts into fields
function set_text(text, size) {
// make a copy
var copyFile = DriveApp.getFileById('1vA93FxGXcWLIEZBuQwec0n23cWGddyLoey-h0WR9weY').makeCopy();
var doc = DocumentApp.openById(copyFile.getId());
var body = doc.getBody();
// replace placeholders with data
body.replaceText('{matName}', text.matName);
body.replaceText('{disposeWeek}', text.disposeWeek);
body.replaceText('{prepper}', text.prepper);
body.replaceText('{className}', text.className);
body.replaceText('{hazards}', text.hazards);
// set font size for 'matName'
body.findText(text.matName).getElement().asText().setFontSize(size);
// make Italics, Bold and Underline
handle_tags(['<i>', '</i>'], body);
handle_tags(['<b>', '</b>'], body);
handle_tags(['<u>', '</u>'], body);
// save the doc
doc.saveAndClose();
// delete the copy
copyFile.setTrashed(true);
// return blob
return docblob = doc.getBlob().setName('Label.pdf');
}
// this function formats the text beween html tags
function handle_tags(tags, body) {
var start_tag = tags[0].toLowerCase();
var end_tag = tags[1].toLowerCase();
var found = body.findText(start_tag);
while (found) {
var elem = found.getElement();
var start = found.getEndOffsetInclusive();
var end = body.findText(end_tag, found).getStartOffset()-1;
switch (start_tag) {
case '<b>': elem.setBold(start, end, true); break;
case '<i>': elem.setItalic(start, end, true); break;
case '<u>': elem.setUnderline(start, end, true); break;
}
found = body.findText(start_tag, found);
}
body.replaceText(start_tag, '');
body.replaceText(end_tag, '');
}
// this funcion takes saved doc and returns the number of its pages
function getNumPages(doc) {
var blob = doc.getAs('application/pdf');
var data = blob.getDataAsString();
var pages = parseInt(data.match(/ \/N (\d+) /)[1], 10);
Logger.log("pages = " + pages);
return pages;
}
It looks rather awful and hopeless. It turned out that Google Docs has no page number counter. You need to convert your document into a PDF and to count pages of the PDF file. Gross!
Next problem, even if you managed somehow to count the pages, you have no clue which of the cells was overflowed. This script takes just one cell, changes its font size, counts pages, changes the font size again, etc. But it doesn't granted a success, because there can be another cell with long text inside. You can reduce font size of all the texts, but it doesn't look like a great idea as well.

allow arabic text in pdf table using itext7 (xamarin android)

I have to put my list data in a table in a pdf file. My data has some Arabic words. When my pdf is generated, the Arabic words don't appear. I searched and found that I need itext7.pdfcalligraph so I installed it in my app. I found this code too https://itextpdf.com/en/blog/technical-notes/displaying-text-different-languages-single-pdf-document and tried to do something similar to allow Arabic words in my table but I couldn't figure it out.
This is a trial code before I apply it to my real list:
var path2 = global::Android.OS.Environment.ExternalStorageDirectory.AbsolutePath;
filePath = System.IO.Path.Combine(path2.ToString(), "myfile2.pdf");
stream = new FileStream(filePath, FileMode.Create);
PdfWriter writer = new PdfWriter(stream);
PdfDocument pdf2 = new iText.Kernel.Pdf.PdfDocument(writer);
Document document = new Document(pdf2, PageSize.A4);
FontSet set = new FontSet();
set.AddFont("ARIAL.TTF");
document.SetFontProvider(new FontProvider(set));
document.SetProperty(Property.FONT, "Arial");
string[] sources = new string[] { "يوم","شهر 2020" };
iText.Layout.Element.Table table = new iText.Layout.Element.Table(2, false);
foreach (string source in sources)
{
Paragraph paragraph = new Paragraph();
Bidi bidi = new Bidi(source, Bidi.DirectionDefaultLeftToRight);
if (bidi.BaseLevel != 0)
{
paragraph.SetTextAlignment(iText.Layout.Properties.TextAlignment.RIGHT);
}
paragraph.Add(source);
table.AddCell(new Cell(1, 1).SetTextAlignment(iText.Layout.Properties.TextAlignment.CENTER).Add(paragraph));
}
document.Add(table);
document.Close();
I updated my code and added the arial.ttf to my assets folder . i'm getting the following exception:
System.InvalidOperationException: 'FontProvider and FontSet are empty. Cannot resolve font family name (see ElementPropertyContainer#setFontFamily) without initialized FontProvider (see RootElement#setFontProvider).'
and I still can't figure it out. any ideas?
thanks in advance

- C #
I have a similar situation for Turkish characters, and I've followed these
steps :
Create a folder under projects root folder which is : /wwwroot/Fonts
Add OpenSans-Regular.ttf under the Fonts folder
Path for font is => ../wwwroot/Fonts/OpenSans-Regular.ttf
Create font like below :
public static PdfFont CreateOpenSansRegularFont()
{
var path = "{Your absolute path for FONT}";
return PdfFontFactory.CreateFont(path, PdfEncodings.IDENTITY_H, true);
}
and use it like :
paragraph.Add(source)
.SetFont(FontFactory.CreateOpenSansRegularFont()); //set font in here
table.AddCell(new Cell(1, 1)
.SetTextAlignment(iText.Layout.Properties.TextAlignment.CENTER)
.Add(paragraph));
This is how I used font factory for Turkish characters ex: "ü,i,ç,ş,ö"
For Xamarin-Android, you could try
string documentsPath = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
var path = Path.Combine(documentsPath, "Fonts/Arial.ttf");

Look i fixed it in java,this may help you:
String font = "your Arabic font";
//the magic is in the next 4 lines:
PdfFontFactory.register(font);
FontProgram fontProgram = FontProgramFactory.createFont(font, true);
PdfFont f = PdfFontFactory.createFont(fontProgram, PdfEncodings.IDENTITY_H);
LanguageProcessor languageProcessor = new ArabicLigaturizer();
//and look here how i used setBaseDirection and don't use TextAlignment ,it will work without it
com.itextpdf.kernel.pdf.PdfDocument tempPdfDoc = new com.itextpdf.kernel.pdf.PdfDocument(new PdfReader(pdfFile.getPath()), TempWriter);
com.itextpdf.layout.Document TempDoc = new com.itextpdf.layout.Document(tempPdfDoc);
com.itextpdf.layout.element.Paragraph paragraph0 = new com.itextpdf.layout.element.Paragraph(languageProcessor.process("الاستماره الالكترونية--الاستماره الالكترونية--الاستماره الالكترونية--الاستماره الالكترونية"))
.setFont(f).setBaseDirection(BaseDirection.RIGHT_TO_LEFT)
.setFontSize(15);

How to add a Graph placeholder and Table Placeholder in PDF template using Aspose.dll

How to add a Graph placeholder and Table placeholder in PDF template using Aspose.dll
Please help me in this area.

You can add a graph to a PDF document by using the code below:
// Create Document instance
Document doc = new Document();
// Add page to pages collection of PDF file
Page page = doc.Pages.Add();
// Create Graph instance
Aspose.Pdf.Drawing.Graph graph = new Aspose.Pdf.Drawing.Graph(100, 400);
// Add graph object to paragraphs collection of page instance
page.Paragraphs.Add(graph);
// Create Rectangle instance
Aspose.Pdf.Drawing.Rectangle rect = new Aspose.Pdf.Drawing.Rectangle(100, 100, 200, 120);
// Specify fill color for Graph object
rect.GraphInfo.FillColor = Aspose.Pdf.Color.Red;
// Add rectangle object to shapes collection of Graph object
graph.Shapes.Add(rect);
dataDir = dataDir + "CreateFilledRectangle_out.pdf";
// Save PDF file
doc.Save(dataDir);
And, you can add a table in a PDF document by using the code below:
// Create Document instance
Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
// Add page to pages collection of PDF file
Page page = doc.Pages.Add();
// Initializes a new instance of the Table
Aspose.Pdf.Table table = new Aspose.Pdf.Table();
// Set the table border color as LightGray
table.Border = new Aspose.Pdf.BorderInfo(Aspose.Pdf.BorderSide.All, .5f, Aspose.Pdf.Color.FromRgb(System.Drawing.Color.LightGray));
// Set the border for table cells
table.DefaultCellBorder = new Aspose.Pdf.BorderInfo(Aspose.Pdf.BorderSide.All, .5f, Aspose.Pdf.Color.FromRgb(System.Drawing.Color.LightGray));
// Create a loop to add 10 rows
for (int row_count = 1; row_count < 10; row_count++)
{
// Add row to table
Aspose.Pdf.Row row = table.Rows.Add();
// Add table cells
row.Cells.Add("Column (" + row_count + ", 1)");
row.Cells.Add("Column (" + row_count + ", 2)");
row.Cells.Add("Column (" + row_count + ", 3)");
}
// Add table object to first page of document
doc.Pages[1].Paragraphs.Add(table);
dataDir = dataDir + "document_with_table_out.pdf";
// Save updated document containing table object
doc.Save(dataDir);
You may visit following links for further information on these topics.
Working with Graphs
Working with Tables
P.S: I work with Aspose as Developer Evangelist.

Using pdfbox - how to get the font from a COSName?

How to get the font from a COSName?
The solution I'm looking for looks somehow like this:
COSDictionary dict = new COSDictionary();
dict.add(fontname, something); // fontname COSName from below code
PDFontFactory.createFont(dict);
If you need more background, I added the whole story below:
I try to replace some string in a pdf. This succeeds (as long as all text is stored in one token). In order to keep the format I like to re-center the text. As far as I understood I can do this by getting the width of the old string and the new one, do some trivial calculation and setting the new position.
I found some inspiration on stackoverflow for replacing https://stackoverflow.com/a/36404377 (yes it has some issues, but works for my simple pdf's. And How to center a text using PDFBox. Unfortunatly this example uses a font constant.
So using the first link's code I get a handling for operator 'TJ' and one for 'Tj'.
PDFStreamParser parser = new PDFStreamParser(page);
parser.parse();
java.util.List<Object> tokens = parser.getTokens();
for (int j = 0; j < tokens.size(); j++)
{
Object next = tokens.get(j);
if (next instanceof Operator)
{
Operator op = (Operator) next;
// Tj and TJ are the two operators that display strings in a PDF
if (op.getName().equals("Tj"))
{
// Tj takes one operator and that is the string to display so lets
// update that operator
COSString previous = (COSString) tokens.get(j - 1);
String string = previous.getString();
String replaced = prh.getReplacement(string);
if (!string.equals(replaced))
{ // if changes are there, replace the content
previous.setValue(replaced.getBytes());
float xpos = getPosX(tokens, j);
//if (true) // center the text
if (6 * xpos > page.getMediaBox().getWidth()) // check if text starts right from 1/xth page width
{
float fontsize = getFontSize(tokens, j);
COSName fontname = getFontName(tokens, j);
// TODO
PDFont font = ?getFont?(fontname);
// TODO
float widthnew = getStringWidth(replaced, font, fontsize);
setPosX(tokens, j, page.getMediaBox().getWidth() / 2F - (widthnew / 2F));
}
replaceCount++;
}
}
Considering the code between the TODO tags, I will get the required values from the token list. (yes this code is awful, but for now it let's me concentrate on the main issue)
Having the string, the size and the font I should be able to call the getWidth(..) method from the sample code.
Unfortunatly I run into trouble to create a font from the COSName variable.
PDFont doesn't provide a method to create a font by name.
PDFontFactory looks fine, but requests a COSDictionary. This is the point I gave up and request help from you.

The names are associated with font objects in the page resources.
Assuming you use PDFBox 2.0.x and that page is a PDPage instance, you can resolve the name fontname using:
PDFont font = page.getResources().getFont(fontname);
But the warning from the comments to the questions you reference remain: This approach will work only for very simple PDFs and might even damage other ones.

try {
//Loading an existing document
File file = new File("UKRSICH_Mo6i-Spikyer_z1560-FAV.pdf");
PDDocument document = PDDocument.load(file);
PDPage page = document.getPage(0);
PDResources pageResources = page.getResources();
System.out.println(pageResources.getFontNames() );
for (COSName key : pageResources.getFontNames())
{
PDFont font = pageResources.getFont(key);
System.out.println("Font: " + font.getName());
}
document.close();
}

ColdFusion CFDOCUMENT with links to other PDFs

I am creating a PDF using the cfdocument tag at the moment. The PDF is not much more than a bunch of links to other PDFs.
So I create this PDF index and the links are all HREFs
Another PDF
if I set the localURL attribute to "no" my URLs have the whole web path in them:
Another PDF
if I set the localURL attribute to "yes" then I get:
Another PDF
So this index PDF is going to go onto a CD and all of the linked PDFs are going to sit right next to it so I need a relative link ... more like:
Another PDF
cfdocument does not seem to do this. I can modify the file name of the document and make it "File:///Another_PDF.pdf" but this does not work either because I don't know the driveletter of the CD drive ... or if the files are going to end up inside a directory on the CD.
Is there a way (possibly using iText or something) of opening up the PDF once it is created and converting the URL links to actual PDF GoTo tags?
I know this is kind of a stretch but I am at my wits end with this.
So I've managed to get into the Objects but I'm still struggling with.
Converting from:
5 0 obj<</C[0 0 1]/Border[0 0 0]/A<</URI(File:///75110_002.PDF)/S/URI>>/Subtype/Link/Rect[145 502 184 513]>>endobj
To this:
19 0 obj<</SGoToR/D[0/XYZ null null 0]/F(75110_002.PDF)>>endobj
20 0 obj<</Subtype/Link/Rect[145 502 184 513]/Border[0 0 0]/A 19 0 R>>endobj
Wow this is really kicking my ass! :)
So I've managed to get the document open, loop through the Link Annotations, capture the Rect co-ordinates and the linked to document name (saved into an array of Structures) and then successfully deleted the Annotation which was a URI Link.
So now I thought I could now loop over that array of structures and put the Annotations back into the document using the createLink method or the setAction method. But all the examples I've seen of these methods are attached to a Chunk (of text). But my document already has the Text in place so I don't need to remake the text links I just need to put the Links back in in the same spot.
So I figured I could reopen the document and look for the actual text that was the link and then attache the setAction to th ealready existing chunk of text .... I can't find the text!!
I suck! :)

This thread has an example of updating the link actions, by modifying the pdf annotations. It is written in iTextSharp 5.x, but the java code is not much different.
The thread provides a solid explanation of how annotations work. But to summarize, you need to read in your source pdf and loop through the individual pages for annotations. Extract the links and use something like getFileFromPath() to replace them with a file name only.
I was curious, so I did a quick and ugly conversion of the iTextSharp code above. Disclaimer, it is not highly tested:
/**
Usage:
util = createObject("component", "path.to.ThisComponent");
util.fixLinks( "c:/path/to/sourceFile.pdf", "c:/path/to/newFile.pdf");
*/
component {
/**
Convert all absolute links, in the given pdf, to relative links (file name only)
#source - absolute path to the source pdf file
#destination - absolute path to save copy
*/
public function fixLinks( string source, string destination) {
// initialize objects
Local.reader = createObject("java", "com.lowagie.text.pdf.PdfReader").init( arguments.source );
Local.pdfName = createObject("java", "com.lowagie.text.pdf.PdfName");
// check each page for hyperlinks
for ( Local.i = 1; Local.i <= Local.reader.getNumberOfPages(); Local.i++) {
//Get all of the annotations for the current page
Local.page = Local.reader.getPageN( Local.i );
Local.annotations = Local.page.getAsArray( Local.PdfName.ANNOTS ).getArrayList();
// search annotations for links
for (Local.x = 1; !isNull( Local.annotations) && Local.x < arrayLen(Local.annotations); Local.x++) {
// get current properties
Local.current = Local.annotations[ Local.x ];
Local.dictionary = Local.reader.getPdfObject( Local.current );
Local.subType = Local.dictionary.get( Local.PdfName.SUBTYPE );
Local.action = Local.dictionary.get( Local.PdfName.A );
Local.hasLink = true;
//Skip this item if it does not have a link AND action
if (Local.subType != Local.PdfName.LINK || isNull(Local.action)) {
Local.hasLink = false;
}
//Skip this item if it does not have a URI
if ( Local.hasLink && Local.action.get( Local.PdfName.S ) != Local.PdfName.URI ) {
Local.hasLink = false;
}
//If it is a valid URI, update link
if (Local.hasLink) {
// extract file name from URL
Local.oldLink = Local.action.get( Local.pdfName.URI );
Local.newLink = getFileFromPath( Local.oldLink );
// replace link
// WriteDump("Changed link from ["& Local.oldLink &"] ==> ["& Local.newLink &"]");
Local.pdfString = createObject("java", "com.lowagie.text.pdf.PdfString");
Local.action.put( Local.pdfName.URI, Local.pdfString.init( Local.newLink ) );
}
}
}
// save all pages to new file
copyPDF( Local.reader , arguments.destination );
}
/**
Copy all pages in pdfReader to the given destination file
#pdfReader - pdf to copy
#destination - absolute path to save copy
*/
public function copyPDF( any pdfReader, string destination) {
try {
Local.doc = createObject("java", "com.lowagie.text.Document").init();
Local.out = createObject("java", "java.io.FileOutputStream").init( arguments.destination );
Local.writer = createObject("java", "com.lowagie.text.pdf.PdfCopy").init(Local.doc, Local.out);
// open document and save individual pages
Local.doc.open();
for (Local.i = 1; i <= arguments.pdfReader.getNumberOfPages(); Local.i++) {
Local.writer.addPage( Local.writer.getImportedPage( arguments.pdfReader, Local.i) );
}
Local.doc.close();
}
finally
{
// cleanup
if (structKeyExists(Local, "doc")) { Local.doc.close(); }
if (structKeyExists(Local, "writer")) { Local.writer.close(); }
if (structKeyExists(Local, "out")) { Local.out.close(); }
}
}
}

I finally got it:
public function resetLinks( string source, string destination) {
try {
// initialize objects
Local.reader = createObject("java", "com.lowagie.text.pdf.PdfReader").init( arguments.source );
Local.pdfName = createObject("java", "com.lowagie.text.pdf.PdfName");
Local.annot = createObject("java", "com.lowagie.text.pdf.PdfAnnotation");
Local.out = createObject("java", "java.io.FileOutputStream").init( arguments.destination );
Local.stamper = createObject("java", "com.lowagie.text.pdf.PdfStamper").init(Local.reader, Local.out);
Local.PdfAction = createObject("java", "com.lowagie.text.pdf.PdfAction");
Local.PdfRect = createObject("java", "com.lowagie.text.Rectangle");
Local.PdfBorderArray = createObject("java", "com.lowagie.text.pdf.PdfBorderArray").init(javacast("float", "0"), javacast("float", "0"), javacast("float", "0"));
Local.newAnnots = [];
// check each page for hyperlinks
// Save the data to a structure then write it to an array
// then delete the hyperlink Annotation
for ( Local.i = 1; Local.i <= Local.reader.getNumberOfPages(); Local.i = Local.i + 1) {
//Get all of the annotations for the current page
Local.page = Local.reader.getPageN( Local.i );
Local.annotations = Local.page.getAsArray( Local.PdfName.ANNOTS ).getArrayList();
// search annotations for links
for (Local.x = arrayLen(Local.annotations); !isNull( Local.annotations) && Local.x > 0; Local.x--) {
// get current properties
Local.current = Local.annotations[ Local.x ];
Local.dictionary = Local.reader.getPdfObject( Local.current );
Local.subType = Local.dictionary.get( Local.PdfName.SUBTYPE );
Local.action = Local.dictionary.get( Local.PdfName.A );
Local.hasLink = true;
//Skip this item if it does not have a link AND action
if (Local.subType != Local.PdfName.LINK || isNull(Local.action)) {
Local.hasLink = false;
}
//Skip this item if it does not have a URI
if ( Local.hasLink && Local.action.get( Local.PdfName.S ) != Local.PdfName.URI ) {
Local.hasLink = false;
}
//If it is a valid URI, update link
if (Local.hasLink) {
// extract file name from URL
Local.oldLink = Local.action.get( Local.pdfName.URI );
Local.newLink = getFileFromPath( Local.oldLink );
Local.Rect = Local.dictionary.Get(PdfName.Rect);
arrayStruct = StructNew();
arrayStruct.rectSTR = Local.Rect.toString();
arrayStruct.link = Local.newLink;
arrayStruct.page = Local.i;
ArrayAppend(Local.newAnnots, arrayStruct);
// Delete
Local.annotations.remove(Local.current);
}
}
}
// Now really remove them!
Local.reader.RemoveUnusedObjects();
// Now loop over the saved annotations and put them back!!
for ( Local.z = 1; Local.z <= ArrayLen(Local.newAnnots); Local.z++) {
// Parse the rect we got save into an Array
theRectArray = ListToArray(ReplaceNoCase(ReplaceNoCase(Local.newAnnots[z].rectSTR, "[", ""), "]", ""));
// Create the GoToR action
theAction = Local.PdfAction.gotoRemotePage(javacast("string", '#Local.newAnnots[z].link#'), javacast("string", '#Local.newAnnots[z].link#'), javacast("boolean", "false"), javacast("boolean", "false"));
// Create the Link Annotation with the above Action and the Rect
theAnnot = Local.annot.createLink(Local.stamper.getWriter(), Local.PdfRect.init(javacast("int", theRectArray[1]), javacast("int", theRectArray[2]), javacast("int", theRectArray[3]), javacast("int", theRectArray[4])), Local.annot.HIGHLIGHT_INVERT, theAction);
// Remove the border the underlying underlined text will flag item as a link
theAnnot.setBorder(Local.PdfBorderArray);
// Add the Annotation to the Page
Local.stamper.addAnnotation(theAnnot, Local.newAnnots[z].page);
}
}
finally {
// cleanup
if (structKeyExists(Local, "reader")) { Local.reader.close(); }
if (structKeyExists(Local, "stamper")) { Local.stamper.close(); }
if (structKeyExists(Local, "out")) { Local.out.close(); }
}
}
I couldn't have done this without the help of Leigh!!

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

PDFBox inserting data into multi-page doc - pdf

Related

Printing to pdf from Google Apps Script HtmlOutput

allow arabic text in pdf table using itext7 (xamarin android)

How to add a Graph placeholder and Table Placeholder in PDF template using Aspose.dll

Using pdfbox - how to get the font from a COSName?

ColdFusion CFDOCUMENT with links to other PDFs

Categories

Resources