FlyingSaucerPdf, Greek alphabet, Helvetica

In the project I am working on, we use Flying Saucer PDF (9.0.9) and iText (2.1.7) to create PDFs. When the font is set to Helvetica, the characters of the Greek alphabet are not rendered in the PDF. When the font is changed to Arial or Times New Roman, the characters are rendered correctly.
<div>
<span style="font-family: Helvetica, Arial, sans-serif;">
<strong>
<span style="font-size: 16pt;">¶µαβω♥µ</span>
</strong>
</span>
</div>
Is there any workaround?
https://github.com/flyingsaucerproject/flyingsaucer

I was only able to fix it by removing the preloaded Helvetica font and then registering a new Helvetica font. For testing I used the font from https://www.cufonfonts.com/download/font/helvetica-255.
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.junit.Test;
import org.w3c.dom.Document;
import org.xhtmlrenderer.pdf.ITextFontResolver;
import org.xhtmlrenderer.pdf.ITextRenderer;
import org.xhtmlrenderer.resource.FSEntityResolver;
import com.lowagie.text.pdf.BaseFont;
public class TestPdf2 {
private static final String FILE_OUT = "src/test/resources/file.pdf";
private static final String FONT_DIR = "src/test/resources/fonts/";
private static final String[] FONT_FILES = { //
"Helvetica.ttf", "Helvetica-Bold.ttf", //
"Helvetica-BoldOblique.ttf","helvetica-compressed-5871d14b6903a.otf", //
"helvetica-light-587ebe5a59211.ttf", "Helvetica-Oblique.ttf", //
"helvetica-rounded-bold-5871d05ead8de.otf" };
@Test
public void buildHelvetica() throws Exception {
ITextRenderer renderer = new ITextRenderer();
ITextFontResolver fontRes = renderer.getFontResolver();
// Remove the preloaded Helvetica family from the resolver's private font map.
Field fontFamiliesField = ITextFontResolver.class.getDeclaredField("_fontFamilies");
fontFamiliesField.setAccessible(true);
Map<String, ?> fontFamiliesMap = (Map<String, ?>) fontFamiliesField.get(fontRes);
fontFamiliesMap.remove("Helvetica");
loadFont(fontRes);
renderer.setDocument(getSampleDocument(), null);
doRenderToPDF(renderer, FILE_OUT);
}
private static void loadFont(ITextFontResolver poF) throws Exception {
for (String lsFontFile : FONT_FILES) {
File lfFile = new File(FONT_DIR + lsFontFile);
poF.addFont(lfFile.getAbsolutePath(), BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
}
}
private static void doRenderToPDF(ITextRenderer renderer, String pdf) throws Exception {
try (OutputStream os = new FileOutputStream(pdf)) {
renderer.layout();
renderer.createPDF(os);
}
}
private static Document getSampleDocument() throws Exception {
String lsSampleGreekDoc = //
"<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">" //
+ "<html xmlns=\"http://www.w3.org/1999/xhtml\">\r\n" //
+ "<head>\r\n" //
+ "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\r\n" //
+ "<title>Test</title>\r\n"//
+ "</head>\r\n" //
+ "<body>\r\n" //
+ "<div style=\"font-family: Helvetica; font-size: 16pt;\">\r\n" //
+ " <del>Helvetica:</del>1¶ 2µ 3α 4β 5δ 6ε " //
+ "7ζ 8ω 9♥ 10µ 11Θ 12Ξ 13Σ\r\n" //
+ "</div>\r\n" //
+ "</body>\r\n" //
+ "</html>"; //
DocumentBuilder loDocumentBuilderXml = DocumentBuilderFactory.
newInstance().newDocumentBuilder();
loDocumentBuilderXml.setEntityResolver(FSEntityResolver.instance());
ByteArrayInputStream loArrayInputStream = new ByteArrayInputStream(lsSampleGreekDoc.getBytes(StandardCharsets.UTF_8));
return loDocumentBuilderXml.parse(loArrayInputStream);
}
}
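A possible reason for the difference, offered here as an assumption rather than something the answer states: the preloaded Helvetica is one of the standard PDF fonts, which is typically addressed through a single-byte Western encoding (Cp1252 / WinAnsi), while the replacement font above is registered with BaseFont.IDENTITY_H. The sketch below uses only the JDK, with no Flying Saucer or iText calls, to list which of the sample characters a Cp1252 encoder cannot represent. The class name Cp1252CoverageSketch and the method unencodable are made up for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

// Illustration only: this class is not part of Flying Saucer or iText.
class Cp1252CoverageSketch {

    /** Returns the characters of text that windows-1252 (Cp1252) cannot encode. */
    static List<String> unencodable(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        List<String> missing = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (!encoder.canEncode(ch)) {
                missing.add(ch);
            }
        });
        return missing;
    }

    public static void main(String[] args) {
        // Pilcrow, micro sign, alpha, beta, omega, heart, written as escapes
        // so the source file encoding does not matter.
        String sample = "\u00B6\u00B5\u03B1\u03B2\u03C9\u2665";
        System.out.println("Not encodable in Cp1252: " + unencodable(sample));
    }
}
```

If the assumption holds, the characters reported here are the ones that need a font registered with a Unicode-capable encoding, which is what the loadFont method above does.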

Related

How can I add source PDF content to destination PDF using iText 7 without losing the header and footer?

I am using iText 7. I have two PDF files. The source PDF has some content. The destination PDF has header and footer. I have a requirement to add the content from source PDF to destination PDF in the middle of the page without overlapping header and footer of the destination PDF. What should the code be?
Below is my code and attached document is the screenshot of the source PDF file which needs to be embedded in the final.pdf file:
import java.io.File;
import java.io.FileOutputStream;
import java.net.MalformedURLException;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import com.itextpdf.io.font.FontProgram;
import com.itextpdf.io.font.FontProgramFactory;
import com.itextpdf.io.font.PdfEncodings;
import com.itextpdf.io.image.ImageDataFactory;
import com.itextpdf.kernel.events.Event;
import com.itextpdf.kernel.events.IEventHandler;
import com.itextpdf.kernel.events.PdfDocumentEvent;
import com.itextpdf.kernel.font.PdfFont;
import com.itextpdf.kernel.font.PdfFontFactory;
import com.itextpdf.kernel.geom.PageSize;
import com.itextpdf.kernel.geom.Rectangle;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfPage;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.kernel.pdf.canvas.PdfCanvas;
import com.itextpdf.kernel.pdf.xobject.PdfFormXObject;
import com.itextpdf.layout.Canvas;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.Style;
import com.itextpdf.layout.borders.Border;
import com.itextpdf.layout.borders.SolidBorder;
import com.itextpdf.layout.element.Cell;
import com.itextpdf.layout.element.Image;
import com.itextpdf.layout.element.Paragraph;
import com.itextpdf.layout.element.Table;
import com.itextpdf.layout.element.Text;
import com.itextpdf.layout.font.FontProvider;
import com.itextpdf.layout.property.HorizontalAlignment;
import com.itextpdf.layout.property.TextAlignment;
import com.itextpdf.layout.property.UnitValue;
import com.itextpdf.layout.property.VerticalAlignment;
public class TestPdf {
public static void main(String[] args) {
String uuid = UUID.randomUUID().toString();
try {
@SuppressWarnings("resource")
PdfWriter writer = new PdfWriter(new FileOutputStream(new File(Paths.get("Output").toAbsolutePath()+"/final.pdf"))).setSmartMode(true);
PdfDocument pdfDoc = new PdfDocument(writer);
pdfDoc.setDefaultPageSize(PageSize.A4.rotate());
String fonts[] = {Paths.get("fonts").toAbsolutePath() + "/TREBUC.TTF", Paths.get("fonts").toAbsolutePath() + "/TREBUCBD.TTF", Paths.get("fonts").toAbsolutePath() + "/TREBUCBI.TTF",Paths.get("fonts").toAbsolutePath() + "/TREBUCIT.TTF"};
FontProvider fontProvider = new FontProvider();
Map<String, PdfFont> pdfFontMap = new HashMap<>();
for (String font : fonts) {
FontProgram fontProgram = FontProgramFactory.createFont(font);
if(font.endsWith("TREBUC.TTF")) {
pdfFontMap.put("NORMAL", PdfFontFactory.createFont(fontProgram, PdfEncodings.WINANSI, true));
} else if(font.endsWith("TREBUCBD.TTF")) {
pdfFontMap.put("BOLD", PdfFontFactory.createFont(fontProgram, PdfEncodings.WINANSI, true));
} else if(font.endsWith("TREBUCBI.TTF")) {
pdfFontMap.put("BOLD_ITALIC", PdfFontFactory.createFont(fontProgram, PdfEncodings.WINANSI, true));
} else if(font.endsWith("TREBUCIT.TTF")) {
pdfFontMap.put("ITALIC", PdfFontFactory.createFont(fontProgram, PdfEncodings.WINANSI, true));
}
fontProvider.addFont(fontProgram);
}
TestPdf testPdf = new TestPdf();
NormalPageHeader headerHandler = testPdf.new NormalPageHeader(Paths.get("images").toAbsolutePath() + "\\logo.png", pdfFontMap);
pdfDoc.addEventHandler(PdfDocumentEvent.START_PAGE, headerHandler);
PageEndEvent pageEndEvent = testPdf.new PageEndEvent(Paths.get("images").toAbsolutePath() + "\\FooterLineExternal.png" ,pdfFontMap);
pdfDoc.addEventHandler(PdfDocumentEvent.END_PAGE, pageEndEvent);
Document doc = new Document(pdfDoc);
doc.getPageEffectiveArea(PageSize.A4.rotate());
Table imageTable = new Table(1);
imageTable.setBorder(Border.NO_BORDER);
imageTable.setWidth(UnitValue.createPercentValue(100));
Cell cell = new Cell();
Paragraph paragraph = new Paragraph("Title");
paragraph.setVerticalAlignment(VerticalAlignment.TOP);
cell.add(paragraph);
cell.setBorder(Border.NO_BORDER);
cell.setPaddingTop(50);
imageTable.addCell(cell);
doc.add(imageTable);
doc.close();
System.out.println("Converted to PDF Successfully >>> convertedSvg_"+uuid+".pdf");
}
catch(Exception e){
e.printStackTrace();
System.out.println("Error Occurred while converting to PDF = " + e.getMessage());
}
}
class NormalPageHeader implements IEventHandler {
String header;
Map<String, PdfFont> font;
public NormalPageHeader(String header, Map<String, PdfFont> font) {
this.header = header;
this.font = font;
}
@Override
public void handleEvent(Event event) {
// Retrieve the document and the current page
PdfDocumentEvent docEvent = (PdfDocumentEvent) event;
PdfDocument pdf = docEvent.getDocument();
PdfPage page = docEvent.getPage();
Rectangle pageSize = page.getPageSize();
PdfCanvas pdfCanvas = new PdfCanvas(
page.getLastContentStream(), page.getResources(), pdf);
Canvas canvas = new Canvas(pdfCanvas, pdf, pageSize);
canvas.setFontSize(10f);
Table table = new Table(3);
table.setBorder(Border.NO_BORDER);
table.setWidth(UnitValue.createPercentValue(100));
Cell leftCell = new Cell();
leftCell.setFont(font.get("NORMAL"));
leftCell.setPaddingTop(15);
leftCell.setPaddingLeft(20);
leftCell.setBorder(Border.NO_BORDER);
leftCell.setBorderBottom(new SolidBorder(0.5f));
leftCell.setWidth(UnitValue.createPercentValue(33.3f));
Text userLabel = new Text("Username: ");
userLabel.setBold();
Paragraph paragraph = new Paragraph(userLabel);
Cell middleCell = new Cell();
middleCell.setFont(font.get("NORMAL"));
middleCell.setPaddingTop(15);
middleCell.setBorder(Border.NO_BORDER);
middleCell.setBorderBottom(new SolidBorder(0.5f));
middleCell.setWidth(UnitValue.createPercentValue(33.3f));
paragraph = new Paragraph("Main Header");
paragraph.setTextAlignment(TextAlignment.CENTER);
paragraph.setBold();
paragraph.setFontSize(12);
middleCell.add(paragraph);
String programString = "Sample header";
paragraph = new Paragraph(programString);
paragraph.setTextAlignment(TextAlignment.CENTER);
paragraph.setBold();
paragraph.setFontSize(10);
middleCell.add(paragraph);
table.addCell(middleCell);
Cell rightCell = new Cell();
rightCell.setFont(font.get("NORMAL"));
rightCell.setPaddingTop(20);
rightCell.setWidth(UnitValue.createPercentValue(33.3f));
rightCell.setHorizontalAlignment(HorizontalAlignment.RIGHT);
rightCell.setBorder(Border.NO_BORDER);
rightCell.setBorderBottom(new SolidBorder(0.5f));
rightCell.setPaddingRight(20);
//Write text at position
Image img;
try {
img = new Image(ImageDataFactory.create(header));
img.setHorizontalAlignment(HorizontalAlignment.RIGHT);
Style style = new Style();
style.setWidth(91);
style.setHeight(25);
img.addStyle(style);
rightCell.add(img);
table.addCell(rightCell);
table.setMarginLeft(15);
table.setMarginRight(15);
canvas.add(table);
}
catch (MalformedURLException e) {
e.printStackTrace();
}
}
}
class PageEndEvent implements IEventHandler {
protected PdfFormXObject placeholder;
protected float side = 20;
protected float x = 300;
protected float y = 10;
protected float space = 4.5f;
private String bar;
protected float descent = 3;
Map<String, PdfFont> font;
public PageEndEvent(String bar, Map<String, PdfFont> font) {
this.bar = bar;
this.font = font;
placeholder =new PdfFormXObject(new Rectangle(0, 0, side, side));
}
@Override
public void handleEvent(Event event) {
Table table = new Table(3);
table.setBorder(Border.NO_BORDER);
table.setWidth(UnitValue.createPercentValue(100));
Cell confCell = new Cell();
confCell.setFont(font.get("NORMAL"));
confCell.setPaddingTop(15);
confCell.setPaddingLeft(20);
confCell.setBorder(Border.NO_BORDER);
confCell.setBorderBottom(new SolidBorder(0.5f));
confCell.setWidth(UnitValue.createPercentValue(100));
PdfDocumentEvent docEvent = (PdfDocumentEvent) event;
PdfDocument pdf = docEvent.getDocument();
PdfPage page = docEvent.getPage();
Rectangle pageSize = page.getPageSize();
PdfCanvas pdfCanvas = new PdfCanvas(
page.getLastContentStream(), page.getResources(), pdf);
Canvas canvas = new Canvas(pdfCanvas, pdf, pageSize);
Image img;
try {
img = new Image(ImageDataFactory.create(bar));
img.setHorizontalAlignment(HorizontalAlignment.LEFT);
Style style = new Style();
style.setWidth(UnitValue.createPercentValue(100));
style.setHeight(50);
img.addStyle(style);
Paragraph p = new Paragraph().add("Test: Confidential");
p.setFont(font.get("NORMAL"));
p.setFontSize(8);
p.setFontColor(com.itextpdf.kernel.colors.ColorConstants.GRAY);
canvas.showTextAligned(p, x, y, TextAlignment.CENTER);
pdfCanvas.addXObject(placeholder, x + space, y - descent);
pdfCanvas.release();
}
catch (MalformedURLException e) {
e.printStackTrace();
}
}
public void writeTotal(PdfDocument pdf) {
Canvas canvas = new Canvas(placeholder, pdf);
canvas.showTextAligned(String.valueOf(pdf.getNumberOfPages()),
0, descent, TextAlignment.LEFT);
}
}
}

First off, some words on the iText architecture behind some of the constructs you use:
When you use a Document instance to add content that iText lays out automatically, iText assumes that the area available for layout is the whole page minus the page margins.
Thus, if you add further page material via channels other than the Document, as you do in your NormalPageHeader headerHandler and your PageEndEvent pageEndEvent, it is your responsibility to place it outside the layout area explained above, i.e. in the margin areas. (Unless that additional material is background material, like a watermark.)
For this you should set the margins large enough to guarantee that your additional material fits in them. By default the page margins are set to 36pt on each side of the page, which is usually enough for a single-line header or footer but not really for multi-line ones.
In your code you create a header which requires at least some 52pt, plus a bit more to prevent the content iText lays out from touching the header line.
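To make the margin arithmetic concrete, here is a minimal sketch in plain Java with no iText classes. It assumes an A4 page rotated to landscape (842 x 595 pt), the 36pt default margins mentioned above, and a top margin of 55pt as one value that clears the roughly 52pt header. The class name LayoutAreaSketch and the method layoutArea are hypothetical.

```java
// Illustration only: plain arithmetic, no iText classes involved.
class LayoutAreaSketch {

    // A4 rotated to landscape, in points (1 pt = 1/72 inch).
    static final float PAGE_WIDTH = 842f;
    static final float PAGE_HEIGHT = 595f;

    /** Returns {width, height} of the area left after subtracting the margins. */
    static float[] layoutArea(float top, float right, float bottom, float left) {
        return new float[] { PAGE_WIDTH - left - right, PAGE_HEIGHT - top - bottom };
    }

    public static void main(String[] args) {
        float[] defaults = layoutArea(36f, 36f, 36f, 36f);
        float[] tallerTop = layoutArea(55f, 36f, 36f, 36f);
        System.out.println("default margins: " + defaults[0] + " x " + defaults[1]);
        System.out.println("top margin 55:   " + tallerTop[0] + " x " + tallerTop[1]);
    }
}
```

Raising the top margin from 36pt to 55pt shrinks the height available for automatic layout from 523pt to 504pt, and the 19pt given up is what keeps the laid-out content below the header.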
Keeping that in mind, it is pretty straightforward to insert a given PdfPage sourcePage into your page:
...
NormalPageHeader headerHandler = testPdf.new NormalPageHeader(Paths.get("images").toAbsolutePath() + "\\logo.png", pdfFontMap);
pdfDoc.addEventHandler(PdfDocumentEvent.START_PAGE, headerHandler);
PageEndEvent pageEndEvent = testPdf.new PageEndEvent(Paths.get("images").toAbsolutePath() + "\\FooterLineExternal.png" ,pdfFontMap);
pdfDoc.addEventHandler(PdfDocumentEvent.END_PAGE, pageEndEvent);
Document doc = new Document(pdfDoc);
doc.setTopMargin(55);
PdfFormXObject xobject = sourcePage.copyAsFormXObject(pdfDoc);
Rectangle xobjectBoundaryBox = xobject.getBBox().toRectangle();
xobject.getPdfObject().put(PdfName.Matrix, new PdfArray(new float[] {1, 0, 0, 1, -xobjectBoundaryBox.getLeft(), -xobjectBoundaryBox.getBottom()}));
Image image = new Image(xobject);
image.setAutoScale(true);
doc.add(image);
doc.close();
...
(excerpt from InsertInSpace helper insertIntoNithinTestFile)
If you use the original source page as is, the above code will insert it including all its margin space. If you don't want this and would rather cut that space off, you can proceed as follows: determine the actual bounding box of the page content, reduce the page to that box, and pass it to the method insertIntoNithinTestFile above. The following assumes that page 1 of PdfDocument pdfDocument is to be processed:
PdfDocumentContentParser contentParser = new PdfDocumentContentParser(pdfDocument);
MarginFinder strategy = contentParser.processContent(1, new MarginFinder());
PdfPage page = pdfDocument.getPage(1);
page.setCropBox(strategy.getBoundingBox());
page.setMediaBox(strategy.getBoundingBox());
insertIntoNithinTestFile(page, "test-InsertIntoNithinTestFile.pdf");
(InsertInSpace test testInsertSimpleTestPdf)
The MarginFinder is a port of the iText5 MarginFinder to iText 7:
public class MarginFinder implements IEventListener {
public Rectangle getBoundingBox() {
return boundingBox != null ? boundingBox.clone() : null;
}
@Override
public void eventOccurred(IEventData data, EventType type) {
if (data instanceof ImageRenderInfo) {
ImageRenderInfo imageData = (ImageRenderInfo) data;
Matrix ctm = imageData.getImageCtm();
for (Vector unitCorner : UNIT_SQUARE_CORNERS) {
Vector corner = unitCorner.cross(ctm);
addToBoundingBox(new Rectangle(corner.get(Vector.I1), corner.get(Vector.I2), 0, 0));
}
} else if (data instanceof TextRenderInfo) {
TextRenderInfo textRenderInfo = (TextRenderInfo) data;
addToBoundingBox(textRenderInfo.getAscentLine().getBoundingRectangle());
addToBoundingBox(textRenderInfo.getDescentLine().getBoundingRectangle());
} else if (data instanceof PathRenderInfo) {
PathRenderInfo renderInfo = (PathRenderInfo) data;
if (renderInfo.getOperation() != PathRenderInfo.NO_OP)
{
Matrix ctm = renderInfo.getCtm();
Path path = renderInfo.getPath();
for (Subpath subpath : path.getSubpaths())
{
for (Point point2d : subpath.getPiecewiseLinearApproximation())
{
Vector vector = new Vector((float)point2d.getX(), (float)point2d.getY(), 1);
vector = vector.cross(ctm);
addToBoundingBox(new Rectangle(vector.get(Vector.I1), vector.get(Vector.I2), 0, 0));
}
}
}
} else if (data != null) {
logger.fine(String.format("Ignored %s event, class %s.", type, data.getClass().getSimpleName()));
} else {
logger.fine(String.format("Ignored %s event with null data.", type));
}
}
@Override
public Set<EventType> getSupportedEvents() {
return null;
}
void addToBoundingBox(Rectangle rectangle) {
if (boundingBox == null)
boundingBox = rectangle.clone();
else
boundingBox = Rectangle.getCommonRectangle(boundingBox, rectangle);
}
Rectangle boundingBox = null;
Logger logger = Logger.getLogger(MarginFinder.class.getName());
static List<Vector> UNIT_SQUARE_CORNERS = Arrays.asList(new Vector(0,0,1), new Vector(1,0,1), new Vector(1,1,1), new Vector(0,1,1));
}
(MarginFinder.java)
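The accumulation step in addToBoundingBox can be looked at in isolation. The following is a minimal sketch in plain Java, assuming a rectangle is described by its lower-left corner plus width and height; Box is a hypothetical stand-in for iText's Rectangle, and union only mimics what the MarginFinder relies on Rectangle.getCommonRectangle to do.

```java
// Illustration only: Box is a made-up stand-in, not an iText class.
class BoundingBoxSketch {

    static final class Box {
        final float x, y, width, height;

        Box(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        /** Smallest box that contains both a and b. */
        static Box union(Box a, Box b) {
            float left = Math.min(a.x, b.x);
            float bottom = Math.min(a.y, b.y);
            float right = Math.max(a.x + a.width, b.x + b.width);
            float top = Math.max(a.y + a.height, b.y + b.height);
            return new Box(left, bottom, right - left, top - bottom);
        }
    }

    private Box boundingBox; // stays null until the first box is added

    void add(Box box) {
        boundingBox = (boundingBox == null) ? box : Box.union(boundingBox, box);
    }

    Box get() {
        return boundingBox;
    }

    public static void main(String[] args) {
        BoundingBoxSketch sketch = new BoundingBoxSketch();
        sketch.add(new Box(100f, 700f, 0f, 0f));  // a single point, e.g. an image corner
        sketch.add(new Box(300f, 650f, 0f, 0f));  // another point, e.g. a path vertex
        sketch.add(new Box(72f, 720f, 200f, 12f)); // a text line box
        Box b = sketch.get();
        System.out.println(b.x + ", " + b.y + ", " + b.width + ", " + b.height);
    }
}
```

Points are added as zero-sized boxes, mirroring how the MarginFinder feeds image corners and path vertices into the same method it uses for text lines.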

Add a watermark on a pdf that contains images using pdfbox (1.7)

I have used the code suggested in "PDFBox Overlay fails" to add a watermark to an existing PDF.
Unfortunately, the PDF produced is corrupted. The PDF reader complains when I open the document: "An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem".
The document opens, but it does not show the images.
It seems to happen with all PDFs. It may be worth noting that it also happens with a different implementation that simply uses the Overlay class.
The following url points to a pdf that I used for my testing:
A pdf with an image
The code to test this transformation is:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDStream;
import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
import org.apache.pdfbox.pdmodel.graphics.PDExtendedGraphicsState;
import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectForm;
import org.apache.pdfbox.util.MapUtil;
/**
* This test is about overlaying with special effect.
*
* @author mkl
*/
public class OverlayWithEffect
{
final static File RESULT_FOLDER = new File("target/test-outputs", "assembly");
public static void overlayWithDarkenBlendMode(PDDocument document, PDDocument overlay) throws IOException
{
PDXObjectForm xobject = importAsXObject(document, (PDPage) overlay.getDocumentCatalog().getAllPages().get(0));
PDExtendedGraphicsState darken = new PDExtendedGraphicsState();
darken.getCOSDictionary().setName("BM", "Darken");
List<PDPage> pages = document.getDocumentCatalog().getAllPages();
for (PDPage page: pages)
{
if (page.getResources() == null) {
page.setResources(page.findResources());
}
if (page.getResources() != null) {
Map<String, PDExtendedGraphicsState> states = page.getResources().getGraphicsStates();
if (states == null) {
states = new HashMap<String, PDExtendedGraphicsState>();
}
String darkenKey = MapUtil.getNextUniqueKey(states, "Dkn");
states.put(darkenKey, darken);
page.getResources().setGraphicsStates(states);
PDPageContentStream stream = new PDPageContentStream(document, page, true, false, true);
stream.appendRawCommands(String.format("/%s gs ", darkenKey));
stream.drawXObject(xobject, 0, 0, 1, 1);
stream.close();
}
}
}
public static PDXObjectForm importAsXObject(PDDocument target, PDPage page) throws IOException
{
final PDStream xobjectStream = new PDStream(target, page.getContents().createInputStream(), false);
final PDXObjectForm xobject = new PDXObjectForm(xobjectStream);
xobject.setResources(page.findResources());
xobject.setBBox(page.findCropBox());
COSDictionary group = new COSDictionary();
group.setName("S", "Transparency");
group.setBoolean(COSName.getPDFName("K"), true);
xobject.getCOSStream().setItem(COSName.getPDFName("Group"), group);
return xobject;
}
public static void main(String[] args) throws COSVisitorException, IOException
{
InputStream sourceStream = new FileInputStream("x:/pdf-test.pdf");
InputStream overlayStream = new FileInputStream("x:/draft.pdf");
try {
final PDDocument document = PDDocument.load(sourceStream);
final PDDocument overlay = PDDocument.load(overlayStream);
overlayWithDarkenBlendMode(document, overlay);
document.save("x:/da-draft-5.pdf");
document.close();
}
finally {
sourceStream.close();
overlayStream.close();
}
}
}
I am using version 1.7 of pdfbox.
Thanks

As suggested by mkl, it is probably an issue with the version of pdfbox that I am using.

How to generate a valid PDF/A file using iText and XMLWorker (HTML to PDF/A process)

I'm currently developing a method that will accept HTML input and convert it into a valid PDF/A file. I know how to programmatically construct a valid PDF/A file using iText (reference: http://itextsupport.com/download/pdfa3.html) but I'm unable to generate a valid PDF/A file using HTML as input and using XMLWorker to transform this input into a PDF file. The problem that I have right now is due to the embedded fonts requirement of the PDF/A format. I always get this exception:
Exception in thread "main" com.itextpdf.text.pdf.PdfAConformanceException: All the fonts must be embedded. This one isn't: Helvetica
I try to force the fonts the HTML input uses via a CSS file, and I register the fonts I want to use in the output PDF file via the XMLWorkerFontProvider class, but I seem to be doing something wrong because the exception above is always thrown.
What else do I need to do so that XMLWorker uses the fonts registered via the XMLWorkerFontProvider class? I want to avoid the default font Helvetica being used for any HTML element in the input.
Below is the code I'm using for testing:
style.css (just 1 line):
* { font: normal 100% Arial, sans-serif !important; }
Main.java:
package com.itextpdf;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.ICC_Profile;
import com.itextpdf.text.pdf.PdfAConformanceLevel;
import com.itextpdf.text.pdf.PdfAWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerFontProvider;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.css.CssFile;
import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
import com.itextpdf.tool.xml.html.CssAppliers;
import com.itextpdf.tool.xml.html.CssAppliersImpl;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;
public class Main {
/**
* @param args
*/
public static void main(String[] args) {
StringBuffer buf = new StringBuffer();
buf.append("<!DOCTYPE html>");
buf.append("<html>");
buf.append("<head>");
buf.append("<title>Test</title>");
buf.append("</head>");
buf.append("<body>");
buf.append("<p>This is a test</p>");
buf.append("</body>");
buf.append("</html>");
OutputStream file = null;
Document document = null;
PdfAWriter writer = null;
try {
file = new FileOutputStream(new File("C:\\Users\\amartin\\Desktop\\Test.pdf"));
document = new Document();
writer = PdfAWriter.getInstance(document, file, PdfAConformanceLevel.PDF_A_1B);
// Create XMP metadata. It's a PDF/A requirement.
writer.createXmpMetadata();
document.open();
// Set output intent. PDF/A requirement.
ICC_Profile icc = ICC_Profile.getInstance(new FileInputStream("./src/main/resources/com/itextpdf/sRGB Color Space Profile.icm"));
writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
// CSS
CSSResolver cssResolver = new StyleAttrCSSResolver();
CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream("./css/style.css"));
cssResolver.addCss(cssFile);
XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider();
fontProvider.register("./fonts/arial.ttf");
fontProvider.register("./fonts/sans-serif.ttf");
fontProvider.addFontSubstitute("lowagie", "garamond");
CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
// Pipelines
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
Reader reader = new StringReader(buf.toString());
p.parse(reader);
} catch (Exception e) {
e.printStackTrace();
} finally {
if (document != null && document.isOpen())
document.close();
try {
if (file != null)
file.close();
} catch (IOException e) {}
if (writer != null && !writer.isCloseStream())
writer.close();
}
}
}
edit:
In reply to Bruno: I have extended the FontFactoryImp class, overriding the getFont() method (the one that takes all the arguments). It calls System.out.println like this:
System.out.println("=fontname: " + fontname + " =encoding: " + encoding + " =embedded : " + embedded + " =size: " + size + " =style: " + style + " =BaseColor: " + color)
and then calls the parent's getFont() method with the same arguments. The only output I see is this:
=fontname: null =encoding: Cp1252 =embedded : true =size: -1.0 =style: -1 =BaseColor: null
=fontname: null =encoding: Cp1252 =embedded : true =size: -1.0 =style: -1 =BaseColor: null
followed by the exception quoted above.

Based on the output you're sending to System.out, it seems that XML Worker doesn't pick up the font family you want to use.
Please specify the font family like this:
font-family: "Arial"
Using the 'font' shorthand in CSS may work, but it's tricky. I think iText sees normal and interprets it as "use the default font".

The complete code that makes this example work is the following:
style.css:
* {
font-family: "Arial";
font-style: normal;
}
Main.java:
package com.itextpdf;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.ICC_Profile;
import com.itextpdf.text.pdf.PdfAConformanceLevel;
import com.itextpdf.text.pdf.PdfAWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.css.CssFile;
import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
import com.itextpdf.tool.xml.html.CssAppliers;
import com.itextpdf.tool.xml.html.CssAppliersImpl;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;
public class Main {
public static void main(String[] args) {
StringBuffer buf = new StringBuffer();
String title = "Test";
// Sample HTML content.
buf.append("<!DOCTYPE html>");
buf.append("<html>");
buf.append("<head>");
buf.append("<title>" + title + "</title>");
buf.append("</head>");
buf.append("<body>");
buf.append("<p>This is a test</p>");
buf.append("</body>");
buf.append("</html>");
OutputStream file = null;
Document document = null;
PdfAWriter writer = null;
try {
file = new FileOutputStream(new File("C:\\Users\\amartin\\Desktop\\Test.pdf"));
document = new Document();
writer = PdfAWriter.getInstance(document, file, PdfAConformanceLevel.PDF_A_1B);
// Avoid discrepancies between document title and XMP metadata information.
document.addTitle(title);
// Create XMP metadata. It's a PDF/A requirement.
writer.createXmpMetadata();
document.open();
// Set output intent. PDF/A requirement.
ICC_Profile icc = ICC_Profile.getInstance(new FileInputStream("./src/main/resources/com/itextpdf/sRGB Color Space Profile.icm"));
writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
// CSS stylesheet.
CSSResolver cssResolver = new StyleAttrCSSResolver();
CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream("./css/style.css"));
cssResolver.addCss(cssFile);
MyFontProvider fontProvider = new MyFontProvider();
fontProvider.register("./fonts/arial.ttf");
/* DEBUG
System.out.println("Fonts present in " + fontProvider.getClass().getName());
Set<String> registeredFonts = fontProvider.getRegisteredFonts();
for (String font : registeredFonts)
System.out.println(font);
*/
CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
// Pipelines.
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
Reader reader = new StringReader(buf.toString());
p.parse(reader);
} catch (Exception e) {
e.printStackTrace();
} finally {
if (document != null && document.isOpen())
document.close();
try {
if (file != null)
file.close();
} catch (IOException e) {}
if (writer != null && !writer.isCloseStream())
writer.close();
}
}
}
MyFontProvider.java:
package com.itextpdf;
import com.itextpdf.text.BaseColor;
import com.itextpdf.text.Font;
import com.itextpdf.text.FontFactoryImp;
public class MyFontProvider extends FontFactoryImp {
@Override
public Font getFont(String fontname, String encoding, boolean embedded,
float size, int style, BaseColor color) {
System.out.println("=fontname: " + fontname + " =encoding: " + encoding + " =embedded : " + embedded + " =size: " + size + " =style: " + style + " =BaseColor: " + color);
return super.getFont(fontname, encoding, embedded, size, style, color);
}
}
Again, thank you, Bruno. I'm really glad to get your help here :)

create or fill a pdf with monodroid

I'm looking for a way to create a PDF file with Monodroid. It could also be a PDF form that I fill in with content. I tried different libraries such as pdfsharp_on_mono and iTextSharp, but they don't work. Creating a new empty PDF file is no problem, but whenever I try to add content, I get errors.
My goal is a PDF file that will later be filled from an XML file. At the moment I would be happy if I could just create a PDF and "write" something into it.
Does anyone have a hint on how I can do this? I'm a real noob in Monodroid.
If you need code or error messages, just say so. I have tried different solutions.
cheers
anna
ps: sorry for my bad english.

How to create PDFs in an Android app?
You can write a Java class that does all the PDF work using an existing Java library, then use JNIEnv to create a proxy class for it and call it from managed code.

Using the iText library, we can create a PDF file in an Android application.
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import com.itextpdf.text.Anchor;
import com.itextpdf.text.BadElementException;
import com.itextpdf.text.Chapter;
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Element;
import com.itextpdf.text.Font;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.Phrase;
import com.itextpdf.text.Section;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.text.pdf.PdfPTable;
import com.itextpdf.text.pdf.PdfWriter;
import android.support.v7.app.ActionBarActivity;
import android.os.Bundle;
import android.os.Environment;
import android.view.Menu;
import android.view.MenuItem;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
public class CreatePdf extends ActionBarActivity implements OnClickListener {
private static String FILE = Environment.getExternalStorageDirectory()
.getAbsolutePath() + "/filename.pdf";
private static Font catFont = new Font(Font.FontFamily.TIMES_ROMAN, 18,
Font.BOLD);
private static Font subFont = new Font(Font.FontFamily.TIMES_ROMAN, 16,
Font.BOLD);
Button createPdf;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_create_pdf);
createPdf = (Button) findViewById(R.id.createBtn);
createPdf.setOnClickListener(this);
}
private void createPdf() {
try {
Document document = new Document();
PdfWriter.getInstance(document, new FileOutputStream(FILE));
document.open();
addContent(document);
document.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (DocumentException e) {
e.printStackTrace();
}
}
private static void addContent(Document document) throws DocumentException {
Anchor anchor = new Anchor("Anchor", catFont);
anchor.setName("Hello PDF");
// The constructor argument is the chapter number
Chapter catPart = new Chapter(0);
Paragraph subPara = new Paragraph("Android PDF Created", subFont);
addEmptyLine(subPara, 1);
Section subCatPart = catPart.addSection(subPara);
Paragraph paragraph = new Paragraph();
addEmptyLine(paragraph, 5);
// subCatPart.add(paragraph);
// Add a table
createTable(subCatPart);
// Now add all this to the document
document.add(catPart);
}
private static void createTable(Section subCatPart)
throws BadElementException {
PdfPTable table = new PdfPTable(4);
PdfPCell c1 = new PdfPCell(new Phrase("Cell 1"));
c1.setHorizontalAlignment(Element.ALIGN_CENTER);
table.addCell(c1);
c1 = new PdfPCell(new Phrase("Cell 2"));
c1.setHorizontalAlignment(Element.ALIGN_CENTER);
table.addCell(c1);
c1 = new PdfPCell(new Phrase("Cell 3"));
c1.setHorizontalAlignment(Element.ALIGN_CENTER);
table.addCell(c1);
c1 = new PdfPCell(new Phrase("Cell 4"));
c1.setHorizontalAlignment(Element.ALIGN_CENTER);
table.addCell(c1);
subCatPart.add(table);
}
// method to add empty line
private static void addEmptyLine(Paragraph paragraph, int number) {
for (int i = 0; i < number; i++) {
paragraph.add(new Paragraph(" "));
}
}
@Override
public boolean onCreateOptionsMenu(Menu menu) {
// Inflate the menu; this adds items to the action bar if it is present.
getMenuInflater().inflate(R.menu.create_pdf, menu);
return true;
}
@Override
public boolean onOptionsItemSelected(MenuItem item) {
// Handle action bar item clicks here. The action bar will
// automatically handle clicks on the Home/Up button, so long
// as you specify a parent activity in AndroidManifest.xml.
int id = item.getItemId();
if (id == R.id.action_settings) {
return true;
}
return super.onOptionsItemSelected(item);
}
@Override
public void onClick(View v) {
switch (v.getId()) {
case R.id.createBtn:
createPdf();
break;
default:
break;
}
}
}
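
Regarding the goal in the question of filling the PDF from an XML file later: a minimal sketch, using only the standard Java XML parser, of reading field values into a map before writing them to the document. The XML layout (`<field name="...">value</field>`) and the names `XmlFieldReader` and `readFields` are assumptions made for illustration; they do not come from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: the class name and the XML layout are assumptions.
final class XmlFieldReader {
    private XmlFieldReader() {}

    /** Collects every <field name="...">value</field> element into an ordered map. */
    static Map<String, String> readFields(String xml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = dom.getElementsByTagName("field");
        // LinkedHashMap keeps the fields in document order.
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element field = (Element) nodes.item(i);
            fields.put(field.getAttribute("name"), field.getTextContent().trim());
        }
        return fields;
    }
}
```

Each map entry could then be written into the table the same way createTable does above, for example with new PdfPCell(new Phrase(value)).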

Lucene, highlighting and NullPointerException

I am trying to highlight some search results. I index the body (the text) of my documents in the field "contents", and when I try to highlight using highlighter.getBestFragment(...) I get a NullPointerException.
But when, for example, I try to highlight the file name, it works properly.
I know that because I build that field from a FileReader (or ParsingReader), my text is tokenized, which is different from a file name.
Here's my code; please help me.
package xxxxxx;
import java.io.File;
import java.io.FileFilter;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.apache.tika.parser.ParsingReader;
public class Indexer {
static long start = 0;
public static void main(String[] args) throws Exception {
// Validate the arguments before reading them.
if (args.length != 2) {
throw new IllegalArgumentException("Usage: java " + Indexer.class.getName()
+ " <index dir> <data dir>");
}
System.out.println("l'index se trouve à " + args[0]);
System.out.println("le dossier ou s'effectue l'indexation est :" + args[1]);
String indexDir = args[0];
String dataDir = args[1];
start = System.currentTimeMillis();
Indexer indexer = new Indexer(indexDir);
int numIndexed;
try {
numIndexed = indexer.index(dataDir, new TextFilesFilter());
} finally {
indexer.close();
}
long end = System.currentTimeMillis();
System.out.println("Indexing " + numIndexed + " files took "
+ (end - start) + " milliseconds");
}
private IndexWriter writer;
public Indexer(String indexDir) throws IOException, InterruptedException {
Directory dir = FSDirectory.open(new File(indexDir));
writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30), true,
IndexWriter.MaxFieldLength.UNLIMITED);
writer.setUseCompoundFile(true);
}
public void close() throws IOException {
writer.optimize();
writer.close();
}
public int index(String dataDir, FileFilter filter) throws Exception {
File[] files = new File(dataDir).listFiles();
for (File f : files) {
if (!f.isDirectory() && !f.isHidden() && f.exists() && f.canRead() && (filter == null || filter.accept(f))) {
if (!(f.getCanonicalPath().endsWith("~"))) {
indexFile(f);
}
} else if (f.isDirectory()) {
index(f.toString(), filter);
}
}
return writer.numDocs();
}
private static class TextFilesFilter implements FileFilter {
public boolean accept(File path) {
return true;
}
}
protected Document getDocument(File f) throws Exception {
// FileReader frf = new FileReader(f);
Document doc = new Document();
Reader reader = new ParsingReader(f);
doc.add(new Field("contents", reader, Field.TermVector.WITH_POSITIONS_OFFSETS));
doc.add(new Field("filename", f.getName(), Field.Store.YES, Field.Index.ANALYZED ));
doc.add(new Field("fullpath", f.getCanonicalPath(),Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
return doc;
}
private void indexFile(File f) throws Exception {
System.out.println("Indexing " + f.getCanonicalPath());
Document doc = getDocument(f);
writer.addDocument(doc);
System.out.println(System.currentTimeMillis() - start);
}
}
-------------------------------------------------------------------
package xxxxxxxxxxxxxxxxxxxx;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.DisjunctionMaxQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleSpanFragmenter;
import org.apache.lucene.search.highlight.TokenSources;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
public class Searcher {
public static void main(String[] args) throws IllegalArgumentException,
IOException, ParseException, InvalidTokenOffsetsException {
// Validate the arguments before reading them.
if (args.length != 2) {
throw new IllegalArgumentException("Usage: java "
+ Searcher.class.getName()
+ " <index dir> <query>");
}
System.out.println("endroit ou se situe l'index " + args[0]);
System.out.println(args[1]);
String indexDir = args[0];
String q = args[1];
search(indexDir, q);
}
public static void search(String indexDir, String q) throws IOException, ParseException, InvalidTokenOffsetsException {
Directory dir = FSDirectory.open(new File(indexDir));
IndexSearcher indexSearcher = new IndexSearcher(dir);
QueryParser parserC = new QueryParser(Version.LUCENE_30, "contents", new StandardAnalyzer(Version.LUCENE_30));
// QueryParser parserN = new QueryParser(Version.LUCENE_30, "filename", new StandardAnalyzer(Version.LUCENE_30));
QueryParser parserP = new QueryParser(Version.LUCENE_30, "fullpath", new StandardAnalyzer(Version.LUCENE_30));
parserC.setDefaultOperator(QueryParser.Operator.OR);
// parserN.setDefaultOperator(QueryParser.Operator.OR);
parserC.setPhraseSlop(10);
// parserN.setPhraseSlop(10);
DisjunctionMaxQuery dmq = new DisjunctionMaxQuery(6);
Query query = new MultiFieldQueryParser(Version.LUCENE_30, new String[]{"contents", "filename"},
new CustomAnalyzer()).parse(q);
Query queryC = parserC.parse(q);
//Query queryN = parserN.parse(q);
dmq.add(queryC);
//dmq.add(queryN);
// dmq.add(query) ;
QueryScorer scorer = new QueryScorer(dmq, "contents");
Highlighter highlighter = new Highlighter(scorer);
highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
System.out.println(query.toString());
long start = System.currentTimeMillis();
TopDocs hits = indexSearcher.search(dmq, 15);
System.out.println(hits.totalHits);
long end = System.currentTimeMillis();
System.err.println("Found " + hits.totalHits
+ " document(s) (in " + (end - start)
+ " milliseconds) that matched query '"
+ q + "':");
for (ScoreDoc scoreDoc : hits.scoreDocs) {
Document doc = indexSearcher.doc(scoreDoc.doc);
System.out.print(scoreDoc.score);
System.out.println(doc.get("fullpath"));
String contents = doc.get("contents"); // I am pretty sure the mistake is here: contents is always null.
// But what can I do to make this work?
TokenStream stream =
TokenSources.getAnyTokenStream(indexSearcher.getIndexReader(),
scoreDoc.doc,
"contents",
doc,
new StandardAnalyzer(Version.LUCENE_30));
String fragment =
highlighter.getBestFragment(stream, contents);
System.out.println(fragment);
}
indexSearcher.close();
}
}
----------------------------------------------------------------------

The field needs to be stored if you want to use that highlighter. "filename" is stored (Field.Store.YES) but "contents" is built from a Reader and is not stored, so doc.get("contents") returns null. That is why the two fields behave differently:
doc.add(new Field("contents", reader, Field.TermVector.WITH_POSITIONS_OFFSETS));
doc.add(new Field("filename", f.getName(), Field.Store.YES, Field.Index.ANALYZED ));
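
A minimal sketch of one way to act on this, assuming the extracted text of each file fits in memory: read the Reader fully into a String, then index that String as a stored field. The helper name `ReaderText.readAll` is hypothetical. The Field call is shown only as a comment, so the block compiles without Lucene on the classpath; it uses the String-based constructor because a Reader-based field is not stored in Lucene 3.0.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: not part of Lucene or Tika.
final class ReaderText {
    private ReaderText() {}

    /** Reads everything from the reader into a String and closes the reader. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192];
        try (Reader in = reader) {
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Possible use inside getDocument(File f), under the assumption above:
    //   String text = ReaderText.readAll(new ParsingReader(f));
    //   doc.add(new Field("contents", text, Field.Store.YES, Field.Index.ANALYZED,
    //           Field.TermVector.WITH_POSITIONS_OFFSETS));
}
```

Storing the full text makes the index larger, which is the trade-off for being able to pass doc.get("contents") to highlighter.getBestFragment(...).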
