Flying Saucer ignores embedded font when generating PDF - pdf

I try to generate PDF using Flying Saucer for the following HTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<style type="text/css">
@font-face {
font-family: 'Roboto';
font-style: normal;
font-weight: 300;
src:url(data:application/font-woff;charset=utf-8;base64,d09GRgABAAAAAGSwABMAAAAAtfwAAQAAAAA...zubbzBiN9B2+6bK8AAAABV9JwXgAA) format('woff');
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: Identity-H;
}
body {
font-family: Roboto;
font-size: 26px;
font-weight: 300;
letter-spacing: -.03em;
}
</style>
</head>
<body>
<p>The quick brown fox jumps over the lazy dog</p>
</body>
</html>
where the base64 font is taken from https://gist.github.com/abelaska/9c9eda70d31315f27a564be2ee490cf4
ITextRenderer renderer = new ITextRenderer();
renderer.setDocumentFromString(data);
renderer.layout();
renderer.createPDF(os);
When I check the fonts being used in the Properties from Adobe Reader, Times New Roman is listed instead of the font above.
If I use path to the font instead in the css, the PDF correctly shows the font.
src:url(/usr/local/Roboto.woff)
Can someone let me know what I am missing, or is this Flying Saucer limitation?

Flying Saucer uses iText (but not the latest version) under the hood.
A lot of font- and HTML-related issues were solved in iText7; that was actually a large part of why we moved to iText7 in the first place.
Try pdfHTML. It's an iText7 add-on that allows you to convert HTML5 (+CSS3) to PDF. It's AGPL licensed and open source. (Or rather, we are currently in the process of open sourcing it).
https://itextpdf.com/itext7/pdfHTML
This is some example code:
// load license
LicenseKey.loadLicenseFile("path_to_license_key.xml");
// conversion
HtmlConverter.convertToPdf(
"<b>This text should be written in bold.</b>",
new PdfWriter(new File("C://users/user5733033/output.pdf"))
);
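If moving to iText7 is not an option, one workaround worth trying follows from the question's own observation that a file path in src works where the data: URI does not: decode the embedded font to a temporary file and reference that file in the CSS. The sketch below uses only the JDK. The class and method names (DataUriFontExtractor, decode, writeToTempFile) are hypothetical, and whether Flying Saucer then picks up the font is an assumption based on the path-based result reported in the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper, not part of Flying Saucer or iText.
public class DataUriFontExtractor {

    /** Decodes the payload of a base64 data: URI into raw bytes. */
    public static byte[] decode(String dataUri) {
        int comma = dataUri.indexOf(',');
        if (!dataUri.startsWith("data:") || comma < 0) {
            throw new IllegalArgumentException("Not a data: URI");
        }
        // Header looks like "application/font-woff;charset=utf-8;base64"
        String header = dataUri.substring("data:".length(), comma);
        if (!header.endsWith(";base64")) {
            throw new IllegalArgumentException("Only base64 data: URIs are handled");
        }
        return Base64.getDecoder().decode(dataUri.substring(comma + 1));
    }

    /** Writes the decoded font to a temp file, e.g. with suffix ".woff". */
    public static Path writeToTempFile(String dataUri, String suffix) throws IOException {
        Path file = Files.createTempFile("font-", suffix);
        Files.write(file, decode(dataUri));
        return file;
    }
}
```

The path returned by writeToTempFile can then replace the data: URI in the src declaration before the HTML is passed to setDocumentFromString.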

Related

Changing font in datatables pdfmaker extension

I spent a whole day googling and searching StackOverflow for a way to change the font in PDF exports of dataTables, but I didn't find a solution. When I export the table to PDF, the script fonts come out undecipherable. Just look at the picture below. It shows two columns from my table.
Both the dataTables forum and the pdfMake documentation are vague. Can anyone please help me solve this problem? I need to specify a font for the pdfMake extension of datatables.
The following is what vfs_fonts.js looks like:
this.pdfMake = this.pdfMake || {}; this.pdfMake.vfs = {
pdfMake.fonts = {
Vazir: {
normal: 'Vazir-FD.ttf',
bold: 'Vazir-FD.ttf',
italics: 'Vazir-FD.ttf',
bolditalics: 'Vazir-FD.ttf'
}
};
}
The following is also my buttons block of my datatables:
buttons: [
{ extend: 'pdfHtml5', exportOptions:
{ columns: [0, 1, 2, 3, 4, 5, 6] },
customize: function (doc) {
doc.defaultStyle.font = Vazir},
},
]
Note that in the above block of code, when I add 'customize' block, the pdfMaker button won't prepare any pdf reports; without it, it works, however, the fonts are undecipherable.
Thanks in advance.
Here is a solution.
The DataTable Code
The HTML is as follows:
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Export to PDF</title>
<script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
<script src="https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js"></script>
<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/1.10.20/css/jquery.dataTables.min.css">
<link rel="stylesheet" type="text/css" href="https://datatables.net/media/css/site-examples.css">
<!-- buttons -->
<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/buttons/1.6.1/css/buttons.dataTables.min.css">
<script src="https://code.jquery.com/jquery-3.3.1.js"></script>
<script src="https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js"></script>
<script src="https://cdn.datatables.net/buttons/1.6.1/js/dataTables.buttons.min.js"></script>
<script src="https://cdn.datatables.net/buttons/1.6.1/js/buttons.flash.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jszip/3.1.3/jszip.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/pdfmake/0.1.53/pdfmake.min.js"></script>
<!--
<script src="https://cdnjs.cloudflare.com/ajax/libs/pdfmake/0.1.53/vfs_fonts.js"></script>
Build a custom local version of the vfs_fonts.js file, containing whatever fonts you need. The
structure of the file is this:
this.pdfMake = this.pdfMake || {}; this.pdfMake.vfs = {
"arial.ttf": "AAEAA...MYXRu",
"another_one.ttf": "XXXX...XXXX"
};
Replace the "AAEAA...MYXRu" with the base64-encoded string of your font file.
You can use this site to generate the string: https://dataurl.sveinbjorn.org/#dataurlmaker
-->
<script src="vfs_fonts.js"></script>
<script src="https://cdn.datatables.net/buttons/1.6.1/js/buttons.html5.min.js"></script>
<script src="https://cdn.datatables.net/buttons/1.6.1/js/buttons.print.min.js"></script>
</head>
<body>
<div style="margin: 20px;">
<table id="example" class="display nowrap dataTable cell-border" style="width:100%">
<thead>
<tr>
<th>Name</th>
<th>Data</th>
</tr>
</thead>
<tbody>
<tr>
<td>Adélaïde Nixon</td>
<td><font face="verdana">الفبای فارسی ۱۲۳۴</font></td>
</tr>
</tbody>
<tfoot>
<tr>
<th>Name</th>
<th>Data</th>
</tr>
</tfoot>
</table>
</div>
<script type="text/javascript">
$(document).ready(function() {
$('#example').DataTable({
dom: 'Bfrtip',
buttons: [{
extend: 'pdf',
customize: function ( doc ) {
processDoc(doc);
}
}]
});
});
function processDoc(doc) {
//
// https://pdfmake.github.io/docs/fonts/custom-fonts-client-side/
//
// Update pdfmake's global font list, using the fonts available in
// the customized vfs_fonts.js file (do NOT remove the Roboto default):
pdfMake.fonts = {
Roboto: {
normal: 'Roboto-Regular.ttf',
bold: 'Roboto-Medium.ttf',
italics: 'Roboto-Italic.ttf',
bolditalics: 'Roboto-MediumItalic.ttf'
},
arial: {
normal: 'arial.ttf',
bold: 'arial.ttf',
italics: 'arial.ttf',
bolditalics: 'arial.ttf'
}
};
// modify the PDF to use a different default font:
doc.defaultStyle.font = "arial";
var i = 1;
}
</script>
</body>
The DataTable as a Web Page
The above HTML produces the following web page:
The PDF File
When you click on the "Save as PDF" button, the PDF document looks like this:
How to Implement
As explained here, pdfMake uses the Roboto font by default. This font does not support Persian characters/script. To work around this, I changed the default font to Arial, which does provide support for Persian characters/script.
Please see the additional notes at the end regarding the use of Arial - another font may be more appropriate to avoid licensing issues.
To make this change I took the following steps:
I generated a new vfs_fonts.js file, containing the contents of an arial TTF file. I also refer to this new local vfs_fonts.js file, instead of the Cloudflare version.
The vfs_fonts.js file has the following structure:
this.pdfMake = this.pdfMake || {}; this.pdfMake.vfs = {
"arial.ttf": "AAEAA...MYXRu",
"another_one.ttf": "XXXX...XXXX"
};
Each of the "AAEAA...MYXRu" strings is the base64-encoded representation of the related font file.
To generate the string for your TTF file, you can use the utilities provided by pdfmake (see below), or you can use any base64 encoder. One example is dataurlmaker.
Paste the (very long) string generated by dataurlmaker into your vfs_fonts.js file. Do NOT include any preamble provided by dataurlmaker ("data:application/octet-stream;base64,"). Include only the base64 string itself.
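As an illustration of the file structure described above, here is a minimal sketch that builds the text of a vfs_fonts.js file from raw font bytes and strips a data URL preamble if one is present. It is written in Java using only the JDK; the class and method names (VfsFontsBuilder, build, stripDataUrlPreamble) are hypothetical and are not part of pdfmake, whose own tooling is described further down.

```java
import java.util.Base64;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; it only reproduces the vfs_fonts.js layout shown above.
public class VfsFontsBuilder {

    /** Removes a leading "data:...;base64," preamble, if any. */
    public static String stripDataUrlPreamble(String value) {
        int marker = value.indexOf("base64,");
        if (value.startsWith("data:") && marker >= 0) {
            return value.substring(marker + "base64,".length());
        }
        return value;
    }

    /** Builds the file text: one "name.ttf": "<base64>" entry per font. */
    public static String build(Map<String, byte[]> fonts) {
        StringJoiner entries = new StringJoiner(",\n");
        for (Map.Entry<String, byte[]> font : fonts.entrySet()) {
            String base64 = Base64.getEncoder().encodeToString(font.getValue());
            entries.add("  \"" + font.getKey() + "\": \"" + base64 + "\"");
        }
        return "this.pdfMake = this.pdfMake || {}; this.pdfMake.vfs = {\n"
                + entries + "\n};\n";
    }
}
```

The font bytes could come from java.nio.file.Files.readAllBytes on the TTF file, and the returned string would be saved as vfs_fonts.js.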
Alternatively...
Using the tools provided by pdfmake:
To generate this new vfs_fonts.js file, I followed the relevant instructions on this page.
(a) I already had npm installed.
(b) I ran npm install pdfmake
(c) I changed to the pdfmake installation directory:
C:\Users\<myUserID>\node_modules\pdfmake\
(d) I created the examples/fonts subdirectory in my pdfMake directory.
(e) I copied my Windows arial.ttf file into this new fonts directory.
(f) I ran npm install (from the pdfMake directory) to ensure all prerequisites modules were installed.
(g) I installed gulp using npm install gulp --global
(h) I ran gulp buildFonts to create a new build/vfs_fonts.js.
(i) I included this new build/vfs_fonts.js file in my web page code (as shown above).
After taking these steps, I was able to generate a PDF using the Arial font.
Update
Please read the comments provided by @anotherfred for some important notes:
Regarding the specific use of Arial (emphasis is mine):
Note that Arial's licence may forbid this. Fonts like Noto Sans are free international fonts but you have to carefully choose the version to get the languages you want.
You can use online tools such as Google Fonts and Font Squirrel to find fonts which match your language & character/glyph requirements.
Regarding how to reference your chosen font file(s):
Also, to avoid having to set the default font in datatables options, you can just name your key in pdfMake.fonts Roboto (whatever ttf files you actually use in it) and it will be used automatically.
It would be great if the following could be used out-of-the-box in a future version of DataTables (with an upgraded version of pdfmake):
You can also use a font url instead of vfs_fonts, but this requires a newer version of pdfMake than datatables suggest.

Can't get Indian Rupees symbol to show up in PDF generated with Flying Saucer

I'm trying a few different ways, but I can't get a pdf generated with Flying Saucer (from an html file) to show the unicode character for Indian Rupees - "₹"
This is what I have currently:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<style>
body {
font-family: Arial Unicode MS, Lucida Sans Unicode, Arial, verdana, arial, helvetica, sans-serif;
}
@font-face {
font-family: 'Arial Unicode MS';
src: url(arialunicodems.ttf);
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: UTF-8;
-fs-pdf-font-encoding: Identity-H;
font-weight: normal;
}
</style>
</head>
<body>
<p>We want to see a Indian Rupees symbol between the asterisks on one or more of these lines, in the PDF (if any of the symbols make it through to the PDF then we're good):</p>
<p>Using the glyph itself in the markup: * ₹ *</p>
<p>Using &#x20B9; in the markup: * ₹ *</p>
<p>Using &#8377; in the markup: * ₹ *</p>
</body>
</html>
which represents lots of different experiments, none of which have worked. The font file it refers to is sitting next to the html file version of the above.
The font itself seems to be loaded, in that the text in the pdf file looks like Arial. It's just missing the Rupees symbol. I don't know what else to do: I'm pulling in a unicode font, and the html file itself looks fine when viewed in the browser. When I print it out of Chrome it looks fine too, so I think the problem is definitely with Flying Saucer.
I'm using Flying Saucer as follows:
/usr/bin/java -Djava.awt.headless=true -cp .:$FS_PATH/acts_as_flying_saucer/lib/java/bin:$FS_PATH/acts_as_flying_saucer/lib/java/jar/minium.jar:$FS_PATH/acts_as_flying_saucer/lib/java/jar/itext-paulo-155.jar:$FS_PATH/acts_as_flying_saucer/lib/java/jar/core-renderer.jar:$FS_PATH/acts_as_flying_saucer/lib/java/jar/java-getopt-1.0.13.jar Xhtml2Pdf /home/max/font_test.html /home/max/font_test.pdf
Can anyone see if I'm doing anything wrong?
I'm answering my own question here in case anyone else makes the same mistake. The answer turned out to be really simple - it's not in the font! Turns out that the "₹" symbol was only invented in 2010, and so is not present in a lot of Unicode font files, including the one I used.
It worked in the browser because the browser (Chrome) automatically fell back to other fonts (without me explicitly asking it to) and found the glyph in DejaVu Sans, as it happens (the fallback for Linux Chromium).
I changed my code to use the older (but still acceptable) "₨" symbol, but a more proper fix would be to include a font that actually has the modern Rupees symbol.
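To check up front whether a font file actually contains a glyph, the JDK's java.awt.Font can be asked directly. The following is a minimal sketch; the class and method names (GlyphCoverageCheck, missing, missingInFont) are hypothetical, and it assumes the font is a TrueType file that the JDK can load.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical helper for checking glyph coverage before generating a PDF.
public class GlyphCoverageCheck {

    /** Lists the distinct code points in text that the predicate rejects. */
    public static List<String> missing(String text, IntPredicate canDisplay) {
        List<String> result = new ArrayList<>();
        text.codePoints()
            .distinct()
            .filter(cp -> !canDisplay.test(cp))
            .forEach(cp -> result.add(String.format("U+%04X", cp)));
        return result;
    }

    /** Loads a TrueType font file and reports code points it cannot display. */
    public static List<String> missingInFont(File ttf, String text)
            throws IOException, FontFormatException {
        Font font = Font.createFont(Font.TRUETYPE_FONT, ttf);
        return missing(text, font::canDisplay);
    }
}
```

Calling missingInFont with the font file and a string such as "\u20B9\u20A8" would list U+20B9 if the modern Rupee sign is absent from that font.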

Embed font in PDF rendering plugin in Grails

I want to embed 'HelveticaNeueLTCom-BdCn.ttf' in a PDF document. I'm using Grails rendering 0.4.4 Plugin to generate PDF file.
I tried following,
@font-face {
font-family: 'Helvetica';
src: url('${grailsApplication.config.grails.serverURL}/fonts/HelveticaNeueLTCom-BdCn.ttf');
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: Identity-H;
}
but it doesn't work.
Font embedding requires the steps below. This worked for me; try it and let me know how it goes.
The PdfRenderingService class inside the plugin should be edited to register the font, as below.
protected doRender(Map args, Document document, OutputStream outputStream)
{
def renderer = new ITextRenderer()
// add the real font path from the server to be deployed to.
//I have it in the assets folder of my project
def path=servletContext.getRealPath("/")+"assets/HelveticaNeueLTCom-BdCn.ttf"
ITextFontResolver fontResolver=renderer.getFontResolver();
//add the encoding and embedded types to the font
fontResolver.addFont(path,BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
configureRenderer(renderer)
renderer.setDocument(document, args.base)
renderer.layout()
renderer.createPDF(outputStream)
outputStream.close();
}
Add the below code in your template file
@font-face {
font-family: "Helvetica";
src: url("${grailsApplication.config.grails.serverURL}/assets/HelveticaNeueLTCom-BdCn.ttf") format("truetype"),
url("${grailsApplication.config.grails.serverURL}/assets/HelveticaNeueLTCom-BdCn.woff") format("woff"),
url("${grailsApplication.config.grails.serverURL}/assets/HelveticaNeueLTCom-BdCn.svg#HelveticaNeueLTCom-BdCn") format("svg");
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: Identity-H;
}
@font-face { src:url(${grailsApplication.config.app.serverUrl}/arialuni.ttf) ; -fs-pdf-font-embed: embed; -fs-pdf-font-encoding: Identity-H; }
This worked for me.
Probably the problem is that you have your url value surrounded by ' instead of ".
The difference between them is that, though in Groovy string literals can be made with both, only the ones surrounded by " create GString, which evaluates statements between ${}
This worked for me
@font-face {
src: url("path/to/KF-Kiran.ttf");
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: cp1250;
}
div {
font-family: 'KF-Kiran'; /* use the same name as the .ttf file */
}

epub.js not loading properly on IE11

I'm trying to load an epub on my page using the epub.js library and it's not working on IE 11; it works perfectly on Chrome and Firefox though.
I'm not getting a script error, I don't get a message in the console log, fiddler says all scripts (including zip.js and my epub) are downloaded properly.
It just doesn't load: the embedded iframe has a src="" property and an empty html body, as in the following snapshot.
Here is my html page content:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
<script src="content/epubjs/epub.js"></script>
<script src="content/epubjs/libs/zip.min.js"></script>
</head>
<body>
<span onclick="Book.prevPage();">Prev</span>
<span onclick="Book.nextPage();">Next</span>
<div style="height: 700px; border: 5px solid red" id="area"></div>
<script type="text/javascript">
EPUBJS.filePath = "content/epubjs/libs/";
</script>
<script type="text/javascript">
var Book = ePub("content/aliceDynamic.epub", {
version: 4,
restore: false, // Skips parsing epub contents, loading from localstorage instead
storage: false, // true (auto) or false (none) | override: 'ram', 'websqldatabase', 'indexeddb', 'filesystem'
spreads: false, // Displays two columns
fixedLayout: true, //-- Will turn off pagination
styles: {}, // Styles to be applied to epub
width: false,
height: '700px'
});
Book.renderTo("area");
</script>
</body>
</html>
I tried to play around with the options parameter, set things to false and true here and there but it didn't help.
It looks like it is a problem with the current version of epub.js and internet explorer 11. If you try and load the moby dick page you should see the same problem.
Try setting a break on all exceptions (even handled ones) in the javascript engine of IE, and you will see that the javascript throws an exception saying that "'XPathResult' is undefined".
A common recommendation for correcting this seems to be installing the wicked-good-xpath library to work around IE11's lack of XPath support. Install the library and initialize it before trying to load your epub.
If this doesn't correct your problem, you may have to wait until the issues are solved since you don't seem to be the only person who encounters it.

DXFilter is somehow still working in IE10?

This standalone example has a DXFilter to render a gradient, it renders in quirks mode. IE10 has 'show legacy filters' set to off, I see it in the 'internet' zone. I still see the gradient?
from: http://msdn.microsoft.com/en-us/library/ie/hh801215(v=vs.85).aspx
"DirectX-based Filters and Transitions (DX filters) are obsolete in Internet Explorer 10 for webpages in the Internet Zone. "
Why does this work?
<!-- Comment before Doctype to force quirks mode in IE6/7 -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><meta http-equiv="X-UA-COMPATIBLE" content="IE=5">
</head>
<style type="text/css" >
.SomeDiv
{
WIDTH: 50px;
HEIGHT: 50px;
FILTER: progid:DXImageTransform.Microsoft.Gradient(GradientType=1, StartColorStr='#00ff00', EndColorStr='#ff0000');
}
</style>
<div class='SomeDiv'>
Hi
</div>
</html>
Obsolete does not mean removed. In this case, there are two reasons:
The comment before the doctype triggers IE5 quirksmode
The site is running in the Intranet Zone or Trusted Sites Zone
If it is inconsistently appearing in the Internet Zone, this may be the reason:
End-users can change these settings (for these document modes only) by using Internet Options to change the security settings for the zone in question. Administrators can also use Group Policy.