trouble using xhtml2pdf with unicode - pdf

I've been trying to convert Hebrew HTML files to PDF without success: the Hebrew characters show up in the output PDF as black rectangles regardless of which encoding I try.
I also tried some Unicode test files included in the pisa distribution: pisa-3.0.33\test\test-unicode-all.html and \test-bidirectional-text.html. I ran xhtml2pdf from the command line both with and without --encoding utf-8. Same result: none of the non-Latin characters made it through.
Is this a fonts problem*? If the unicode test file works for you, was there anything you did to set it up?
*FWIW, at least some of these languages, including Hebrew, should work with Arial.
EDIT: Alternatively, if someone has pisa set up and could try converting the unicode test file above, I would be very grateful.

Inserting the following code into the HTML helped me:
<style>
@page {
size: a4;
margin: 0.5cm;
}
@font-face {
font-family: "Verdana";
src: url("verdana.ttf");
}
html {
font-family: Verdana;
font-size: 11pt;
}
</style>
In url(), replace "verdana.ttf" with the absolute path to the font file on your OS.
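As a rough sketch of that advice, the font rule can be generated with the absolute path filled in, so the HTML does not depend on the working directory. The helper name `font_face_css` is hypothetical, and whether xhtml2pdf accepts a POSIX-style path on every platform is an assumption to verify on your system.

```python
from pathlib import Path


def font_face_css(family, font_file):
    """Return an @font-face rule whose src points at an absolute font path."""
    # resolve() makes the path absolute; the file does not need to exist for this step.
    absolute = Path(font_file).resolve()
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("{absolute.as_posix()}");\n'
        "}\n"
    )
```

The returned string can be placed inside the `<style>` block in place of the hand-written rule.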

If anyone in the future tries, like me, to figure out how to PROPERLY create a PDF file that contains Hebrew using xhtml2pdf, here's what worked for me:
First, include the font settings described here by @eviltrue in the HTML. This can be any font as long as it supports Hebrew characters; otherwise, any Hebrew characters in the input HTML simply appear as black rectangles in the PDF.
At the time of writing, xhtml2pdf can output Hebrew characters to PDF, but it outputs them in reverse order, i.e. שלום כיתה א would be א התיכ םולש.
At this point I was stuck, but then I stumbled upon this SO answer:
https://stackoverflow.com/a/15449145/1918837
After installing the python-bidi package, here is an example of a complete solution (used in a python app):
from bidi import algorithm as bidialg
from xhtml2pdf import pisa
HTMLINPUT = """
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<style>
@page {
size: a4;
margin: 1cm;
}
@font-face {
font-family: DejaVu;
src: url(my_fonts_dir/DejaVuSans.ttf);
}
html {
font-family: DejaVu;
font-size: 11pt;
}
</style>
</head>
<body>
<div>Something in English - משהו בעברית</div>
</body>
</html>
"""
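# base_dir="L" keeps the "<" and ">" of the HTML tags from being flipped by the bidi algorithm.
# "output.pdf" is a placeholder path; the PDF is written to the open binary file.
with open("output.pdf", "wb") as output_file:
    pdf = pisa.CreatePDF(bidialg.get_display(HTMLINPUT, base_dir="L"), output_file)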
The nice thing about the bidi algorithm is that you can have mixed RTL and LTR languages in the same line (like in the HTML example above) and still have a correctly formatted result.
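A variation on the approach above, offered as a sketch rather than the answer's method: instead of passing the whole document through the bidi algorithm, reorder only the text between tags so the markup is never touched. `TAG_PATTERN` and `reorder_text_nodes` are hypothetical names, and the simple regex assumes that no "<" or ">" appears inside attribute values or text.

```python
import re

# Matches one HTML tag; assumes no "<" or ">" inside attribute values.
TAG_PATTERN = re.compile(r"(<[^>]+>)")


def reorder_text_nodes(html, reorder):
    """Apply `reorder` to the text between tags and leave the tags as they are."""
    # Splitting on a capturing group keeps the tags in the resulting list.
    parts = TAG_PATTERN.split(html)
    return "".join(
        part if TAG_PATTERN.fullmatch(part) else reorder(part)
        for part in parts
    )


# Intended use with python-bidi installed (not executed here):
#   from bidi.algorithm import get_display
#   display_html = reorder_text_nodes(HTMLINPUT, get_display)
```

Passing the reordering function as an argument keeps the helper independent of python-bidi.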
EDIT:
The best way to go now is definitely using wkhtmltopdf

Related

docx4j html to pdf word-break issue

<html>
<head>
<style>
p {
word-break: break-all;
}
</style>
</head>
<body style="width: 500px">
<p>
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa</p>
</body>
</html>
The HTML code is shown above.
This is how the HTML renders in the browser: [Click to view]
@Test
void contextLoads() throws Docx4JException, FileNotFoundException, MalformedURLException {
File file = new File("C:\\Users\\zx\\Desktop\\data2.html");
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
XHTMLImporterImpl XHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(file.toURI().toURL()));
Docx4J.toPDF(wordMLPackage, new FileOutputStream("C:\\Users\\zx\\Desktop\\3.pdf"));
}
The Java code is shown above.
This is how the resulting PDF looks in the browser: [Click to view]
I want the PDF to apply the CSS style as shown in the browser rendering, and wrap the line.
The HTML importer that comes with docx4j can't handle a separate CSS file.
For styles to be applied, they should be inline on the HTML tags, in a style attribute.
To force a line break in the PDF, each line has to be its own text box, so you have to create one paragraph per line, working out how many characters fit on a single line.
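The paragraph-per-line idea can be sketched as a preprocessing step on the HTML before it reaches the importer. This is written in Python purely for illustration; `split_into_paragraphs` is a hypothetical helper, and `max_chars` is an assumed value you would have to measure for your page width and font.

```python
from html import escape


def split_into_paragraphs(text, max_chars):
    """Cut `text` into pieces of at most `max_chars` and wrap each piece in <p>."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    # Fixed-width slices: suitable for an unbroken string like the one in the question.
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    return "\n".join(f"<p>{escape(chunk)}</p>" for chunk in chunks)
```

A fixed character count only approximates the line width with proportional fonts, so treat the result as a starting point.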

Why won’t both of my html stuff work together

I am trying to decorate the background of a website I am building. Either of the lines below works by itself, but when I add both, only the top one takes effect. How can I make both lines work together?
<body style=background-color:powderblue>
<body style=border-style:solid;border-color:red>
You can only have one <body> tag.
<body style="background-color:powderblue;border-style:solid;border-color:red">
Combine the styles into one attribute, as above, or move them to a CSS file or a <style> block in your <head>:
body {
background-color:powderblue;
border-style:solid;
border-color:red
}

Rotate text (90º degree - vertical rotation) in Odoo 10 QWeb PDF report

I need to rotate text by 90º (to display it vertically) in a custom QWeb PDF report.
Could someone paste a specific CSS and HTML example to do so?
(Odoo 10)
Thanks
Here is an example that works in Odoo 11:
<div style="transform: rotate(90deg); -webkit-transform: rotate(90deg);">Rotated Text</div>
I had a lot of trouble with this myself, because transform: rotate(90deg); rotated the text in the preview but not in the printed PDF. The version of wkhtmltopdf used in Odoo 11 does not support this CSS property unless it is prefixed (wkhtmltopdf uses a WebKit rendering engine).
If this does not work in Odoo 10 or earlier, you will need to see whether you can upgrade Odoo, or at least update wkhtmltopdf to a version that supports that property.
Hello usk70,
Definition and Usage
The transform property applies a 2D or 3D transformation to an element. This property allows you to rotate, scale, move, skew, etc., elements.
Syntax
transform: none|transform-functions|initial|inherit;
Property Values
rotate(angle) : Defines a 2D rotation, the angle is specified in the parameter
For Example,
Here is an example using HTML and CSS3; try this code in your Odoo 10 QWeb PDF report.
<!DOCTYPE html>
<html>
<head>
<style>
div {
width: 200px;
height: 100px;
background-color: yellow;
/* Rotate div */
-ms-transform: rotate(90deg); /* IE 9 */
-webkit-transform: rotate(90deg); /* Chrome, Safari, Opera */
transform: rotate(90deg);
}
</style>
</head>
<body>
<div>Hello </div>
<br>
<p><b>Note:</b> Internet Explorer 8 and earlier versions do not support the transform property.</p>
<p><b>Note:</b> Internet Explorer 9 supports an alternative, the -ms-transform property. Newer versions of IE support the transform property (do not need the ms prefix).</p>
<p><b>Note:</b> Chrome, Safari and Opera supports an alternative, the -webkit-transform property.</p>
</body>
</html>
I hope my answer is helpful.
If you have any questions, please comment.

Is it possible to add a background to the entire height in pdf when it's converted with wkhtmltopdf?

Is it possible to add a background color to the entire height in pdf when it's converted with wkhtmltopdf?
This css rule:
html,body {
height: 100%;
backgorund-color: #ff0000;
}
doesn't work :)
The converted HTML is automatically generated. It has from 1 to 3 pages.
This works perfectly for me:
#test.html
<html>
<body style="background-color:#E6E6FA">
<h1>Hello world!</h1>
</body>
</html>
Then run this command:
wkhtmltopdf --margin-bottom 0 --margin-top 0 test.html test1.pdf
For more wkhtmltopdf options check out this manual
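If the conversion is driven from a script, the same command can be assembled and run with `subprocess`. This is a minimal sketch: the function names are hypothetical, only the two margin flags from the command above are used, and it assumes `wkhtmltopdf` is on the PATH.

```python
import subprocess


def wkhtmltopdf_command(source_html, output_pdf):
    """Build the argument list for the zero-margin conversion shown above."""
    return [
        "wkhtmltopdf",
        "--margin-bottom", "0",
        "--margin-top", "0",
        source_html,
        output_pdf,
    ]


def convert(source_html, output_pdf):
    """Run wkhtmltopdf; raises CalledProcessError if it exits with an error."""
    subprocess.run(wkhtmltopdf_command(source_html, output_pdf), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.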

Rendering issue for combined font (Japanese & English ) in PDF using cfdocument

I have big trouble with "Combined Fonts" (Japanese & English).
I have to create a PDF document from HTML content that is shown on my website. For that I used <cfdocument> to generate the PDF from the HTML content. The content includes both Japanese and English text, and it appears in a different font in the generated PDF than on my website. The issue occurs only in sections that combine Japanese and English.
The requirement is:
For English content, the font should be Verdana.
For Japanese content, the font should be Simson.
I have implemented the same with Korean, Chinese, French and it's working.
To output the special characters, I added <cfprocessingDirective pageEncoding="utf-8"> above the code, but the sections that mix English and Japanese still come out in the wrong font.
The code I tried is given below:
<cfcontent type="application/pdf">
<cfheader name="Content-Disposition" value="attachment;filename=test.pdf">
<cfprocessingdirective pageencoding="utf-8">
<cfdocument format="PDF" localurl="yes" marginTop=".25" marginLeft=".25" marginRight=".25" marginBottom=".25" pageType="custom" pageWidth="8.5" pageHeight="10.2">
<cfoutput>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>PDF Export Example</title>
<style>
body { font-family: Verdana; }
h1 { font-size: 14px; }
p { font-size: 12px; line-height: 1.25em; margin-left:20px;}
</style>
</head>
<body>
<h1>PDF Export Example Combined Japanese & English</h1>
<p>This is an japanese with english example
日本人は単純な音素配列論で膠着、モーラ·タイミングの言語、純粋な母音システム、
音素の母音と子音の長さ、および語彙的に重要なピッチアクセント。語順は通常、粒子が言葉の文法的機能をマ
ーキング対象オブジェクトと動詞であり、文の構造は、トピック·コメントです。文末粒子は、感情的または強調の影響を追加したり、
質問を作るために使用されます。名詞は文法的に番号や性別を持たず、何の記事はありません。動詞は主に緊張し、音声ではなく、
人のために、コンジュゲートされる。形容詞の日本の同等物は、また、結合している。日本人は動詞の形や語彙、話者の相対的な地位、
リスナーおよび掲げる者を示すと敬語の複雑なシステムを持っています。This is an example.
</p>
<h1>PDF Export English Example</h1>
<p>This is an example.
</p>
</body>
</html>
</cfoutput>
</cfdocument>
What else should I do to fix this problem?
Thank you.
Based on my results and research, there is an issue with PDF style rendering when English is combined with Japanese content.
I finally found a solution:
1. Insert a space between the Japanese and the English words.
2. Iterate over the whole string, treating it as a space-delimited list.
3. Use a regular expression to tell the English words apart from the Japanese words.
4. Apply a separate style to the English words by wrapping them in a span (or any other) tag.
I don't know whether this is the proper solution for this issue. This is what I have done for solving the issue.
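The wrapping step described above can be sketched as follows, in Python for illustration rather than CFML. It is a simplification of the steps: it matches runs of ASCII letters and digits directly instead of iterating over a space-delimited list. `ENGLISH_RUN` and `wrap_english` are hypothetical names, the Verdana style comes from the requirement in the question, and the input is assumed to be plain text without existing markup.

```python
import re

# A run of ASCII words, optionally joined by spaces or basic punctuation.
ENGLISH_RUN = re.compile(r"[A-Za-z0-9]+(?:[ .,'-]+[A-Za-z0-9]+)*")


def wrap_english(text, style="font-family: Verdana;"):
    """Wrap each English run in a span so it can be styled separately."""
    return ENGLISH_RUN.sub(
        lambda match: f'<span style="{style}">{match.group(0)}</span>',
        text,
    )
```

For example, `wrap_english("日本語 English テスト")` leaves the Japanese text as it is and wraps only `English` in the span.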