docx4j html to pdf word-break issue - docx4j

<html>
<head>
<style>
p {
word-break: break-all;
}
</style>
</head>
<body style="width: 500px">
<p>
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa</p>
</body>
</html>
The HTML code looks like this ↑
This is how the HTML renders in the browser.
@Test
void contextLoads() throws Docx4JException, FileNotFoundException, MalformedURLException {
File file = new File("C:\\Users\\zx\\Desktop\\data2.html");
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
XHTMLImporterImpl XHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(file.toURI().toURL()));
Docx4J.toPDF(wordMLPackage, new FileOutputStream("C:\\Users\\zx\\Desktop\\3.pdf"));
}
The Java code looks like this ↑
This is how the PDF renders in the browser.
I want the PDF to apply the CSS style the way the browser does, and wrap the line.

The HTML importer that comes with docx4j can't handle a separate CSS file.
For styles to be considered, they should be inline on the HTML tags, in a style attribute.
As for inserting a line break and moving to the next line: in the PDF each line has to be a separate text box, so you have to create a paragraph for each line you want, by measuring how many characters fit on a single line.
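A minimal sketch of the "one paragraph per line" idea above, using only the Java standard library. The class name `ParagraphChunker`, the 60-character budget and the `margin: 0` style are assumptions made for illustration, not values from docx4j; the real budget depends on the font and page width. The sketch also assumes the text contains no characters that need HTML escaping (Java 11+ for `String.repeat`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of docx4j.
class ParagraphChunker {

    // Split text into pieces of at most maxChars characters each.
    static List<String> chunk(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < text.length(); start += maxChars) {
            int end = Math.min(text.length(), start + maxChars);
            parts.add(text.substring(start, end));
        }
        return parts;
    }

    // Emit one <p> per chunk, with the style inline on every tag.
    static String toParagraphs(String text, int maxChars, String inlineStyle) {
        StringBuilder html = new StringBuilder();
        for (String part : chunk(text, maxChars)) {
            html.append("<p style=\"").append(inlineStyle).append("\">")
                .append(part)
                .append("</p>\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // 60 is an assumed per-line character budget.
        String longWord = "a".repeat(130);
        System.out.print(toParagraphs(longWord, 60, "margin: 0"));
    }
}
```

The generated paragraphs would still need to be wrapped in a well-formed XHTML document before being handed to the importer.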

Related

Why won’t both of my html stuff work together

I am trying to decorate the background of a website I am building. For some reason either line works by itself, but when I add both lines only the top one works. How can I make both lines work together?
<body style=background-color:powderblue>
<body style=border-style:solid;border-color:red>
You can only have one <body> tag.
<body style="background-color:powderblue;border-style:solid;border-color:red">
Combine the styles into one attribute, or move them to a CSS file or a <style> block in your <head>:
body {
background-color:powderblue;
border-style:solid;
border-color:red
}

C# Selenium Webdriver (Firefox) iFrame does not allow text to be entered via sendKeys

I'm using latest Selenium Firefox (2.53.0)
Previously code was working when performing the following
1) Finding the iFrame by Xpath iframe class
IWebElement detailFrame = Driver_Lib.Instance.FindElement(By.XPath("//iframe[@class='cke_wysiwyg_frame cke_reset']"));
2) Switching to that frame by
Driver_Lib.Instance.SwitchTo().Frame(detailFrame);
3) finding the p tag within the iFrame by
IWebElement freeText = Driver_Lib.Instance.FindElement(By.TagName("p"));
4) Inserting a simple string to the iframe text box
freeText.SendKeys("this is some text");
5) switching from the iFrame back to the main contentwindow by
Driver_Lib.Instance.SwitchTo().DefaultContent();
Here is the code part from the application
<iframe class="cke_wysiwyg_frame cke_reset" frameborder="0" src="" style="width: 100%; height: 100%;" title="Rich Text Editor, ctl00_ctl00_MainContentPlaceHolder_PageContent_mlcEditor_CKEditor" aria-describedby="cke_61" tabindex="0" allowtransparency="true">
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<title data-cke-title="Rich Text Editor, ctl00_ctl00_MainContentPlaceHolder_PageContent_mlcEditor_CKEditor">Rich Text Editor, ctl00_ctl00_MainContentPlaceHolder_PageContent_mlcEditor_CKEditor</title>
<style data-cke-temp="1">
<link href="https://myUrl/contents.css" rel="stylesheet" type="text/css">
<style data-cke-temp="1">
</head>
<body class="cke_editable cke_editable_themed cke_contents_ltr cke_show_borders" contenteditable="true" spellcheck="false">
<p>
<br _moz_editor_bogus_node="TRUE">
</p>
</body>
</html>
</iframe>
The test I am running is a simple one, open up that page, insert some text, save.
It is not inserting the text into the iFrame. I am totally puzzled as to why.
Has anyone else found this issue at all?
Many thanks
I have removed the exception; it was a red herring.
The iFrame cannot have text entered into it.
Hi all, I've found the solution. Here is a summary of what was happening:
1) The iFrame was being located by XPath.
2) The SwitchTo() method placed focus in the detailFrame instance of IWebElement.
3) What was not happening: the p tag could not be located, as it was contained within a body element that has a CSS class.
The solution was staring me in the face the whole time! So simple!
I did this:
IWebElement detailFrame = Driver_Lib.Instance.FindElement(By.XPath("//iframe[@class='cke_wysiwyg_frame cke_reset']"));
Driver_Lib.Instance.SwitchTo().Frame(detailFrame);
IWebElement freeText = Driver_Lib.Instance.FindElement(By.TagName("body"));
freeText.SendKeys("This is a free text question created by Automation Smoke Test");
Driver_Lib.Instance.SwitchTo().DefaultContent();
So as you can see, the fix was simply to locate the first instance of the body tag instead of the p tag!

Selenium not able to find element by tag name "body" (only for IE)

HTML CODE
<html>
<head>
<body style="padding: 10px 25px; margin:0; left:0;right:0;top:0;bottom:0;position:absolute;font:14px 'robotoregular'; cursor:text; width: auto;">
<br _moz_editor_bogus_node="TRUE"/>
</body>
</html>
JAVA CODE
public TemplateOfNewLetter enterTextMessageToMessageField(String textMessage){
driver.switchTo().frame(0);
//driver.switchTo().frame(driver.findElement(By.tagName("iframe")));
//WebElement body = waitElementToBeClickable(By.cssSelector("html>body"));
WebElement body = waitElementToBeClickable(By.tagName("body"));
body.click();
body.sendKeys(textMessage);
driver.switchTo().defaultContent();
return this;
}
I tried the code above, but the issue still reproduces in Internet Explorer (the tests pass in FF and Chrome).
Please advise how I can enter text into the text message field.
You don't have a frame to switch to.
You can't send text to body.
Have a look at how to use Selenium at this link:
http://seleniumeasy.com/selenium-webdriver-tutorials

Is it possible to add a background to the entire height in pdf when it's converted with wkhtmltopdf?

Is it possible to add a background color to the entire height in pdf when it's converted with wkhtmltopdf?
This css rule:
html,body {
height: 100%;
backgorund-color: #ff0000;
}
doesn't work :)
The converted HTML is automatically generated. It has from 1 to 3 pages.
This works perfectly for me:
#test.html
<html>
<body style="background-color:#E6E6FA">
<h1>Hello world!</h1>
</body>
</html>
And run command -
wkhtmltopdf --margin-bottom 0 --margin-top 0 test.html test1.pdf
For more wkhtmltopdf options check out this manual

trouble using xhtml2pdf with unicode

I've been trying to convert Hebrew html files without success; the Hebrew characters show up in the output PDF as black rectangles regardless of any encoding I tried.
I tried some unicode test files included in the pisa distribution: pisa-3.0.33\test\test-unicode-all.html and \test-bidirectional-text.html . I ran xhtml2pdf from the command line both with and without --encoding utf-8. Same result: none of the non-Latin characters made it through.
Is this a fonts problem*? If the unicode test file works for you, was there anything you did to set it up?
*FWIW, at least some of these languages, including Hebrew, should work with Arial.
EDIT: Alternatively, if someone has pisa set up and could try converting the unicode test file above, I would be very grateful.
Inserting the following code into the HTML helped me:
<style>
@page {
size: a4;
margin: 0.5cm;
}
@font-face {
font-family: "Verdana";
src: url("verdana.ttf");
}
html {
font-family: Verdana;
font-size: 11pt;
}
</style>
In the url, instead of "verdana.ttf", you should put the absolute path to the font on your OS.
If anyone in the future tries, like me, to figure out how to PROPERLY create a PDF file that contains Hebrew using xhtml2pdf, here's what worked for me:
First thing: including the font settings as described here by @eviltrue in my HTML. This can be any font as long as it supports Hebrew characters; otherwise any Hebrew characters in the input HTML would simply appear as black rectangles in the PDF.
At the time of writing this answer, while it is possible to output Hebrew characters to PDF with xhtml2pdf, they are output in reverse order, i.e. שלום כיתה א
would be א התיכ םולש.
At this point I was stuck, but then I stumbled upon this SO answer:
https://stackoverflow.com/a/15449145/1918837
After installing the python-bidi package, here is an example of a complete solution (used in a python app):
from bidi import algorithm as bidialg
from xhtml2pdf import pisa
HTMLINPUT = """
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<style>
@page {
size: a4;
margin: 1cm;
}
@font-face {
font-family: DejaVu;
src: url(my_fonts_dir/DejaVuSans.ttf);
}
html {
font-family: DejaVu;
font-size: 11pt;
}
</style>
</head>
<body>
<div>Something in English - משהו בעברית</div>
</body>
</html>
"""
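# "output.pdf" is a placeholder path; the original snippet left the output file undefined
with open("output.pdf", "wb") as output_file:
    # base_dir="L" keeps the "<" and ">" signs in HTML tags from being
    # flipped by the bidi algorithm
    pdf = pisa.CreatePDF(bidialg.get_display(HTMLINPUT, base_dir="L"), output_file)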
The nice thing about the bidi algorithm is that you can have mixed RTL and LTR languages in the same line (like in the HTML example above) and still have a correctly formatted result.
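For readers on the JVM, the same logical-to-visual reordering can be sketched with `java.text.Bidi` from the JDK. This is not part of the answer above and is not python-bidi: the class name `VisualOrderSketch` is hypothetical, the base direction is forced to left-to-right (mirroring `base_dir="L"`), and the sketch does not mirror paired characters such as brackets, so treat it as an illustration for plain text rather than a full bidi implementation.

```java
import java.text.Bidi;

// Hypothetical helper illustrating visual reordering of mixed LTR/RTL text.
class VisualOrderSketch {

    // Convert logical-order text to visual order with an LTR base direction,
    // for renderers that simply draw characters left to right.
    static String toVisual(String logical) {
        if (logical.isEmpty()) {
            return logical;
        }
        Bidi bidi = new Bidi(logical, Bidi.DIRECTION_LEFT_TO_RIGHT);
        int runCount = bidi.getRunCount();
        byte[] levels = new byte[runCount];
        String[] runs = new String[runCount];
        for (int i = 0; i < runCount; i++) {
            levels[i] = (byte) bidi.getRunLevel(i);
            String run = logical.substring(bidi.getRunStart(i), bidi.getRunLimit(i));
            // Odd embedding levels are right-to-left: reverse their characters.
            runs[i] = (levels[i] % 2 == 1)
                    ? new StringBuilder(run).reverse().toString()
                    : run;
        }
        // Put the runs themselves into visual order.
        Bidi.reorderVisually(levels, 0, runs, 0, runCount);
        StringBuilder visual = new StringBuilder();
        for (String run : runs) {
            visual.append(run);
        }
        return visual.toString();
    }

    public static void main(String[] args) {
        System.out.println(toVisual("Something in English - \u05DE\u05E9\u05D4\u05D5"));
    }
}
```

Left-to-right runs pass through untouched, which is why English and Hebrew can share a line.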
EDIT:
The best way to go now is definitely using wkhtmltopdf