Is there any way to improve XHTML-to-PDF rendering speed in Flying Saucer with OpenPDF? I am using version 9.1.20 of the flying-saucer-pdf-openpdf library. A simple XML document took almost 5 seconds; the same document took 0.5 seconds with the flying-saucer-pdf library.
org.w3c.dom.Document doc = parseXMLContent(xhtmlContent, validate);
long start = System.currentTimeMillis();
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(doc, null); // no base URL needed for this document
renderer.layout();
renderer.createPDF(out);
renderer.finishPDF();
log.debug("xhtml2pdf took " + (System.currentTimeMillis() - start) / 1000.0 + " seconds");
The XML input:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<h1>TEST</h1>
</body>
</html>
I have a problem matching my error response when it comes back as HTML.
I tried this:
match $.errors == '#present'
match $.errors == response
Errors:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>Error: Unexpected object!</pre>
</body>
</html>
I'm doing it like this, and the scenario gets stopped:
When method post
* if (responseStatus == 500) karate.abort()
Then status 200
* match $.errors == '#notpresent'
How can I match the response as HTML text?
Sorry, Karate only works with well-formed XML. You can try to replace content in the HTML to clean it up, do string contains matches, or write some JS or Java code for custom checks.
This will work (after removing the <meta> tag, which is not well-formed):
* def response =
"""
<!DOCTYPE html>
<html lang="en">
<head>
<title>Error</title>
</head>
<body>
<pre>Error: Unexpected object!</pre>
</body>
</html>
"""
* match //pre == 'Error: Unexpected object!'
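If cleaning up the HTML is not practical, a plain string contains match avoids XML parsing entirely. A minimal sketch, assuming the error body comes back as a raw string:

* string raw = response
* match raw contains 'Error: Unexpected object!'

For strings, contains does a substring check, so the malformed <meta> tag no longer matters.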
I have an HTML page with images, tables, and some styling from Bootstrap 4. I tried to convert the page to PDF using the JsReportMVCService, but the PDF does not load the proper CSS from Bootstrap.
HTML CONTENT
<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width" />
<title>WeekelyReport</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css">
</head>
<body>
<div class="jumbotron">
<h1> Hello John Doe,</h1>
<p>
This is a generic email about something.<br />
<br />
</p>
</div>
</body>
</html>
ASP.NET CORE IMPLEMENTATION
var generatedFile = await GeneratePDFAsync(htmlContent);
File.WriteAllBytes(@"C:\temp\hello.pdf", generatedFile);
async Task<byte[]> GeneratePDFAsync(string htmlContent)
{
var report = await JsReportMVCService.RenderAsync(new RenderRequest()
{
Template = new Template
{
Content = htmlContent,
Engine = Engine.None,
Recipe = Recipe.ChromePdf
}
});
using (var memoryStream = new MemoryStream())
{
await report.Content.CopyToAsync(memoryStream);
return memoryStream.ToArray();
}
}
(Screenshot of how the PDF looks after the conversion omitted.)
Is it possible to convert to PDF with the same Bootstrap 4 layout, or am I missing something during the conversion?
PDF printing uses the print media type, and Bootstrap has quite different styles for print. This makes the PDF look different from the HTML page, but the same as if you had printed it. I would generally not recommend using a responsive CSS framework like Bootstrap for printing a static PDF, but that is of course your choice.
To make your example look the same in the PDF, you just need to change the media type in the Chrome settings:
var report = await JsReportMVCService.RenderAsync(new RenderRequest()
{
Template = new Template
{
Content = htmlContent,
Engine = Engine.None,
Recipe = Recipe.ChromePdf,
Chrome = new Chrome {
MediaType = MediaType.Screen,
PrintBackground = true
}
}
});
Make sure you have the latest jsreport.Types (2.2.2 at the time of writing).
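As a side note, the MemoryStream round-trip in GeneratePDFAsync is only needed when you want a byte array; if the PDF goes straight to disk, you can copy the report stream directly. A minimal sketch (same report object as above; the path is illustrative):

// Stream the rendered PDF directly to a file instead of buffering the
// whole document in a MemoryStream first.
using (var fileStream = System.IO.File.Create(@"C:\temp\hello.pdf"))
{
    await report.Content.CopyToAsync(fileStream);
}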
I'd like to view, in the browser, XHTML files that were converted from PDF files using MarkLogic's Content Processing Framework (pipeline: PDF Conversion (Page Layout)).
I have confirmed that the XHTML file can be displayed in the browser via the HTTP server with the code below, but the links to the CSS and JPEG files referenced from the XHTML are broken, so the page does not display correctly.
Does anyone know how to solve this problem?
My code (index.xqy):
declare variable $uri := "/aaa/bbb/ccc_pdf.xhtml";
xdmp:set-response-content-type("text/html; charset=utf-8"),
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<iframe src="get-file.xqy?uri={xdmp:url-encode($uri)}">
</iframe>
</body>
</html>
My code (get-file.xqy):
let $uri := xdmp:get-request-field("uri")
let $mimetype := xdmp:uri-content-type($uri)
return
if(fn:doc($uri))
then (
xdmp:set-response-content-type($mimetype),
fn:doc($uri)
)
else ()
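A likely cause, though not confirmed in the question, is that the converted XHTML refers to its CSS and JPEG parts with relative URIs, which the browser resolves against get-file.xqy on the app server rather than against the document's directory in the database. One hedged sketch of a fix is to rewrite those links on the way out so they are also served through get-file.xqy; local:rewrite below is an illustrative helper, not a MarkLogic builtin:

declare function local:rewrite($node as node(), $dir as xs:string)
{
  typeswitch ($node)
    case element() return
      element { fn:node-name($node) } {
        for $att in $node/@*
        return
          if (fn:local-name($att) = ("href", "src") and fn:not(fn:contains($att, "://")))
          then attribute { fn:node-name($att) }
               { fn:concat("get-file.xqy?uri=", xdmp:url-encode(fn:concat($dir, $att))) }
          else $att,
        for $child in $node/node()
        return local:rewrite($child, $dir)
      }
    default return $node
};

(: In get-file.xqy, serve XHTML through the rewriter; $dir is the database
   directory that holds the converted document and its parts. :)
local:rewrite(fn:doc($uri)/*, fn:replace($uri, "[^/]+$", ""))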
Is there any way to detect whether an image has failed to load or is broken in a WebBrowser control? I am loading HTML from a file. Here is some sample HTML:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META content="text/html; charset=unicode" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 11.00.10586.589">
</HEAD>
<BODY>
<A href="https://web.archive.org/web/20120124023601/http://www.flatfeets.com/wp-content/uploads/2012/01/shoes-for-flat-feet.jpg">
<IMG title="shoes for flat feet" class="alignleft size-medium wp-image-18" alt="" src="https://web.archive.org/web/20120124023601im_/http://www.flatfeets.com/wp-content/uploads/2012/01/shoes-for-flat-feet-300x238.jpg">
</A>
</BODY>
</HTML>
And simply load this into the WebBrowser:
webbrowser1.DocumentText = thehtml
I would just like to be able to detect whether or not the image has loaded properly. This should work for all images on the page.
You could create a separate WebClient request for each image in the HTML file and then see if any return an HTTP error code.
You would first have to parse the HTML and make a list of all the image URLs. I would suggest using a package like HTML Agility Pack to easily parse out the image URLs (see the sketch below). Then you could use this code to identify any bad paths:
using (WebClient requester = new WebClient())
{
    foreach (string url in urls)
    {
        try
        {
            byte[] imageBytes = requester.DownloadData(url);
        }
        catch (Exception ex)
        {
            // The image file doesn't exist or couldn't be downloaded;
            // record the bad url here.
        }
    }
}
You can also convert the byte array to an Image and check that it is RGB-encoded, since that is the only encoding that is reliably displayed in a web browser.
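For the parsing step, here is a minimal sketch using HTML Agility Pack (the HtmlAgilityPack NuGet package; GetImageUrls is just an illustrative name) that produces the urls list used in the loop above:

using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Illustrative helper: extract the src attribute of every <img> tag.
static List<string> GetImageUrls(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
    // SelectNodes returns null when no <img> elements match.
    var imgs = doc.DocumentNode.SelectNodes("//img[@src]");
    return imgs == null
        ? new List<string>()
        : imgs.Select(img => img.GetAttributeValue("src", ""))
              .Where(src => src.Length > 0)
              .ToList();
}

Calling GetImageUrls(thehtml) before the WebClient loop gives you the urls collection to check.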
This standalone example uses a DX filter to render a gradient, and it renders in quirks mode. IE10 has 'show legacy filters' set to off, and I am viewing the page in the Internet zone, yet I still see the gradient.
from: http://msdn.microsoft.com/en-us/library/ie/hh801215(v=vs.85).aspx
"DirectX-based Filters and Transitions (DX filters) are obsolete in Internet Explorer 10 for webpages in the Internet Zone. "
Why does this work?
<!-- Comment before Doctype to force quirks mode in IE6/7 -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta http-equiv="X-UA-COMPATIBLE" content="IE=5">
<style type="text/css">
.SomeDiv
{
width: 50px;
height: 50px;
filter: progid:DXImageTransform.Microsoft.Gradient(GradientType=1, StartColorStr='#00ff00', EndColorStr='#ff0000');
}
</style>
</head>
<body>
<div class='SomeDiv'>
Hi
</div>
</body>
</html>
Obsolete does not mean removed. In this case, there are two reasons:
The comment before the doctype triggers IE5 quirks mode.
The site is running in the Intranet Zone or Trusted Sites Zone.
If it appears inconsistently in the Internet Zone, there are two possibilities: end-users can change these settings (for these document modes only) by using Internet Options to change the security settings for the zone in question, and administrators can do the same via Group Policy.
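To confirm which mode the page actually rendered in, you can read IE's document.documentMode from script; a minimal sketch to drop into the page:

<script type="text/javascript">
// IE-only: documentMode reports 5 in IE5 quirks mode and 10 in IE10
// standards mode; compatMode reports "BackCompat" in quirks mode.
alert("documentMode: " + document.documentMode +
      ", compatMode: " + document.compatMode);
</script>

If the alert shows 5, the page is in quirks mode and the legacy-filter block does not apply, which matches the two reasons above.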