wkhtmltopdf does not apply table-header-group css style - webkit

I'm developing an application that converts a web page containing a 1000 elements table to pdf.
I'm trying the wkhtmltopdf version 0.11.0 rc2, the resulting pdf looks great but... the column headers are not repeated on the pages and are visible only on the first page.
This seems to be a problem affecting also chromium engine, see http://code.google.com/p/chromium/issues/detail?id=24826
Does anyone have a solution for this problem?

Related

TinyMCE 5 - large images pasted via Safari do not render correctly

We are running TinyMCE version 5.4.1 with various options including:
paste_data_images: true
powerpaste_allow_local_image: true
When we drag & drop (or paste) in smaller images (400px X 400px) everything seems to work fine. The Base64 encoding is saved to the database and the image is rendered from all browsers, Chrome, Firefox and Safari.
However, when we paste in a larger image (1920px x 1081px) the image is only saved and rendered correctly in Chrome and Firefox. In Safari the Base64 encoding is saved with all lowercase characters. Therefore it doesn't render when attempting to view it. Has anyone else experienced this?
I have searched here as well as on the TinyMCE website but don't see anything mentioning this behavior. We will eventually attempt to move away from this Base64 implementation as it's no longer recommended but it's what we have for the time being so I'm just trying to address this issue.
When the page loads, its' elements can do so in parallel. But when the browser sees the base64 image, it blocks the page from loading until this image is rendered. Thus, inserting large images into the page as base64 is certainly not a good practice - it may slow down page loads and worsen the UX.
To fix this problem and maybe several other issues, utilizing the automatic_uploads option is highly recommended. It will upload pasted images on the server instead of converting them to base64. Here is the example of the PHP upload handler that will upload images and give their URLs back to TinyMCE.
Concerning the issue with Safari, some minimal reproducible example would be very useful.
I should also mention that PowerPaste is a premium feature that will not work with TinyMCE opensource. If you are using the paid version of TinyMCE, you can create a support ticket.

Apache FOP - Scrolling in PDF possible?

I'm using Apache FOP to generate a PDF through XML and XSL-FO. I have a cell in my generated PDF that I need to be able to scroll through if the content overflows it. XSL-FO has an overflow="scroll" feature, but based on my research on the topic it seems that Apache FOP does not support this option.
For example, here is a scrollable region in a PDF used by a large CAD company that I need to replicate:
Is there any way to enable this feature in Apache FOP? Is it possible to enable it in the source code (I haven't been able to find a way to do so)? Any other ways to tackle this issue?
No, it isn't possible.
From the FO perspective:
In the XSL-FO Recommendation the scroll value for the property overflow comes from the corresponding CSS2 definition, which includes this clarification:
When this value is specified and the target medium is "print", overflowing content should be printed.
As the PDF output is a print-oriented medium, I read this as a confirmation that FOP is correct in printing the overflowing content.
From the PDF perspective:
In the PDF Reference 6th edition, a search for the word "scroll" returns results referring either to the scrolling bars in the user interface or in interactive forms (text fields, list boxes, combo boxes).
There is not, or at least I could not find it, a "static text object, but with scrolling bars" feature (which is probably sensible for a print-oriented format), so FOP cannot create it in the PDF output file, not even modifying the source code.
A second look at your comment and the screenshot you included made me think it could be an example of the 3D Artwork feature of the PDF format, a feature I didn't know of before (and I still know nothing besides its name). According to the reference:
Specific views of 3D artwork can be specified, including a default view that is displayed initially and other views that can be selected. Views can have names that can be presented in a user interface.
So, I think your screenshot shows the different views associated to a 3D object; it is not a general-purpose feature that could be used to provide scrollable text.
Well, it could be possible ...
It is possible but as far as I know not with Apache FOP. Without seeing the PDF in question and guessing from the screen shot, it looks like a Flash widget inserted into the PDF. This in PDF terms is a RichMedia annotation (requires PDF version 1.7 with extensions) in which you can insert the Flash widget as well as other controlling files (like XML, other images to display, etc.) and relate them together.
AFAIK, only RenderX XEP (whom I work for) supports such RichMedia annotations inserted into PDF via XSL FO through the rx:rich-media-object extension documented here: http://www.renderx.com/reference.html#Rich Media
I believe, the only viewer that supports PDF with RichMedia annotations is Adobe Reader so it is required to view such a file. Here is a sample that includes a few interactive flash widgets, some interactive charts all within a few page PDF that was generated long ago. NOTE: I am sure some of the links in the document do not go anywhere, it was for a trade show many years ago. Remember, you would need to download this file and view in Adobe Reader and have flash player installed to see it function.
http://www.cloudformatter.com/Resources/Samples/RichMedia.pdf
You cannot use common PDF browser-based viewers like Chrome or Firefox as they do not support this type of annotation.
A screenshot of page one here shows an interactive, scrolling widget. Page 4 contains a widget similar to what you show in your example.
Page 4 scrolling widget very similar to your request:
The widget on the last page is created using a scroller SWF that takes parameters that are the images and setup/configuration files that are XML. The RenderX extension object takes these as parameters and embeds all of them in the document for the interactive flash widget so that it is totally self-contai9ned in the PDF. The XSL FO to do this is:
<rx:rich-media-object name="Sample HTML Widget" scaling="non-uniform" width="611.92pt"
height="74.99pt" content-width="scale-to-fit" src="url('rx-scroller\dockmenu.swf')"
transparency="true" activate-condition="page_visible">
<rx:flash-var name="setupXML" value="rx-dock-settings.xml"/>
<rx:flash-var name="contentXML" value="rx-dock-contents.xml"/>
<rx:rich-media-resource name="rx-dock-settings.xml"
src="url('rx-scroller\rx-dock-settings.xml')"/>
<rx:rich-media-resource name="rx-dock-contents.xml"
src="url('rx-scroller\rx-dock-contents.xml')"/>
<rx:rich-media-resource name="style.css" src="url('rx-scroller\css\style.css')"/>
<rx:rich-media-resource name="customer1.png" src="url('rx-scroller\images\customer1.png')"/>
<rx:rich-media-resource name="customer2.png" src="url('rx-scroller\images\customer2.png')"/>
<rx:rich-media-resource name="customer3.png" src="url('rx-scroller\images\customer3.png')"/>
<rx:rich-media-resource name="customer4.png" src="url('rx-scroller\images\customer4.png')"/>
<rx:rich-media-resource name="customer5.png" src="url('rx-scroller\images\customer5.png')"/>
<rx:rich-media-resource name="customer6.png" src="url('rx-scroller\images\customer6.png')"/>
</rx:rich-media-object>
And note that many things that are in the flash would work, like links and such. It is just a pure, interactive flash inserted into PDF as the container.
Indeed it looks like this is not possible to achieve through FOP.
Continuing to dig around for a few days, however, I did find a clever post-processing alternative that is also free, essentially embedding a PDF inside of another PDF using the LaTeX animate package.
A drawback to this method is that it is not possible to embed links inside of the scrollable region, which is a major issue for me. But the method does enable inserting a scrollable region inside of an existing PDF and got me very close to what I was trying to achieve.

MediaWiki API, mobileview

I am trying to get a mobile suitable version of the page with the query http://en.wikipedia.org/w/api.php?mobileformat=&noimages=&format=json&action=parse&prop=text&page=Snow. I expected that the page content would be like on the mobile version of wikipedia, but I am still receiving the desktop version with wide tables etc. However the result contains no images.
Width is controlled by style sheets which you'll have to add yourself. There are no images because you specified the no images parameter.

ABCPdf 8.1 webpage to PDF image compression issue

I've just recently upgraded from ABCPdf 6 to ABCPdf 8.1.
I've got this webpage that I need to convert into PDF format.
The whole reason for the upgrade is because ABCPdf 6 cannot read Chinese characters while version 8.1 could.
I did a test of the page on the ABCPdf demo site (http://www.abcpdfeditor.com/) and things looks fine. (1.2Mb pdf)
However, when I tried to either:
run my own backend codeor run the code provided in the examples (C:\Program Files\WebSupergoo\ABCpdf .NET 8.1 x64\ExampleSite) - Identical to www.abcpdfeditor.com
All PDF generated produces really pixelated/compressed images! (121Kb)
The page that I'm trying to convert is:
http://debug.webpromos.com.au/certificate.htm
As far as I can tell, there is no issue with the code, but I've attached it below anyway...
http://debug.webpromos.com.au/html2pdf.ashx.txt
I'm stuck and can't think of anything else that could make it generate a 1.2Mb PDF document with non-pixelated image as produced by www.abcpdfeditor.com!
I hope I have provided enough information!
Thanks!
Wilfrid
Solved, all I need to do is to select the html engine specificly.
theDoc.HtmlOptions.Engine = EngineType.Gecko;

Jekyll documentation to PDF with TOC

I would like to write documentation using Jekyll with HTML and PDF outputs. Html can have a navigation but the PDF should have table of contents. Is there a free and easy way to do that?
The HTML part is easy but I would like to use #media print CSS for making the PDF file.
I have a few ideas how to do this.
Use PrinceXML, unfortunately this is commercial product with a nasty price tag ~$500
Use WKHTMLTOPDF
Use Maruku, since it is possible to do a PDF conversion using it
I would like to have multiple pages HTML and single page PDF with a TOC. Any suggestions?
Btw. Buildr has solved this problem using PrinceXML.
If 'free' is your most important criterion, than wkhtmltopdf is your best bet. It supports things like covers, toc, headers, footers and sections. Depending on how exotic the layout of your document will be, you most likely will run into some page-break issues, but with a bit of tinkering you should be fine.
I've been using wkhtmltopdf for a bit now, with some quite complicated documents (with javascript charts, tables, svg images, etc.) and have not run into too many issues.
Make sure you use the static version of wkhtmltopdf, as it is the only version which supports rendering of a TOC page.
You can use the PDFKit gem, which uses wkhtmltopdf behind the scenes. Then you can put your PDF logic in a Jekyll plugin as a generator or converter.
For generating a table of contents using Jekyll, you can use the {:toc} macro offered by markdown, or write your own textile table of contents filter if you prefer to use textile..
For generating a PDF from Html and CSS, I have found weasyprint to be a good solution. Since they do not rely upon an external engine for rendering, they do not depend upon foreign project's roadmaps for implementing relevant features such as CSS generated content or #page CSS-declarations. (But in contrast to wkhtmltopdf, weasyprint does not parse javascript).
You could also use a browser extension called Awesome Screenshot to create JPEG/PDF from a page. The extension allows you to create a full-page image or export it to PDF. With this tool, you can export all pages really quickly (and/or later combine all PDFs together to create a single document).
I am aware this is a quick & dirty solution (not perfect). E.g: while using images instead of text, the full-text search will not work. Additionally, it may require some manual work, but it does the job when you just want to read it.