Font hinting behavior in PhantomJS? - pdf

I need to render text contained within divs in an HTML document into a PDF. I'm thinking of using PhantomJS, but one thing is very important: different browsers and platforms render text differently. So if I have the following code:
<div style='width:150px; height:80px; position:absolute; top:130px; left:78px'>
<p> Some text, yayy! :) </p>
</div>
It may render on one browser like this:
Some text, yayy!:)
But on another like this:
Some text,
yayy!:)
What happened was that (because of font hinting, I guess) the text in the first example ended up with a width that fit into the containing div, but with the font rendering of the second browser the text took up just a little more space, no longer fit in the container, and had to wrap to the next line. I can't afford this kind of unreliability in how the output turns out. If the HTML had the text on one line, I need the PDF to have it on one line too.
I asked a related question here, with no luck: "Make fonts on Windows render like Mac/Linux: disabling font-hinting and/or deal with anti-alias on client side". It was an attempt to solve this same issue.
Can PhantomJS do anything about this? Or can PhantomJS at least somehow calculate the true width of a piece of text, without font hinting or anything else involved? Or maybe calculate what it might render to if hinting were included? Or anything, as long as things come out in the PDF as they look in the HTML. (Given the application I'm working on, I do not have the freedom to change the CSS style of the containing div.)

Font hinting is almost certainly not what is changing the width of text here. Font hinting involves making small adjustments to line up edges in an outline font to screen pixels; the adjustments are made within each character and should not change the overall width of that character.
Across platforms, there are slightly different versions of a font because of licensing issues. macOS and GNU/Linux can't usually go out and copy Microsoft's fonts exactly, for legal reasons, so the nearest you'll get is a font that basically looks the same (and has a similar name) but isn't really the same font. So some width variations across platforms are to be expected, unless you can provide your own font files along with the page (web fonts).
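To illustrate why a small width difference is enough to change the line breaks, here is a minimal sketch of greedy line wrapping. The per-word widths, the space width, and the container width are made-up numbers for two hypothetical look-alike fonts, not measurements of any real font, and `wrap_words` is an illustrative helper, not how any browser actually lays out text.

```python
def wrap_words(words, word_widths, space_width, container_width):
    """Greedy line breaking: start a new line when the next word would overflow."""
    lines, current, used = [], [], 0.0
    for word in words:
        width = word_widths[word]
        # Cost of adding this word: its width, plus a space if the line is not empty.
        extra = width if not current else space_width + width
        if current and used + extra > container_width:
            lines.append(" ".join(current))
            current, used = [word], width
        else:
            current.append(word)
            used += extra
    if current:
        lines.append(" ".join(current))
    return lines


words = ["Some", "text,", "yayy!", ":)"]
# Hypothetical advance widths in px for two similar fonts (invented values).
font_a = {"Some": 38.0, "text,": 34.0, "yayy!": 40.0, ":)": 12.0}
font_b = {"Some": 39.0, "text,": 35.0, "yayy!": 41.5, ":)": 13.0}

lines_a = wrap_words(words, font_a, space_width=4.0, container_width=140.0)
lines_b = wrap_words(words, font_b, space_width=4.0, container_width=140.0)
print(lines_a)  # one line: the total is 136 px
print(lines_b)  # two lines: the total would be 140.5 px, half a pixel too wide
```

With these numbers the second font is only a pixel or so wider per word, yet the last token no longer fits and moves to a new line.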
PhantomJS uses the system's fonts just like any other browser. So using PhantomJS will not automatically give you some "cross-platform" set of fonts that's different from your system fonts.
If you need 100% reproducibility, then I suggest creating a virtual machine (or Docker image) with a standard set of fonts installed and using that everywhere. Just don't forget to apply security patches to it when needed.

Related

Matplot LaTeX Rendering in cmss style

Is it possible to render text as well as math symbols in the cmss (Computer Modern Sans-Serif) style within Matplotlib? I managed to render all text in cmss style, but math symbols were still in the normal serif style. I then tried loading the cmbright package in the preamble, which admittedly gave everything (text + math) a sans-serif style, but the new style wasn't exactly cmss; it was a slightly lighter version of it. I would like to know if there is a way of rendering everything in the cmss style, as this is the font used for headings in the document I'm currently working on.
Thanks in advance!

How to render mathematical expressions on PDF documents

I am trying to render mathematical expressions to a PDF document using a low-level library such as libharu or pdflib, and I am having problems rendering several special signs, such as radical signs.
A solution I can think of is to use line-drawing methods to render the radical sign. This is straightforward, but it doesn't work for integral signs or other symbols such as "\left(". I consulted the KaTeX source code and found that it uses built-in SVG graphics, which is not applicable when rendering a PDF document.
(Maybe I was not making it clear: the SVG in KaTeX is actually a pre-rendered glyph, while my application scenario is to render the PDF directly from a couple of given parameters, such as position and height.)
Does anyone know the PDF rendering mechanism of LaTeX, or something similar, that could help me out? Thank you in advance!

Apache FOP - Scrolling in PDF possible?

I'm using Apache FOP to generate a PDF through XML and XSL-FO. I have a cell in my generated PDF that I need to be able to scroll through if the content overflows it. XSL-FO has an overflow="scroll" feature, but based on my research on the topic it seems that Apache FOP does not support this option.
For example, here is a scrollable region in a PDF used by a large CAD company that I need to replicate:
Is there any way to enable this feature in Apache FOP? Is it possible to enable it in the source code (I haven't been able to find a way to do so)? Any other ways to tackle this issue?
No, it isn't possible.
From the FO perspective:
In the XSL-FO Recommendation the scroll value for the property overflow comes from the corresponding CSS2 definition, which includes this clarification:
When this value is specified and the target medium is "print", overflowing content should be printed.
As the PDF output is a print-oriented medium, I read this as a confirmation that FOP is correct in printing the overflowing content.
From the PDF perspective:
In the PDF Reference 6th edition, a search for the word "scroll" returns results referring either to the scrolling bars in the user interface or in interactive forms (text fields, list boxes, combo boxes).
There is not, or at least I could not find, a "static text object, but with scrolling bars" feature (which is probably sensible for a print-oriented format), so FOP cannot create it in the PDF output file, not even by modifying the source code.
A second look at your comment and the screenshot you included made me think it could be an example of the 3D Artwork feature of the PDF format, a feature I didn't know of before (and I still know nothing besides its name). According to the reference:
Specific views of 3D artwork can be specified, including a default view that is displayed initially and other views that can be selected. Views can have names that can be presented in a user interface.
So, I think your screenshot shows the different views associated with a 3D object; it is not a general-purpose feature that could be used to provide scrollable text.
Well, it could be possible ...
It is possible but as far as I know not with Apache FOP. Without seeing the PDF in question and guessing from the screen shot, it looks like a Flash widget inserted into the PDF. This in PDF terms is a RichMedia annotation (requires PDF version 1.7 with extensions) in which you can insert the Flash widget as well as other controlling files (like XML, other images to display, etc.) and relate them together.
AFAIK, only RenderX XEP (I work for RenderX) supports such RichMedia annotations inserted into PDF via XSL FO, through the rx:rich-media-object extension documented here: http://www.renderx.com/reference.html#Rich Media
I believe the only viewer that supports PDFs with RichMedia annotations is Adobe Reader, so it is required to view such a file. Here is a sample, generated long ago, that includes a few interactive Flash widgets and some interactive charts, all within a PDF of a few pages. NOTE: I am sure some of the links in the document do not go anywhere; it was made for a trade show many years ago. Remember, you would need to download this file, view it in Adobe Reader, and have Flash Player installed to see it function.
http://www.cloudformatter.com/Resources/Samples/RichMedia.pdf
You cannot use common PDF browser-based viewers like Chrome or Firefox as they do not support this type of annotation.
A screenshot of page one here shows an interactive, scrolling widget. Page 4 contains a widget similar to what you show in your example.
Page 4 scrolling widget very similar to your request:
The widget on the last page is created using a scroller SWF that takes the images and the XML setup/configuration files as parameters. The RenderX extension object takes these as parameters and embeds all of them in the document for the interactive Flash widget, so that it is totally self-contained in the PDF. The XSL FO to do this is:
<rx:rich-media-object name="Sample HTML Widget" scaling="non-uniform" width="611.92pt"
height="74.99pt" content-width="scale-to-fit" src="url('rx-scroller\dockmenu.swf')"
transparency="true" activate-condition="page_visible">
<rx:flash-var name="setupXML" value="rx-dock-settings.xml"/>
<rx:flash-var name="contentXML" value="rx-dock-contents.xml"/>
<rx:rich-media-resource name="rx-dock-settings.xml"
src="url('rx-scroller\rx-dock-settings.xml')"/>
<rx:rich-media-resource name="rx-dock-contents.xml"
src="url('rx-scroller\rx-dock-contents.xml')"/>
<rx:rich-media-resource name="style.css" src="url('rx-scroller\css\style.css')"/>
<rx:rich-media-resource name="customer1.png" src="url('rx-scroller\images\customer1.png')"/>
<rx:rich-media-resource name="customer2.png" src="url('rx-scroller\images\customer2.png')"/>
<rx:rich-media-resource name="customer3.png" src="url('rx-scroller\images\customer3.png')"/>
<rx:rich-media-resource name="customer4.png" src="url('rx-scroller\images\customer4.png')"/>
<rx:rich-media-resource name="customer5.png" src="url('rx-scroller\images\customer5.png')"/>
<rx:rich-media-resource name="customer6.png" src="url('rx-scroller\images\customer6.png')"/>
</rx:rich-media-object>
And note that many things that are in the Flash widget would work, like links and such. It is just a pure, interactive Flash object inserted into the PDF as the container.
Indeed it looks like this is not possible to achieve through FOP.
Continuing to dig around for a few days, however, I did find a clever post-processing alternative that is also free, essentially embedding a PDF inside of another PDF using the LaTeX animate package.
A drawback to this method is that it is not possible to embed links inside of the scrollable region, which is a major issue for me. But the method does enable inserting a scrollable region inside of an existing PDF and got me very close to what I was trying to achieve.

webvtt position wrong when using css translate on parent (slider)

In my project I'm using swiper.js as a slideshow; each slide contains either an image or an HTML5 video with WebVTT captions/subtitles.
On debugging, we noticed that the subtitle position is wrong (too low, cut off at the bottom of the screen) in WebKit browsers.
After much debugging, it turned out that this CSS3 rule on the parent div (the swiper-wrapper) causes the VTT cues to be positioned incorrectly:
transform: translate3d(-1024px, 0px, 0px)
When you put the video in the first slide, all goes well, since there's no css translate yet.
This seems to be a core webkit issue: default webvtt positioning breaks when using css translation on a parent.
The workaround I found is to add a line positioning in the vtt itself to every subtitle element, like so:
WEBVTT
00:00:02.160 --> 00:00:06.440 line:90%
hello world
00:00:06.560 --> 00:00:11.920 line:90%
testing subtitles
Any cue without the "line:90%" part is rendered partly offscreen. It seems this setting forces the WebVTT parser/renderer to place the cue at the correct position.
QUESTION: did anyone encounter this issue yet and is there any other (easier) workaround for this bug? Adding the "line:" part to all subtitles would be a hell of a job.. unless there's a good editor that can do that stuff in batch.
QUESTION 2: Since this seems to be a webkit vtt parser bug, anyone know where to best report this?
Test setup here: http://orgonemedia.nl/webvtt-bug/
I'm currently debugging some WebVTT files for English captions and other languages too. I'm experiencing a similar problem, although I can't say what is exactly causing it. I'm going to try the line:90% fix you've suggested here.
ANSWER TO YOUR QUESTION 1: Regarding the job of adding it to all the subtitles, you'll be happy to know that's actually pretty easy with the right tool. I use Sublime Text Editor. The way I would do it is use "Find all" to find all the occurrences of -->, then simultaneously edit each of those lines, using the arrow key to navigate across to the right place on the line (since each subtitle out-time is the same number of characters, 12), then type in line:90%
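For anyone who would rather script the same batch edit than do it in an editor, here is a minimal sketch. It assumes cue timing lines look like the ones in the question (`start --> end`, optionally followed by settings); the function name `add_line_setting` and the regular expression are my own, not part of any WebVTT library.

```python
import re

# Matches a cue timing line such as "00:00:02.160 --> 00:00:06.440".
# The hours part is optional in WebVTT timestamps.
TIMING = re.compile(
    r"^\s*(?:\d+:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d+:)?\d{2}:\d{2}\.\d{3}"
)
# Detects an existing "line:" cue setting so it is not added twice.
HAS_LINE = re.compile(r"(?:^|\s)line:")


def add_line_setting(vtt_text, setting="line:90%"):
    """Append a line setting to every cue timing line that lacks one."""
    out = []
    for line in vtt_text.splitlines():
        if TIMING.match(line) and not HAS_LINE.search(line):
            line = line.rstrip() + " " + setting
        out.append(line)
    return "\n".join(out) + "\n"


sample = (
    "WEBVTT\n"
    "\n"
    "00:00:02.160 --> 00:00:06.440\n"
    "hello world\n"
    "\n"
    "00:00:06.560 --> 00:00:11.920 line:90%\n"
    "testing subtitles\n"
)
result = add_line_setting(sample)
print(result)
```

To apply it to a real file, read the file's text, pass it through `add_line_setting`, and write the result back; keep a copy of the original, since this sketch has not been checked against every variation the WebVTT format allows.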
UPDATE:
So I implemented your suggestion, using the method I outlined, and it successfully repositioned my captions.
More details: I was only experiencing the problem of captions being half off the bottom of the video when viewing on an iPad. Oddly enough, viewing the same page on an iPhone, they were positioned correctly without any change. The 90% change still adjusted it up though.
Intriguingly the line:90% code does nothing to adjust caption position when viewing the page on Chrome.
I'm having trouble getting much at all to display on Safari desktop. I think there's something invalid about my file format, but I'm darned if I can find it.
When editing the captions through my video hosting service's caption editor (I'm using JWPlayer), the timecodes show up as being invalid:
Image showing caption editor with invalid warning

pdf2svg without size specification

I want to convert a PDF to SVG using pdf2svg, but without the width and height specification that is automatically added (so that the SVG scales to fit its container, along the lines of what is mentioned here). I couldn't find any option in pdf2svg to do this. What is the most realistic, reliable way to do this? If scripting is necessary, I would use jQuery and/or Ruby.
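One possible approach is to post-process the file after pdf2svg has written it. The sketch below is in Python rather than jQuery or Ruby, and it assumes the root `<svg>` element carries `width` and `height` attributes (possibly with a unit such as `pt`) and usually a `viewBox`. The helper name `strip_fixed_size` is hypothetical; the exact attributes your pdf2svg version emits should be checked against a real output file.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def strip_fixed_size(svg_text):
    """Remove width/height from the root <svg>, keeping (or deriving) a viewBox."""
    # Register the usual prefixes so the output is not rewritten as ns0:svg.
    ET.register_namespace("", SVG_NS)
    ET.register_namespace("xlink", XLINK_NS)
    root = ET.fromstring(svg_text)
    width = root.attrib.pop("width", None)
    height = root.attrib.pop("height", None)
    # Without a viewBox the drawing cannot scale, so derive one if it is missing.
    if "viewBox" not in root.attrib and width and height:
        w = re.match(r"[\d.]+", width)
        h = re.match(r"[\d.]+", height)
        if w and h:
            root.set("viewBox", f"0 0 {w.group()} {h.group()}")
    return ET.tostring(root, encoding="unicode")


sample = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100pt" height="50pt" '
    'viewBox="0 0 100 50"><path d="M0 0L10 10"/></svg>'
)
print(strip_fixed_size(sample))
```

With `width` and `height` gone and the `viewBox` in place, the SVG can be sized from CSS on the containing element instead of by the fixed dimensions.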