PDF intermittently shows a grey box in Safari

There is an issue with Safari (on desktop) where PDF previews intermittently fail. I've observed this with files around 2MB, but not those under 1MB.
At first it seemed like a race condition, potentially in the code that generates the signed link; however, I've since narrowed it down to a Safari bug.
The symptoms are:
Intermittently failing Byte Range requests
Duplicate Byte Range requests (may or may not be an issue)
The preview renders empty: a large grey box containing only the pdf viewer's usual actions (download etc.).

The solution to this was to use iframe instead of embed.
This is not an easy bug to search for, and I have no idea why it is a problem in the first place. However, there is a documented report at https://github.com/pipwerks/PDFObject/issues/243, which is where I found the solution.
I've also written a short blog post about it: https://shanehudson.net/articles/2022/pdf-breaking-safari/
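As a minimal sketch of the embed-to-iframe swap described above: the helper below builds iframe markup for a PDF URL. The names buildPdfIframe, escapeAttr and PdfFrameOptions, and the default title and sizes, are hypothetical and not taken from the linked report.

```typescript
// Sketch only: build <iframe> markup for a PDF preview instead of <embed>.
// All names and defaults here are illustrative assumptions.

interface PdfFrameOptions {
  title?: string;
  width?: string;
  height?: string;
}

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Return an <iframe> element (as a string) pointing at the PDF URL.
function buildPdfIframe(src: string, options: PdfFrameOptions = {}): string {
  const title = options.title ?? "PDF preview";
  const width = options.width ?? "100%";
  const height = options.height ?? "600px";
  return (
    `<iframe src="${escapeAttr(src)}" title="${escapeAttr(title)}" ` +
    `style="width:${escapeAttr(width)};height:${escapeAttr(height)};border:0"></iframe>`
  );
}
```

The signed link goes in as src unchanged; only its ampersands are escaped for the attribute. Whether this avoids the failing Byte Range requests in every Safari version is not something I have verified beyond the report above.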

Related

How do I fix garbled text in my react-pdf viewer?

I have created a pdf viewer using react-pdf. When I display certain pdfs, the text is choppy and unreadable. I have tried zooming in and out of the document and it is choppy in different ways at different scales. Sometimes the text even looks okay at a certain scale after zooming out and then zooming back in.
(Sample at 1.5 scale)
(Sample at 1.6 scale)
At first, I thought it might be an issue with react-pdf, but I saw that react-pdf is basically a wrapper around PDF.js. I found that I can replicate the issue in the PDF.js demo page.
Unfortunately, I'm working with a pdf that contains identifying information, so I can't share the full pdf or full screenshot. I'll include as much as I can figure out to share.
What I have tried
My initial thought was that maybe the component was rendering small initially and then had issues scaling up. So I made the initial size really large, but that didn't fix it.
I made sure that standard fonts were included following the instructions on the react-pdf home page
I tried using pdf repair tools online to maybe fix the pdf itself. That didn't help.
I tried changing the renderMode to 'svg' as detailed in the Document api documentation. This was the most helpful fix, as it does render the text correctly, but it then makes it so the images on the pdf don't load.
Thanks for your help/suggestions.
If I can find a way to edit the pdf to not have sensitive information, I'll try to find a place to make it available for testing. I apologize that I cannot provide that at this time. I know it's difficult to give advice when you can't replicate it yourself. I'll work on that.
From a programming point of view, the only fix is "Providing a standardFontDataUrl and disabling the font face" (see later). However, the problem affects the output of many developers using pdf.js-based code, so I consider it still on topic.
This issue is still open in react-pdf, though I have seen it mentioned by other pdf.js users since mid-year (an MS or Chrome update?), so I am unsure whether it is a wider failure affecting users of Mozilla's PDF.js code.
https://github.com/wojtekmaj/react-pdf/issues/1010
https://github.com/wojtekmaj/react-pdf/issues/1025
There seem to be earlier reports from early March, followed by suggestions to change Windows 10 drivers. However, it is also reported by Windows 11 Pro users. It is reported with PDF.js versions from 2.8.335 to 2.14.305 and does not affect version 2.7.570, so it is partly down to the updated versions. But it is seen only in Chromium.
It is entirely possible that we started doing something that trips Chrome,
The symptoms seem to be hardware or settings orientated since it is reportedly seen on some identical groups of users but not affecting others.
Toggling back and forth between single-page and multi-page views resolves the issue. It also seems to depend on the resolution, or appears on some machines and not others, so it is a little tricky to repro.
I am not getting it personally, but a guy in my team gets it.
Unclear which browsers are affected, but it looks like it's a Chromium / WebKit rendering bug?
Several browsers have been tested and only chrome faces this.
My colleague gets the same in Edge Version 101.0.1210.47 (Official build) (64-bit) and Brave (1.38.118 Chromium 101.0.4951.67) Will edit the issue
The suggested workarounds are:
Providing a standardFontDataUrl and disabling the font face fixes the issue.
If we disable Accelerated 2D canvas in chrome://flags, the preview renders correctly. But since this flag is on by default, users see the pixelated preview unless we ask them to turn it off.
Figured out that this only happens when hardware acceleration is enabled in your Chrome settings.
When it's turned off, the issue does not happen.
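The first workaround above (a standardFontDataUrl plus a disabled font face) can be sketched as an options object for pdf.js. standardFontDataUrl and disableFontFace are pdf.js getDocument parameters; the helper name buildFontWorkaroundOptions, the local PdfLoadOptions type and the unpkg URL layout are my assumptions, so point the URL at wherever you actually host the standard_fonts folder.

```typescript
// Sketch only: options for pdf.js getDocument implementing the font workaround.
// The type and function names are hypothetical.

interface PdfLoadOptions {
  standardFontDataUrl: string;
  disableFontFace: boolean;
}

function buildFontWorkaroundOptions(pdfjsVersion: string): PdfLoadOptions {
  return {
    // Assumed CDN layout; replace with your own copy of pdfjs-dist/standard_fonts/.
    standardFontDataUrl: `https://unpkg.com/pdfjs-dist@${pdfjsVersion}/standard_fonts/`,
    // Draw glyphs with pdf.js's built-in path renderer instead of @font-face fonts.
    disableFontFace: true,
  };
}
```

In react-pdf the resulting object would be passed through the Document component's options prop. Build it once (outside the component or memoized) so the document is not reloaded on every render. Note that disabling the font face trades the garbled glyphs for slower text rendering.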
Paste chrome://gpu or edge://gpu etc. into the address bar (it is a long report of the current onboard fixes). In my case (currently unconfirmed via a reboot for my Edge) it shows "Accelerated 2D canvas is unavailable: either disabled via blocklist or the command line. Disabled Features: 2d_canvas", thus I cannot see the problem.
To change the setting you can use
chrome://flags/#disable-accelerated-2d-canvas
but it's a manual choice between options.
So on reboot I see
Graphics Feature Status
Canvas: Hardware accelerated
Canvas out-of-process rasterization: Disabled
but I have little problem with the demo (except normal fuzzy text as pixels), so either the Edge update or my hardware is not visibly affected, or my default settings are reasonable.
This issue has been finally fixed in the latest version of react-pdf library. Check here: https://github.com/wojtekmaj/react-pdf/releases/tag/v6.2.2
I also faced the same error and I fixed it by setting render mode to canvas (earlier it was SVG) and scale value to more than 1. Try scale = 1.5

How can I force an embedded PDF to display at "Page Width" zoom level?

I'm using embed to embed a pdf into my web page. I could also use object or iframe. I want the pdf to open at the full width of the page, which I have working. But I want the zoom level to be "Page Width". I've seen several solutions for height, like #view=FitH, and some for specific percentages, but not "Page Width".
Most PDF settings are browser-specific and subject to user control. Here is a Chrome-based Edge (on the left) and a Firefox-based clone (on the right). The biggest problem noticed across these four types of loading is the lack of object support (totally missing in the bottom right). Note: I have security set so that only the high-speed secured Firefox-based plugin viewer will show me PDFs inline without the risk of running a script.
However, in direct answer to the question: not all browsers support the Acrobat # parameters, but most will attempt #zoom=50% (and #page=, except perhaps Safari?), whereas fit has frequently behaved differently between all of the camps. I show the Chrome result below.
The correct tag could be #FitW, and that's what I used when Chrome unsecured viewing is turned on.
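To try the fragment parameters mentioned above against each viewer, a small helper can append them to the PDF URL. This is a sketch: withPdfOpenParams and PdfOpenParams are hypothetical names, and it writes zoom as a bare number, the form used in Adobe's open-parameters document. In PDF destination syntax FitH means "fit the page width to the window", so #view=FitH may be worth retesting for the "Page Width" case; whether a given browser viewer honours any of these is viewer-dependent.

```typescript
// Sketch only: append PDF open parameters (#page=, #zoom=, #view=) to a URL.
// Support for each parameter varies by viewer; the names here are hypothetical.

interface PdfOpenParams {
  page?: number; // 1-based page number
  zoom?: number; // zoom percentage, e.g. 50
  view?: string; // e.g. "FitH" (fit width) or "Fit"
}

function withPdfOpenParams(url: string, params: PdfOpenParams): string {
  const parts: string[] = [];
  if (params.page !== undefined) parts.push(`page=${params.page}`);
  if (params.zoom !== undefined) parts.push(`zoom=${params.zoom}`);
  if (params.view !== undefined) parts.push(`view=${params.view}`);
  const base = url.split("#")[0]; // drop any fragment already on the URL
  return parts.length > 0 ? `${base}#${parts.join("&")}` : base;
}
```

For example, withPdfOpenParams("doc.pdf", { view: "FitH" }) yields "doc.pdf#view=FitH", which can then be used as the src of the embed, object or iframe.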

pdfbox embedding subset font for annotations - part 2

I am creating a separate question, stemming from this one. The code used is almost the same. The reason is that the original problem was about subsetting a font with pdfbox, which I more or less dealt with. I then ran into another problem, namely the annotations, and how the fonts used in them are interpreted, particularly by Acrobat Reader DC.
I tried different combinations of fonts and embedding options and got rather desperate. I had a feeling that the way these things are handled by the programs that interpret PDF files is non-standard. I think I read somewhere that annotations, and the way they are displayed, are deliberately left non-standardized by the PDF format, to give interpreters the freedom to handle them in their own way, since the main purpose of annotations is interaction with the user. TL;DR: I cannot understand why Acrobat Reader DC doesn't like the annotations I have created and saved with PDFBOX. I even opened a question on Adobe's friendly and helpful User Community forum. But, as I expected, someone suggested that I investigate this question with the PDFBOX team instead.
Everything is possible, but rather than writing a question on PDFBOX mailing list (I could never get used or understand the efficient use of the mailing lists btw), I want to open a question here because I hope that it could help others to understand the PDF format better.
I am basically rephrasing the above question from Adobe's forums here: Here is an example (Google Drive link) with FreeText annotations (it seems to make no difference if I use Stamp annotations instead). It causes problems when opened by Adobe Acrobat Reader DC (file) version 21.001.20149.37945 (I think this corresponds to the April 16th '21 update). Specifically, the problem happens when the Comments pane is opened by the user, either manually or automatically.
Manually:
link
Automatically:
link
While experimenting, I also tried to unset the "Use local fonts" option in Preferences -> Page Display. I had the impression that maybe Acrobat Reader will be more eager to show the error message once it is not allowed to substitute the erroneously embedded fonts with the possible local fonts. I am not sure if this is true.
The error that I get is the infamous "Cannot extract the embedded font XXXXXX+SomeFontName" as seen in the below picture:
link
The same problems happen also if I use full font embed (subsetting option set to false when using PDType0Font.load). I also tried to embed OpenSans font instead of LiberationSans, also tried to manually convert LiberationSans to a TTF font with fewer glyphs using FontForge, even tried to use Windows ARIALN.TTF, thinking that maybe the font is the problem. All cause the same behavior in Acrobat Reader DC. I have also tried to run Acrobat Reader 2019 Pro Preflight on the document and in the profile that scans the document for the possible font inconsistencies, it reports no errors.
Of course, when I use e.g. PDType1Font.HELVETICA instead of custom TTF font, I do not get the above errors. But I cannot use it because it does not contain the glyphs for the Unicode characters that I use. Does anybody have a better idea?
Thank you very much!
EDIT: to make myself clear, the error does not ALWAYS appear. It appears constantly on some machines (e.g. I can reproduce it fairly well on Windows 7 64-bit with the latest Acrobat Reader DC installed), while on my Windows 10 64-bit machine with the same version of Acrobat Reader DC it sometimes appears and sometimes does not. I haven't figured out why or in which cases. That made me suspect the font itself, but no, I checked that too: the font I am using opens fine on the machine where the problem is fairly constant.
UPDATE: at my wits' end again, I created a blank page with Apache OpenOffice, exported it to PDF, opened it with Acrobat Reader DC (latest version), added a FreeTextTypewriter annotation (View -> Tools -> Comment -> Open) with 4 Greek letters in the ArialNarrow font, saved it, and reopened it with Acrobat Reader DC. It gives me the same error (cannot extract the embedded font...). So could this be a Reader problem? But they made this so difficult to diagnose. Here is the file, but I do not expect it to show errors on other machines. It's one of those moments when you start to believe in magic and the power of prayer (and a good sleep).
UPDATE 30/04/2021
So, to sum things up, I haven't come up with a solution yet, but I have three files, created with PDFBOX, OpenPDF (an iText5 fork) and Acrobat Reader DC itself (it can append annotations and save; I just added a simple Text box with Greek text through the Comment pane), and they all trigger the above error message when opened by Acrobat Reader DC. I have posted details in the Acrobat Reader forum here (same link as in the comment).
I have added the code that I used to create the OpenPDF example file here and the example 3 files are in the same repository here

Putting an iframe overlaid on a pdf document in a browser extension

I created a browser extension that lets you look up words in Wikipedia or Wiktionary without needing to open a new tab ( https://addons.mozilla.org/en-US/firefox/addon/in-page-lookup/ , almost done porting to Chrome). It is very useful when you are doing research and come across a word you don't know or want to know more about. The only thing is, a lot of research content is in PDF format. A long time ago (~2013ish) I had an older version of the app based on the old Firefox add-on framework and that did let iframes show up over pdf documents but this has not been the case for many years. I don't think the extension is even recognized in pdf documents, I get "Error: Could not establish connection. Receiving end does not exist" and there is no extension content script on the pdf page. So, my question is, is it possible to put an iframe over a pdf document? Do I need to work on the background side, and if so, how? Thanks.

characters missing when printing

We have a WPF application which can perform either a report preview or a report print.
Both requests use the same code.
Call the report service which gets the report from Microsoft Report Services.
Convert the report into the desired format (in this case PDF).
Then return the report as a byte array.
The result is then written to a temporary file as a binary stream, and either popped into a window to preview or start a Process to print.
In both cases the temporary file is passed.
Print Preview works flawlessly! But Print Report prints with all occurrences of 'ti' missing. I see there is a printer escape sequence ESC t NUL/SOH, and I assume that if, for some reason, an escape character gets into that stream, the 'ti' will be swallowed as an ignored print sequence. Hence the missing characters.
My first question is if anyone has ever experienced this with generated PDF reports?
My second question (obviously) is if anyone knows of a utility I can use to view the binary data in the file being printed, to see what is in the file just before every 'ti' sequence?
After a great deal of searching I came across a post on the Adobe forum that states that version 8 had a bug where it was not printing character combinations. Once I dug deeper it seems that it has returned and the suggested workaround fixed our issue.
Workaround: Do a print as image.
Adobe seems to be unable to do the most basic of what their software must do, print the exact content!
Answer for your second question:
First, do one of the following two things:
Set the Windows print spooler properties to not delete printed jobs.
Pause the target print queue.
Then grab the spool file from the Windows print spool directory. You can find its location in the (right-click) 'Properties...' dialog of the 'Printers and Faxes' folder.
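Once you have the spool file, a few lines of script can show which bytes sit immediately before each 'ti'. The sketch below is my own illustration (hexContextBefore is a hypothetical helper, not a standard tool). It assumes the spool data contains the text as plain ASCII, which holds for some printer languages but usually not for the PDF itself, where content streams are typically compressed.

```typescript
// Sketch only: list the bytes that precede each occurrence of an ASCII pattern.
// Works on any Uint8Array; hexContextBefore is a hypothetical helper name.

function hexContextBefore(data: Uint8Array, pattern: string, before: number = 8): string[] {
  const results: string[] = [];
  if (pattern.length === 0) return results;
  for (let i = 0; i + pattern.length <= data.length; i++) {
    // Compare the pattern byte by byte at position i.
    let match = true;
    for (let j = 0; j < pattern.length; j++) {
      if (data[i + j] !== pattern.charCodeAt(j)) {
        match = false;
        break;
      }
    }
    if (!match) continue;
    // Collect up to `before` bytes preceding the match as two-digit hex.
    const start = Math.max(0, i - before);
    const hex: string[] = [];
    for (let k = start; k < i; k++) {
      hex.push(("0" + data[k].toString(16)).slice(-2));
    }
    results.push(`offset ${i}: ${hex.join(" ")}`);
  }
  return results;
}
```

Under Node, the Buffer returned by reading the spool file is a Uint8Array and can be passed in directly. A 1b in the output just before a match would indicate an ESC byte ahead of the 'ti'.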
I realize this is an old post but I wanted to add some updated info from the above comment stating that it's a problem with Acrobat 8. We are using Acrobat 10.1.6 and still have the same problem. From what I've read, it's a problem with the adobe product itself. The only real fix I've seen (actually work around) is to print as an image. LAME
Surprisingly, this bug is still there in 2021. Adobe cannot be relied upon to print documents properly. That takes away all the allure of its features if it cannot do the most basic thing it is required for.
Printing as an image reduces the quality and blurs the document.
Simply open the document with Safari or Chrome and print from there.
I had a similar problem while printing directly from Firefox (with Acrobat Reader embedded). I downloaded the file and then printed it. The problem was solved.