Disable subpixel antialiasing of text in PDFs - pdf

I'm generating PDFs. One font in particular - Helvetica Neue Light - renders particularly poorly when the PDF is viewed on a Mac, thanks to subpixel antialiasing.
I want to disable subpixel antialiasing in the PDF. (Not in the PDF viewer.) The equivalent in CSS is -webkit-font-smoothing: antialiased.
Bad:
Good:
Is this possible?

No.
Somewhat longer answer? :)
Antialiasing is a device specific optimisation (anti-aliasing would be done differently on a printing device (if at all), on a monitor and on an LCD monitor for example).
PDF is by design a device agnostic file format and as such it stays away (tries to stay away) from device specific constructs. Anti-aliasing settings are by definition preferences from the viewer and cannot be influenced by the PDF file itself.
The result you're showing is quite different indeed - you're sure the font is accurately embedded in the PDF file (and is of decent quality)? It would also be worthwhile to look a bit further at the "On a Mac" statement. I'm assuming you're saying it looks bad in "Preview" on a Mac; in that case it's be interesting to see how it performs in Preview versus Adobe Acrobat. That could tell you whether it's a font issue or an application issue.

Related

a background grid appears after using ps2pdf

Morning, everyone,
Quick question about PS2PDF. I use it to convert graphics that I produce directly in postscript to PDF. While there is no visual problem on PS files, I see a grid on my PDF viewer. At first I thought the problem was in the viewer, but it remains present when I compile my TeX files containing the figures with PDFLaTeX. Do you have any ideas for settings that can "fix" this display? Thanks in advance :)
Evince is independent of Ghostscript as far as PDF files are concerned, but I don't know how it can be viewing PostScript files.
I believe what you are seeing is an artefact of the PDF rendering engine in use, and the way the PDF file is constructed (which is itself dependent on the way the PostScript is constructed).
Much of the content is drawn by creating little rectangles which are intended to butt up against each other (and basically do). However, depending on the resolution, the precise numerical accuracy of the calculations and the accuracy of the co-ordinates, it can be the case that these rectangles do not quite touch ideally. There is a theoretical gap between them.
You can see this occur with Adobe Acrobat, and zooming in and out changes where the lines appear (it changes the effective resolution, thereby changing the calculations from user space to device space, ie to the actual pixels on screen).
I cannot say for sure that the same problem exists with Evince, but I expect it does. Withh Acrobat I can turn off anti-aliasing, which is where the problem really arises. Acrobat is attempting to insert an anti-aliased pixel between the two rectangles, which leads to these faint lines. Turning it off (In Acrobat X Edit->Preferences->PageDisplay->Smooth Line art) makes the lines disappear.
Ghostscript doesn't apply anti-aliasing by default, so these lines don't appear when rendering either the PostScript or the PDF files, but if I turn on anti-aliasing (-dGraphicsAlphaBits=4) then Ghostscript renders the lines in both the PostScript and the PDF file.
Essentially I think the problem is that your PDF viewer is using anti-aliasing and your PostScript viewer isn't, so they don't look the same.

Issue with ghostscript rendering PPT into PDF

I've been tinkering with Ghostscript with a port monitor(on a HP PCL 6 Universal driver) to convert print job into PDF. I've tested with a few applications such as Words, Excel, Adobe Reader, Microsoft Edge etc and they are all working properly.
However upon testing Microsoft Powerpoint 2016, it seems like there are some graphics that are unable to be rendered properly through Ghostscript.
Actual Slide Below
Output From Ghostscript in PDF Below
I've tested this even with some other PDF generators such as BioPDF,CutePDF as well as AdobePDF and they would all result in the same output as above.
Just wondering has anyone tried and have faced similar issues before? if so could someone point me in the right direction??
What you are doing isn't a single step PowerPoint to PDF and Ghostscript is not rendering the PowerPoint. In fact if you are creating a PDF file Ghostscript isn't (ideally) rendering anything.
What's actually happening is that you are asking PowerPoint to print to a canvas, which is then passed to the PostScript printer driver. That produces PostScript which is sent to the Port. Your (and others) Port Monitor then sends the PostScript to the 'Distiller' (in your case Ghostscript and the pdfwrite device). The Distiller reformats the vector drawing commands into a PDF format and builds a PDF file from them. It doesn't render (turn into a bitmap image) anything unless forced to.
Obviously there are several places along that road where the problem could creep in. Given that you say that the Adobe product (the others you mention al use Ghostscript) has the same problem, I think its safe to assume that the problem isn't Ghostscript.
This also means that you aren't using the driver you think you are. Adobe can't handle PCL as an input medium as far as I'm aware, and nor can Ghostscript. GhostPCL will handle PCL as an input, but that's not what you say you are using.
Of course you haven't linked to an example file to demonstrate the problem, nor supplied an example command line, so this is all supposition.
Now if, somehow, you are using a PCL6 device, then the problem is most likely due to the presence of rasterOps in the output. Rasterops are part of the PCL imaging model which do not exist in PDF and are a form of transparency. There are three ways to handle such content for a PDF output device; firstly render the whole page content to an image, secondly ignore the rasterOps objects, thirdly treat the rasterOps as opaque.
GhostPCL and the pdfwrite device take the third option. So, its just conceivable that your original content has some transparent objects which are being handled as rasterOps by the PCL printer driver, and then rendered as opaque by GhostPCL and the pdfwrite device.
If that's somehow the case then the solution is simple; don't use a PCL printer driver, use the PostScript one.
If you post a link to a (simple, eg single page) example of what you are sending to Ghostscript, and a command line, then I can look at it. Please don't send me the PowerPoint, I can't use it and even if I could, my print setup would not match yours. I need the data being sent to Ghostscript.
[EDIT after looking at files]
Don't mean to sound like I'm lecturing, the problem is people find these result on Google searches and then try to apply them based on a poor understanding of what's happening. So I find it best to be really clear in my answers about what's going on. It saves questions later :-)
The first thing I see is that the PCL is indeed PCL, and if you try running that through Ghostscript it throws horrible errors and exits. So presumably you aren't doing that.
The PostScript file contains nothing except huge images, rendered (presumably at 600 dpi) contains 2 pages, the two pages look like your images above. Which is why the PostScript is better than 20 times larger than the PCL file.
But.... If I open the .ppt file with OpenOffice (4.0.0 is what I have to hand) I see exactly the same thing. I don't, I'm afraid, have a copy of Microsoft PowerPoint, but from what I see here there are two conclusions;
firstly that the PDF I get looks pretty much like the PowerPoint when viewed with OpenOffice at least. So there's something 'interesting' about your PowerPoint.
secondly, even if that's not what you expect, its what's in the PostScript program. That means that either PowerPoint rendered the slide to a bitmap or the Windows printing system/HP driver did.
Now, if I run the PCL through GhostPCL instead of Ghostscript (rendering, not producing a PDF) then the result is more like what I think you are expecting. However, when sent to a PDF file the result is horrible. Which strongly suggests to me that there's some form of transparency involved, PostScript doesn't support transparency at all, and PCL does it through rasterOPs.
I'm afraid that this means that the problem lies either in PowerPoint, the Windows print system or the PostScript printer driver you are using. Since the PCL is at least close to what you expect, I suspect that this means PowerPoint is doing the right thing, and its the printer driver messing up. It appears you are using the Windows PostScript printer driver.
So there's no way you can 'fix' this for files like this, at least not with Ghostscript. You would need to 'fix' the Windows PostScript printer driver, or possibly the Windows print system. You could try reporting a bug to Microsoft, presumably these files print incorrectly when sent to physical PostScript printers too.

PDF with OCR text visible, how to hide it from existing PDF

I have several PDF files that have been OCR-processed (not by me). They contain both the scanned image and the OCR text. They seem to work fine in some viewers (iPhone/iPad), but not in others (Preview.app on macOS) which makes them somewhat awkward to read.
From googling around, it seems that the text & image may be layered incorrectly or there is a problem with the fonts used? I'm not even sure I'm using the correct vocabulary, as most hits I get are worthless.
Is it possible to use ghostscript or something to batch-fix these files?
Example of "bad" rendering:
Its impossible to say what's wrong with the PDF file (or viewer) without seeing the PDF file, which alse makes it hard to propose solutions!
You could certainly run the file through Ghostscript to the pdfwrite device, and use the -dFILTERTEXT switch to not process the text. The resulting document would therefore not contain the offending text, but would still contain the image.
Of course, this would then not be possible to search or highlight.
You could instead use -dFILTERIMAGE which would remove the original image leaving the text behind. But then anything in the original document which was not text would now be missing.
The usual 'best practice' is to have the text drawn in rendering mode 3, which makes no marks. This allows you to see the original image without the OCR'ed text interfering. Its possible that the viewer you are using is not honouring the text rendering mode, which would be a (fairly serious) bug in the viewer. The most recent versions of MacOS seems to have some nasty bugs in the Quartz PDF rendering engine.
The other way to do this is to draw the text first, then put the original image on top of it, but that's hard to get wrong, I suspect its more likely the text rendering mode.
EDIT
The PDF file first draws the text, then draws the image on top of the text. The underlying text should not appear. mkl is quite correct in his comment.
The correct way to fix this is to fix the consumer which is rendering it incorrectly. As I mentioned above the latest version of Quartz seems to have some fairly serious bugs, you might choose to raise this as a bug with Apple.
The only other solution would be to run this through something which will remove the text. Ghostscript can do this but there are implications; firstly it will no longer be possible to search/copy/paste text from the document. Secondly you would need to run quite a complex command line in order to prevent the decompressed JPX images being recompressed as JPEG, which would probably result in compromised quality. Finally the resulting file size would be larger.

PDF Creator - Difference in the PDF Quality vs Adobe Acrobat. How can I change it?

When I am using PDF Creator to create PDF documents the quality of the fonts is not exactly the same as when I am using Adobe Acrobat to create the same PDF. The fonts when creating with pdf creator are a bit more fussy (not as crispy as with Adobe).
Does anyone know if/how I can resolve this?
Here are 2 example documents that demonstrate what I mean:
Example of PDF created with PDF Creator
Example of PDF created with Adobe Acrobat
I don't have a solution for you unfortunately but I can tell you that what you are seeing is anti-aliasing. If anti-aliasing is enabled, fonts at lower resolutions will get that "fuzziness" that some people believe helps with reading. It might not look as pretty but it improves word recognition (so the theory goes). But that's beside the point. What you need to do is look for a setting to disable anti-aliasing. If you can't find it then you might have to look into setting actual Ghostscript settings, possibly dTextAlphaBits but I'm not a Ghostscript expert.
You can tell its anti-aliasing because the "fuzziness" only appears when the fonts are small. Once you zoom in it all goes away.
Image zoomed out:
Image zoomed in

Is there a way to use custom fonts in a PDF file?

Well basically I'm finishing school in mid December so I'm just brushing up my resume and I'm wondering if there's a way to use custom fonts (in this case Calibri and Cambria) in a PDF file and make them render correctly on all computers.
Thanks in advance!
EDIT: I'm using MS Word 2007, but am open to suggestions
PDFs don't store text and fonts like other documents, they actually convert the font to vectors, that way no matter what font you use, the document displays exactly as expected. This is why searching for text inside the PDF is such a problem for 3rd party PDF Readers and why even Adobe themselves use to distribute 2 versions of Acrobat (one with text search, one without).
Another thing to keep in mind is, PDF isn't pixel exact, it's ratio exact. PDF readers generally do not use a 100% zoom level, instead most people read them at "fit to screen" or "fit to page". I point this out because I'm guessing the reason you are trying to use those new Vista/Office 2007 fonts is because of their LCD subpixel support (improves readability on LCD screens). This feature will not translate into the PDF, since the letter becomes a vector, subpixel information is lost, and even if it wasn't, becomes useless because the vector will be sized to something other than you intended at view time.
The PDF format is capable of embedding fonts, if the font has been marked embeddable by its creator. You'll have to check the software that's creating your PDF to see if it has the capability and how to enable it.
theoretically speaking, on technical side, embedding/not embedding ability, regarding the fonts, is settled with a special flag in font file (ttf or opentype or type1)
you can view this special embedding flag with any font editor program (I recommend
FontCreator (by High-logic)
http://www.high-logic.com/font-editor/fontcreator.html
with a free trial fully operative and without limitations
you can also change embedding/not embedding flag, but legally speaking, for the 99% of fonts commercially distributed, this breaks the license of font