Open a PDF file with automatically replaced fonts

I am not a programmer, but a normal user who uses Linux.
I want to use Ghostscript to DISPLAY PDF files, not to CREATE PDF files. (I have never used Ghostscript until now.)
But I want Ghostscript to automatically replace all fonts with other fonts when I open the PDF, regardless of whether the fonts are embedded or not.
With which fonts should the fonts be replaced?
Answer: I want to create a list of fonts that I want to be available for replacement.
But which of these fonts on the list should be used?
Answer: The one that best matches the metric of the font to be replaced.
Is it possible to do this somehow?

You can't get Ghostscript to do what you are asking. If a PDF file contains fonts, Ghostscript will use those fonts; it will only substitute if it cannot find an embedded font.
The reason for this is simple: the font embedded in the PDF file is the correct font. Its metrics are correct, and the mapping from character code to the appropriate glyph selector in the font will be correct.
It's also a non-trivial problem to select from a list of fonts the one which 'best matches the metrics of the font to be replaced'. What characteristics should be considered? How should those be determined?
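To make the difficulty concrete, here is a minimal sketch of one possible definition of 'best match': compare advance widths only and pick the candidate with the smallest total difference. This is not something Ghostscript does. The function names, the font names, the widths and the penalty for a missing glyph are all made up for illustration, and the sketch ignores everything else that affects appearance (x-height, ascent, kerning, stroke weight).

```python
def width_distance(target, candidate, missing_penalty=1000):
    """Sum of absolute advance-width differences over the target's characters.

    missing_penalty is an assumed cost for a glyph the candidate lacks.
    """
    total = 0
    for char, width in target.items():
        if char in candidate:
            total += abs(width - candidate[char])
        else:
            total += missing_penalty
    return total


def best_match(target, candidates):
    """Return the name of the candidate whose widths are closest to the target's."""
    return min(candidates, key=lambda name: width_distance(target, candidates[name]))


# Made-up widths in 1/1000 em units, for illustration only.
target = {"A": 722, "B": 722, "i": 278}
candidates = {
    "WideSans": {"A": 800, "B": 790, "i": 300},
    "CloseSans": {"A": 720, "B": 725, "i": 280},
}
print(best_match(target, candidates))
```

Even this simple scoring forces arbitrary choices, such as how heavily to penalise a missing glyph, which is the point made above.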
When a font is not embedded, Ghostscript will consult its own list of fonts and CIDFonts. Both of these lists can be customised, as described in the Ghostscript documentation.
But since a substitute font is always going to be a compromise, you can't tell Ghostscript not to use the embedded fonts in a PDF. Well technically you could, by modifying the PDF interpreter, but you say you aren't a programmer, so I doubt you will want to try that.
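For the non-embedded case, the list of fonts mentioned above is a Fontmap file, whose entries map a font name either to a font file or to another font name. The helper below is only a sketch that formats such entries as text; the function names and the file path are hypothetical, and it assumes the path contains no parentheses or backslashes (which would need escaping).

```python
def fontmap_file_entry(font_name, font_path):
    """Format a Fontmap entry that maps a font name to a font file."""
    return f"/{font_name} ({font_path}) ;"


def fontmap_alias_entry(font_name, other_name):
    """Format a Fontmap entry that maps a font name to another font name."""
    return f"/{font_name} /{other_name} ;"


# Hypothetical path; use the location of a font that exists on your system.
print(fontmap_file_entry("MySans-Bold", "/usr/share/fonts/example/MySans-Bold.ttf"))
print(fontmap_alias_entry("Helvetica-Bold", "MySans-Bold"))
```

Entries like these only affect fonts that are not embedded in the PDF; they do not override embedded fonts.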

Related

Ghostscript PDF to PDF/A conversion font issues

I am exploring tools to convert PDF documents to PDF/A. Ghostscript seems to give out-of-the-box support for such a conversion. One issue seems to be that some TrueType fonts that are part of the original PDF document are not converted correctly. If I copy text from the converted PDF/A document and paste it into Notepad, the copied text appears garbled.
The original document text can be copied to Notepad just fine.
I am using the following command:
gswin64 -dPDFA -dBATCH -dNOPAUSE -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite -sPDFACompatibilityPolicy=1 -sOutputFile=FilteredOutput.pdf Filtered1Page.pdf
I have uploaded a sample 1 page source PDF in Google Drive:
SampleInput
A sample output PDF/A document generated from the command is in Google drive here:
SampleOutput
Running the above command on this PDF on a Windows machine will reproduce the issue.
Are there any settings or commands that make the PDF/A conversion work properly?
Copy and paste from a PDF is not guaranteed. Subset fonts will not have a usable Encoding (such as ASCII or UTF-8), in which case they will only be amenable to cut/paste/search if they have an associated ToUnicode CMap. Many PDF files do not contain ToUnicode CMaps.
Of course, the PDF/A specification states (oddly in my opinion) that you should not use subset fonts, but it's not always possible to tell whether a font is subset (not all creators follow the XXXXXX+ convention), and even if the font isn't subset there still isn't any guarantee that its Encoding is one that is usable.
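As a small illustration of the naming convention mentioned above, the sketch below checks a font name for a six-uppercase-letter tag followed by a plus sign. The function name is made up for this example. As noted, a missing tag does not prove that the font is complete, so this is a hint at best.

```python
import re

# Six uppercase letters, a plus sign, then the base font name.
SUBSET_TAG = re.compile(r"^([A-Z]{6})\+(.+)$")


def split_subset_tag(font_name):
    """Return (tag, base_name); tag is None when no subset tag is present."""
    match = SUBSET_TAG.match(font_name)
    if match:
        return match.group(1), match.group(2)
    return None, font_name


print(split_subset_tag("ABCXYZ+Arial"))
print(split_subset_tag("Arial,Bold"))
```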
Looking at the file you have posted, it does not contain one of the fonts it uses (Arial,Bold), so Ghostscript substitutes DroidSansFallback, and the font it does contain (FreeSansBold) is a subset (FWIW, this font doesn't actually seem to be used). The fallback font is a CIDFont, so there is no real prospect of the text being 'correct'.
I believe that if you make a real font available to Ghostscript to replace Arial,Bold then it will probably work correctly. This would also fix the rather more obvious problem of the spacing of the characters being incorrect (in one place, wildly incorrect), which is caused by the fallback font having different widths to the original.
NB: as the warning messages have already told you, don't use -dUseCIEColor.
The fact that you cannot copy/paste/search a PDF does not mean that it is not a valid PDF/A-1b file, though, so this does not mean that the creation (NOT conversion) of the PDF/A-1b is not 'proper'.
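Putting the two suggestions above together, the sketch below assembles a command line that drops -dUseCIEColor and optionally points Ghostscript at a directory of real fonts through -sFONTPATH. The function name and the font directory are assumptions for illustration. Note also that the Ghostscript documentation spells the policy switch as -dPDFACompatibilityPolicy, whereas the command in the question uses -s. Whether Ghostscript then resolves Arial,Bold to a font in that directory depends on its font map, so treat this as a starting point rather than a complete PDF/A recipe.

```python
def build_pdfa_command(input_pdf, output_pdf, font_dir=None, gs="gswin64"):
    """Build the Ghostscript argument list without -dUseCIEColor."""
    args = [
        gs,
        "-dPDFA",
        "-dBATCH",
        "-dNOPAUSE",
        "-sProcessColorModel=DeviceCMYK",
        "-sDEVICE=pdfwrite",
        "-dPDFACompatibilityPolicy=1",
    ]
    if font_dir:
        # Extra directory in which Ghostscript should look for font files.
        args.append(f"-sFONTPATH={font_dir}")
    args.append(f"-sOutputFile={output_pdf}")
    args.append(input_pdf)
    return args


# The font directory is an assumption; adjust it to where Arial lives on your system.
cmd = build_pdfa_command("Filtered1Page.pdf", "FilteredOutput.pdf", font_dir=r"C:\Windows\Fonts")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) to actually run Ghostscript.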

Which Chinese font is commonly supported by PDF readers of Chinese people?

I am generating PDF files which contain English and Chinese characters (using the Ruby Prawn library). I don't want to embed a Chinese font file in the generated PDF files, because these files need to stay small. So I'm wondering if I could just mention a Chinese font name in my PDF files, and have the PDF readers render the Chinese characters correctly because they would already have the Chinese font file.
Is that something sensible? If so is there any commonly used Chinese font that one can expect to be installed in most of the PDF readers used by Chinese people?
The best way to ensure that a PDF file can be displayed on any reader is to use partially embedded fonts (also known as font subsets). In PDF, you don't need to include the whole font with your document; a subset with just the glyphs that were used in the file is enough for the file to be portable.
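To give a feel for why a subset stays small, the sketch below counts the distinct characters in a piece of text, which is roughly what a subset has to cover. It is a simplification: it assumes one glyph per character and ignores whitespace, and the function name and sample text are made up for illustration.

```python
def glyphs_needed(text):
    """Return the distinct non-whitespace characters a subset font must cover."""
    return sorted({ch for ch in text if not ch.isspace()})


sample = "你好，世界！你好，PDF！"
needed = glyphs_needed(sample)
print(len(sample), "characters,", len(needed), "distinct glyphs")
```

A full Chinese font contains thousands of glyphs, while a typical document uses only a small fraction of them.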

Possible to edit PDF without embedded font installed?

I have a PDF and need to edit its text.
It has an embedded font and I'm not able to find the font to install. Is it possible to edit the text and maintain its embedded font when I don't have the font installed?
I'm editing with Acrobat X and it's warning me that I don't have the font and forcing me to change the font of the text I want to edit to one that I have installed. I've Googled for a couple of hours and found the font family, but not the variation that's embedded. Because the font is already in the document, I would have thought Acrobat or another software program could tap it to allow me to edit, though I'm guessing I'm missing something.
It depends...
1.
Most likely, your embedded font is a subset of the full font. That means that it contains only those glyphs (shapes that represent printable characters) which were required as representations for characters used by the original PDF.
If your edit wants to insert a character that wasn't present in the original PDF, Acrobat (or any other editing method) has to use a different font. The font embedded in the PDF simply doesn't have a glyph that is suitable for your edit!
Also, your subsetted font's name is not exactly like the full font's name: it has a randomly composed prefix of 6 uppercase characters followed by a + sign, like ABCXYZ+Arial.
You could employ the free (as in beer) Acrobat Plugin FontReporter to generate a list of all glyphs contained in your font.
You can also use this answer:
"How can I extract embedded fonts from a PDF as valid font files?"
to extract the font in question.
Then open the extracted font (which will be the subset variant, mind you!) in FontForge (http://fontforge.github.io/en-US/), and there check two things:
the original font's license
the original font's creator
With that knowledge you could then definitely find the variant of the family that is embedded (or rather: that was present when the PDF generating and font embedding software created the subset).
2.
If the (subsetted) font you are looking at uses a so-called "custom encoding", it may be almost impossible to edit the PDF file with standard tools (even if you have the same font locally installed that was used to create the PDF file).
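The limitation described in point 1 can be sketched as a simple set difference: any character in the edited text that the subset does not cover forces a different font. This is an illustration only; the function name and the sample strings are made up, and it assumes you already know which characters the subset covers (for example from a glyph listing).

```python
def missing_characters(subset_chars, new_text):
    """Return the characters in new_text that the subset font cannot draw."""
    return sorted(set(new_text) - set(subset_chars))


# Characters used by the original text are the ones the subset covers.
covered = set("Invoice total")
print(missing_characters(covered, "Invoice totals: 42"))
```

An empty result means the edit reuses only characters that were already present, which is the case where keeping the embedded subset has a chance of working.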

How to replace or modify the font or glyphs embedded in a PDF file?

I want to replace the font embedded in an existing PDF file programmatically (with iText).
iText itself does not seem to provide any data model for glyphs and fonts, but I believe it can let me retrieve and update the binary stream that contains the font.
It's OK even if I don't know which glyph is associated with which font; what I want to do is just replace them. To be precise, I want to embolden all glyphs in a PDF document.
Replacing fonts in rendering time is not an option because the output must be PDF with all information preserved as is.
Is there anyone who has done this before with iText or any other PDF libraries?
PDF files define a set of fonts (e.g. F0, F1, F2) and then define these separately, so you could theoretically rewrite the entry for F0. You would have to ensure the two fonts have the same spacing (or you will have to rewrite the PDF content as well), and you would probably have to hack the PDF manually.
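The spacing condition can be sketched with a toy model in which the font resources are a plain dictionary and each font carries a table of widths. This is not iText code or a real PDF structure; the function name, the dictionary layout and the widths are assumptions for illustration.

```python
def swap_font(resources, key, new_font, used_codes):
    """Return a copy of resources with key pointing at new_font.

    Raises ValueError if any used character code has a different width,
    because the page content would then need rewriting as well.
    """
    old_font = resources[key]
    mismatched = [
        code for code in used_codes
        if old_font["widths"].get(code) != new_font["widths"].get(code)
    ]
    if mismatched:
        raise ValueError(f"widths differ for codes {mismatched}")
    updated = dict(resources)
    updated[key] = new_font
    return updated


# Made-up fonts and widths, for illustration only.
resources = {"F0": {"name": "Regular", "widths": {65: 667, 66: 667}}}
bold_same = {"name": "BoldSameWidths", "widths": {65: 667, 66: 667}}
bold_wider = {"name": "BoldWider", "widths": {65: 722, 66: 722}}

print(swap_font(resources, "F0", bold_same, [65, 66])["F0"]["name"])
try:
    swap_font(resources, "F0", bold_wider, [65, 66])
except ValueError as error:
    print("cannot swap:", error)
```

Bold variants usually have wider glyphs than the regular face, so the second case is the one to expect in practice.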

Fully Embedding True Type Fonts into PDFs

I am having problems creating PDF documents with fully embedded TrueType fonts. I am printing from MS Word (and InDesign) to the Adobe 9.0 print driver. I can get .otf fonts to embed with no problem, but .ttf files will not embed. Is it possible to fully (not subset) embed these fonts? I am specifically having problems using WingDings. With other fonts, I have been able to find and purchase .otf versions and use those, but it does not appear that WingDings is available in this format and I do not know of another way to fully embed bullets (both square and round).
The license for WingDings doesn't appear to allow you to fully embed them, or, to look at it another way, Acrobat doesn't appear to believe that it can fully embed them (and so subsets them instead). I'm not a copyright lawyer, so I'm not sure precisely what's allowed here, but here's some info that might help.
Install the font properties extension from Microsoft. This will give you much more information on the font's properties. Once it's installed, right-click on a WingDings font and click on the 'Embedding' tab. You'll see this message:
"Embeddability for this font: Editable embedding allowed.
Editable embedding allowed: fonts may be embedded in documents, but must only be installed
temporarily on the remote system."
Then read this article from Adobe about Embedding Permissions. And this forum discussion might be of some use too.
I tried printing a Word document which included WingDings to the Adobe PDF printer driver (Acrobat 8) and no matter which settings I tried, I was unable to get it to fully embed the font.
My guess is that Adobe interpret "Editable embedding allowed" to mean that you can only embed characters for the font which were included in the original document (i.e. embedded subset) and they are also the only ones which you can edit in the PDF.
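The embedding message quoted above comes from the fsType field in the font's OS/2 table. As a rough sketch, the bit values defined by the OpenType specification can be decoded like this. The function name is made up, and reading fsType out of an actual font file needs a font library, which is not shown here.

```python
# Embedding permission bits of the OS/2 fsType field.
FSTYPE_FLAGS = {
    0x0002: "Restricted License embedding",
    0x0004: "Preview & Print embedding",
    0x0008: "Editable embedding",
    0x0100: "No subsetting",
    0x0200: "Bitmap embedding only",
}


def describe_fstype(fs_type):
    """Return human-readable labels for an fsType value."""
    labels = [label for bit, label in FSTYPE_FLAGS.items() if fs_type & bit]
    if (fs_type & 0x000E) == 0:
        # None of the restriction bits is set.
        labels.insert(0, "Installable embedding")
    return labels


print(describe_fstype(0x0008))
```

These bits record what the font vendor permits; how a given PDF producer acts on them, such as choosing to subset, is up to that tool.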
I would try adding an additional page to the document that included every character in the font. Then use a different (non-Adobe) tool to delete that page. I don't have Windows so I can't tell you specifically how to do it. I can only tell you that I've used these kinds of tricks on other systems.