Use custom TTF font in a PostScript file

I'm trying to write my own PostScript file by hand and want to use a custom TTF font downloaded from the web, but it isn't being used: either some other font is substituted or the text isn't displayed at all. I don't have problems with fonts that are installed on the system.
The commands I used were different variations of:
/FontName /TheFontName def
/TheFontName 20 selectfont
(XXXXXXXXXXX) show

You can't use a TrueType font directly in PostScript; unlike PDF, PostScript doesn't support TrueType fonts natively.
In order to use a TrueType font you must first convert it into a Type 42 font, which PostScript does support.
Adobe Technical Note 5012 documents the Type 42 format.
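For reference, a Type 42 font is an ordinary PostScript font dictionary that wraps the raw TrueType data. The outline below is only a structural sketch: the /sfnts entry must contain the complete tables of the actual TTF file as hex strings, and the /CharStrings entries must map glyph names to that font's real glyph indices, so as written it will not define a usable font.
%!PS
% Outline of a Type 42 font dictionary, per Adobe TN 5012
11 dict begin
  /FontName /TheFontName def
  /FontType 42 def                 % 42 = TrueType data wrapped for PostScript
  /FontMatrix [1 0 0 1 0 0] def    % identity; scaling comes from the TrueType unitsPerEm
  /FontBBox [0 0 1000 1000] def    % placeholder bounding box
  /Encoding StandardEncoding def
  /CharStrings 2 dict dup begin    % glyph name -> glyph index in the TrueType data
    /.notdef 0 def
    /A 36 def                      % 36 is an example index, not from any real font
  end def
  /sfnts [ <00010000> ] def        % placeholder; must hold the real TTF tables as hex strings
currentdict end
/TheFontName exch definefont pop   % only succeeds once sfnts holds real font data
% after that: /TheFontName findfont 20 scalefont setfont ... show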

Alternatively, you can convert TTF fonts to PFB and PFM (Type 1) format and use those in PostScript. There are online tools available for converting TTF fonts to PFB and PFM format.


Ghostscript-generated PDF content cannot be copied

I am trying to convert a PostScript file which contains a Telugu font (Vani Bold). After converting the file to PDF, I am not able to copy the text from the generated PDF. When I look at the properties of the PDF file in the CentOS document viewer, it shows the fonts as below.
I am using the command below to convert the PostScript file to PDF:
bin/gs -dBATCH -sDEVICE=pdfwrite -sNOPAUSE -dQUIET -sOutputFile=/home/cloudera/Desktop/PrintTest/telugu.pdf /home/cloudera/Desktop/PrintTest/VirtualPrinter_27_09_2016_19_11_41_691.ps
I tried with Ghostscript 9.19 and 9.20 as well, but there was no change.
Following is the link to the PostScript file which I am trying to convert to PDF:
click here for postscript file
I have been struggling with this for 10 days. Please suggest a solution.
I can tell you why you can't copy & paste the text, but I'm not sure I can provide an acceptable solution.
First, not all PDF viewers can deal with Unicode characters (for example, xpdf can't, it just ignores them, while mupdf and qpdfview work).
Second, to be able to convert font glyphs to unicode characters, the font object in the PDF file must contain a /ToUnicode property. If you look at the generated PDF after decompression (mutool clean -d), you can see that the Vani font in object 8 0 doesn't have it, while both the Arial font in object 10 0 and the Calibri font in object 12 0 do.
So very likely the Vani font is missing this Unicode translation information; you need to either add this information (e.g. with FontForge) or choose a different font that has it.
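For reference, the /ToUnicode entry points to a stream containing a small CMap along the lines of the sketch below; the glyph code and the Telugu code point used here are made up for illustration and are not taken from your file:
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def
/CMapName /Adobe-Identity-UCS def
/CMapType 2 def
1 begincodespacerange
<0000> <FFFF>
endcodespacerange
1 beginbfchar
<0024> <0C05>        % glyph code 0x0024 -> U+0C05 (a Telugu letter), example values only
endbfchar
endcmap
CMapName currentdict /CMap defineresource pop
end
end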
Related question:
https://superuser.com/questions/1124583/text-in-pdf-turns-gibberish-on-copying-but-displays-fine/1124617#1124617

How do I find the TTF font name for Adobe Normalizer (i.e. Times New Roman)

I'm trying to use Adobe Normalizer to convert PostScript files to PDF/A.
The problem I am having is that if a font isn't found, it is a hard stop. I added "--ignorestdttfonts=off" and that helps a little bit. Here's what I'm using for my command string:
demonorm -efi --ignorestdttfonts=off -r0 -P ICCProfiles\ -B ".\Settings\PDFA1b 2005 RGB.joboptions" +n -O -O c:\NormalizerOutput InputPsFile.ps
I am using /Times-Roman in my PostScript file, and I have times.ttf as an installed font, but I am getting this error:
%%[ Error: Times-Roman not found. Font cannot be embedded. ]%%
So I have 2 questions:
1) Given a TTF file, how do I know exactly what font name to use for Adobe Normalizer?
2) How do I substitute a font when a font is not found? The default is to use Courier, but that doesn't seem to be happening. I explicitly added "--allowdefaultfont=on --defaultfont=Courier" but it had no effect.
Adobe Normalizer will always look for the PostScript font name. Unfortunately, unlike other Adobe products, it will not create a font list (.lst) file where you can easily see the "FontName" and the location of the font file on your system. So you will have to find other ways to get the exact PostScript font name.
An easy way to see PostScript font names is to open the Distiller settings file (joboptions) using Distiller. You want to use Distiller because you want this to open in a GUI. In the left side panel select "Fonts", and under Embedding > Font Source select C:\WINDOWS\Fonts\ (assuming the TTF file is installed there). The window below will list all the fonts in the C:\Windows\Fonts location. The fonts are listed by their PostScript font names. Note that PostScript font names do not contain spaces. Look for the Times or Times Roman font that you have installed.
This does get quite tricky with Times or Times-Roman. Is your PostScript file (the input file) referring to a Type 1 or a TrueType font?
In the Normalizer documentation (p174) the Times or Times New Roman family appears as:
Times
Times-Bold
Times-BoldItalic
Times-Italic
Times-Roman
Times New Roman
TimesNewRomanPS-BoldItalicMT
TimesNewRomanPS-BoldMT
TimesNewRomanPS-ItalicMT
TimesNewRomanPSMT
Note that Times and Times New Roman are not the same fonts. Times is a Type 1 font, and if you do not have this font installed on your system then Normalizer is correct in its error "Times-Roman not found. Font cannot be embedded."
Hope this information helps.
Adobe Normalizer, as I understand it (I don't have a copy), is essentially a server version of Acrobat Distiller. It accepts PostScript as input and delivers PDF files.
So there are several possibilities:
1) Normalizer cannot use TrueType fonts installed on the server. From your description that doesn't seem to be the case, as you say that --ignorestdttfonts 'helps a little bit' (it might be useful to know what improves...)
2) Because the missing font is Times-Roman, it's simply not embedding the font because it doesn't need to. The 'base 14' fonts are assumed to be available in any PDF consumer, and they don't need to be included. To be honest, this seems like the most likely explanation, as I would have thought that Adobe would ship the base 14 fonts with Normalizer.
3) The TrueType font isn't available to Normalizer. You haven't said how you installed times.ttf. Did you just install it in the OS (and what OS are you using, anyway?) or did you add it to Normalizer in some fashion?
4) You may (as you think) have the font name incorrect. The problem is that you cannot use TrueType fonts directly in PostScript. In order to be used in a PostScript program they have to be converted into Type 42 fonts. It may be that Normalizer simply can't do that. Do you have any reason to think it can? If it can, then it may require a TrueType 'post' table, which is optional and may not be present in your font. However, the font name would be the same as the TrueType font name. The times.ttf I have is called "Times New Roman" and is in fact an OpenType font. If you want to use a font name with spaces you will have to make a string and convert it to a name:
(Times New Roman) cvn findfont
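For example, a minimal sketch, assuming the interpreter really does have a (Type 42 or OpenType) font registered under that exact name:
%!PS
% Look the font up by its spaced name, converted from a string to a name object
(Times New Roman) cvn findfont 20 scalefont setfont
72 720 moveto
(Hello in Times New Roman) show
showpage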
If you want to check the operation of the default font, I would suggest using a font name which is not one of the base 14, e.g.:
%!PS
/NoSuchFont findfont 20 scalefont setfont
10 10 moveto
(Hello World) show
showpage
Run that through Normalizer and see what comes out as the font. It may well be that it simply leaves the font request in place of course.
Finally, since this is a commercial product I assume you are entitled to support; wouldn't it be simpler just to ask Adobe?

The 14 standard PDF fonts and character encoding

I'm having difficulty producing PDFs that make use of the 14 standard PDF fonts. Let's use Times-Roman as an example.
I create a Font dictionary of type Type1, with BaseFont set to Times-Roman. If I omit the Encoding entry from the Font dictionary, or add an Encoding dictionary without a BaseEncoding set, the PDF viewer application should use the font's built-in encoding. For Times-Roman, this is AdobeStandardEncoding.
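Concretely, the two variants I'm describing look roughly like this (the object numbers are only illustrative):
5 0 obj    % variant 1: no Encoding entry, so the built-in encoding is used
<< /Type /Font
   /Subtype /Type1
   /BaseFont /Times-Roman
>>
endobj
6 0 obj    % variant 2: Encoding dictionary with no BaseEncoding, only Differences
<< /Type /Font
   /Subtype /Type1
   /BaseFont /Times-Roman
   /Encoding << /Type /Encoding
                /Differences [ 128 /Euro ] >>   % maps code 128 to the Euro glyph
>>
endobj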
This works fine for ASCII characters. However, something more exotic like the 'fi' ligature (AdobeStandardEncoding code 174) is not displayed correctly by all PDF viewers:
Adobe Reader shows ® (unicode index 174) for Times-Roman and Ă for Times-Italic
SumatraPDF (wine) shows ® for both fonts
Mozilla's PDF.js shows the 'AE' ligature for both fonts
All other PDF viewers I've tried display the 'fi' ligature properly. They also display the € symbol correctly, which is additionally mapped using the Differences array in the Encoding dictionary (because it is not included in AdobeStandardEncoding):
Apple Preview/Skim
GhostScript
PDF-XChange Viewer (wine)
Foxit Reader (wine)
Chromium's internal PDF viewer
Evince (homebrew)
Opening Adobe Reader's Document Properties window shows:
Times-Roman
Type: Type1
Encoding: Custom
Actual Font: Times-Roman
Actual Font Type: TrueType
I suspect the fact that a TrueType font is being used instead of a Type1 font might be related to the problem. The PDF specification says:
StandardEncoding: Adobe standard Latin-text encoding. This is the built-in encoding defined in Type 1 Latin-text font programs (but generally not in TrueType font programs).
It also says WinAnsiEncoding and MacRomanEncoding can be used with TrueType fonts. So should we avoid using the built-in encoding or StandardEncoding for the standard 14 fonts? Its effect seems to be undefined. It seems Adobe Reader doesn't bother performing a proper mapping from glyph names to glyphs in the TrueType font being used.
Will providing a Differences array when using the Win or Mac encodings produce proper results? Since these map code points to Type1/PostScript glyph names, there is no direct link to TrueType glyphs.
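In other words, would a hypothetical variant like the following (not taken from my actual file) reliably map code 174 to the 'fi' glyph?
<< /Type /Font
   /Subtype /Type1
   /BaseFont /Times-Roman
   /Encoding << /Type /Encoding
                /BaseEncoding /WinAnsiEncoding
                /Differences [ 174 /fi ] >>   % override the slot WinAnsi assigns to the registered sign
>>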
EDIT: Hmm, I have a feeling the FontDescriptor Flags might be important for these standard fonts. I have set the flags to 4 for all fonts up to now, which seemed to work fine for True/OpenType fonts.
It turns out the Flags entry in the FontDescriptor dictionary is important. For Times, the Nonsymbolic flag (bit 6) needs to be set. The fact that Times is actually being typeset using a TrueType font has nothing to do with it.
To use the built-in encoding of the font, the Encoding entry of the Type1 Font dictionary should not be set. You may only add an Encoding dictionary (with BaseEncoding omitted) if it contains a non-empty Differences array, or Adobe Reader will report an error.
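As a sketch of what that means for the FontDescriptor (the object number and metric values below are illustrative; the point is the Flags value, where bit 6 corresponds to adding 32):
7 0 obj
<< /Type /FontDescriptor
   /FontName /Times-Roman
   /Flags 34                       % 32 (Nonsymbolic, bit 6) + 2 (Serif)
   /FontBBox [ -168 -218 1000 898 ]
   /ItalicAngle 0
   /Ascent 683
   /Descent -217
   /CapHeight 662
   /StemV 84
>>
endobj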
With these precautions, the generated PDF displays correctly on all 9 viewer applications listed above.

Pdf partial font embedding with iText

I am asked to include a partial font in a PDF.
I think I will use iText, and I found how to embed the font, but I found no clue about partial embedding.
Does anybody know if partial embedding is automatic? Or maybe iText does not have this feature?
Thank you.
When does iText embed the full font, a subset, or no font?
In this answer, it is assumed that you use the BaseFont class and the Font class like this:
BaseFont bf = BaseFont.createFont(pathToFont, encoding, embedded);
Font font = new Font(bf, 12);
In this snippet:
pathToFont is the path to a font file (.ttf, .ttc, .otf, .afm),
encoding is an encoding such as "winansi", BaseFont.IDENTITY_H,...
embedded is a boolean: true or false.
Will iText embed the font or not?
That's determined by the embedded parameter:
If it is false, the font isn't embedded.
If it is true, the font is embedded, except in the case of the Standard Type 1 fonts, Type 1 fonts for which the .pfb file is missing, or CJK fonts.
Regarding the exceptions:
The Standard Type 1 fonts are 4 flavors of Helvetica (regular, bold, italic, bold-italic), 4 flavors of Times Roman (...), 4 flavors of Courier (...), Symbol and ZapfDingbats. iText ships with 14 Adobe Font Metrics (AFM) files. These files contain the metrics that are needed to calculate the widths of glyphs and words. iText doesn't ship with the Printer Font Binary (PFB) files that would be required to embed these fonts.
Type 1 fonts are stored in two files: an AFM file and a PFB file. If you provide an AFM file, iText will look for the PFB file in the same directory. If iText doesn't find any PFB file, the font can't be embedded.
CJK stands for a series of Chinese, Japanese and Korean fonts that are available in downloadable font packs. It's a special type of Asian font; Asian fonts in .ttf, .otf or .ttc files can be embedded.
Will iText subset the font or not?
iText will always try to embed a subset of the font, not the whole font, except when you provide a Type 1 font (AFM and PFB file); in that case, the full font is embedded.
Can iText embed the full font?
Yes, you can force iText to embed the full font by adding this line:
bf.setSubset(false);
However, this value will be ignored if you use the encoding Identity-H, because that is what ISO 32000-1 prescribes. iText will only embed a full font when it is stored inside the PDF as a simple font (256 characters); iText will never embed the full font when it is stored as a composite font (up to 65,535 characters).

Encode fonts in winansi with GhostPCL in PCL to PDF conversion

I'm trying to create a PDF with fonts encoded as winansi instead of custom.
The source file is PCL, and I use GhostPCL to convert it to PDF using the pdfwrite device.
The PDF is created successfully. However, the font encoding (when checked with pdffonts) is 'custom', and I want it to be 'winansi'. How can I achieve this?
You almost certainly can't do that; you certainly have no control over what pdfwrite chooses to do with the Encoding. Without seeing the input PCL file I can't comment on why the Encoding isn't winansi, but my guess would be that there is insufficient information in the incoming PCL to determine what the font encoding is, so the only alternative is to use a custom encoding.
If you are trying to make an editable/searchable PDF file from PCL input, you cannot reliably do that.