Remove font using iText - pdf

I have a problem with my PDFs. They include a default font, Helvetica, which is unused, and I need to develop a script that automatically deletes it.
After asking two questions:
Unset PDF font with script
xhtml2pdf doesn't embed Helvetica
I finally discovered iText. I've been working with this library and ran a few successful tests with unembedding fonts (example), but I can't find anything about deleting a whole font from the PDF.
Thanks a lot

Related

How to handle PDF font fallback perfectly when generating it?

I'm currently working on a project that converts an HTML canvas to PDF. The user can select a font, draw text on the canvas, and export it as a (vector) PDF. The problem is that the user can enter text in a language the selected font doesn't support. It displays fine on the canvas because the browser applies its font fallback mechanism (presumably falling back to a system font), but in the exported PDF the text is corrupted. I've embedded the font in the PDF, but the font has no corresponding glyphs, and PDF readers such as Adobe's have no font fallback mechanism, so every such character becomes .notdef.
I have two ideas, but neither is really satisfying.
1. Collect all glyphs from each sentence and create a new font
Walk through each character and check whether the current font has a corresponding glyph. If so, add it to the new font's glyph list; if not, take the glyph from an alternative font in the font stack (#1) and add that instead. Finally, build a new font from the list and embed it in the new PDF.
It sounds good, but in practice generating the new font is terribly slow.
(I was using Opentype.js to load and write the new font; exporting it with the toArrayBuffer method took 10 minutes for 6,000 words.)
#1: The font stack is a list like ['Crimson Text', 'Pt Sans', 'Noto Sans']. If the first font has no corresponding glyph, we try the next one, and give up once we reach the end.
2. If any character is missing, change the font-family of that sentence to Arial Unicode MS or Noto
It's pretty simple, but it converts every word in the sentence to Arial Unicode MS or Noto. Besides, it's hard to find a single font that covers the glyphs of most languages, and we can't use the font stack mechanism because we can use only one font per sentence.
My goal is to make the exported PDF look like the canvas the user drew. I'm hoping someone can give me some direction 😥. Many thanks.
The usual solution would be to embed every font in your stack, including the Noto fallback (all suitably subsetted, preferably), and switch between them mid-word as required.
Building a new frankenfont from the fonts as you suggest is not required, though I admire the ambition!
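To make the "switch between them mid-word" idea concrete, here is a minimal sketch of splitting a string into per-font runs. It is not tied to any PDF library: the coverage sets below are made-up placeholders (in a real exporter they would come from each font's cmap table), and pick_font and split_into_runs are hypothetical helper names.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical coverage tables: (font name, set of supported code points).
# These ranges are invented for illustration, not the real fonts' coverage.
FONT_STACK: List[Tuple[str, Set[int]]] = [
    ("Crimson Text", set(range(0x20, 0x7F))),
    ("Pt Sans", set(range(0x20, 0x7F)) | set(range(0x0400, 0x0500))),
    ("Noto Sans", set(range(0x20, 0x7F)) | set(range(0x0370, 0x0400))),
]


def pick_font(ch: str, stack: List[Tuple[str, Set[int]]]) -> Optional[str]:
    """Return the first font in the stack that covers ch, or None."""
    for name, coverage in stack:
        if ord(ch) in coverage:
            return name
    return None


def split_into_runs(
    text: str, stack: List[Tuple[str, Set[int]]]
) -> List[Tuple[Optional[str], str]]:
    """Group consecutive characters that resolve to the same font."""
    runs: List[Tuple[Optional[str], str]] = []
    for ch in text:
        font = pick_font(ch, stack)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

Each run can then be drawn with its own embedded font, so only the characters the first font lacks fall through to a later one. A run whose font is None means nothing in the stack covers those characters.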

open a PDF file with automatically replaced Fonts

I am not a programmer, but a normal user who uses Linux.
I want to use Ghostscript to DISPLAY PDF files, not to CREATE PDF files. (I have never used Ghostscript until now.)
But I want Ghostscript to automatically replace all fonts with other fonts when I open the PDF. No matter if the fonts are embedded or not.
With which fonts should the fonts be replaced?
Answer: I want to create a list of fonts, that I want to be available for replacement.
But which of these fonts on the list should be used?
Answer: The one that best matches the metric of the font to be replaced.
Is it possible to do this somehow?
You can't get Ghostscript to do what you are asking. If a PDF file contains fonts, Ghostscript will use those fonts; it will only substitute if it cannot find an embedded font.
The reason for this is simple: the font embedded in the PDF file is the correct font. Its metrics are correct, and the mapping from character code to the appropriate glyph selector in the font will be correct.
It's also a non-trivial problem to select from a list of fonts the one which 'best matches the metrics of the font to be replaced'. What characteristics should be considered? How should those be determined?
When a font is not embedded, Ghostscript will consult its own list of fonts and CIDFonts. Both of these lists can be customised; the documentation is here
But since a substitute font is always going to be a compromise, you can't tell Ghostscript not to use the embedded fonts in a PDF. Well, technically you could, by modifying the PDF interpreter, but you say you aren't a programmer, so I doubt you will want to try that.
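To show why 'best matches the metrics' is underspecified, here is a sketch of one naive criterion: compare average glyph advance widths and pick the candidate with the smallest difference. This is not something Ghostscript does. The font names, the width values, and both function names are made up for illustration, and the criterion ignores ascent, descent, kerning, weight and glyph shapes entirely.

```python
from typing import Dict

# Hypothetical advance widths (in 1/1000 em) for a handful of glyphs.
CANDIDATES: Dict[str, Dict[str, int]] = {
    "FontA": {"a": 556, "i": 222, "m": 833, "w": 722},
    "FontB": {"a": 444, "i": 278, "m": 778, "w": 722},
}


def width_distance(target: Dict[str, int], candidate: Dict[str, int]) -> float:
    """Mean absolute difference in advance width over the shared glyphs."""
    shared = target.keys() & candidate.keys()
    if not shared:
        # Nothing to compare, so treat the candidate as infinitely far away.
        return float("inf")
    return sum(abs(target[g] - candidate[g]) for g in shared) / len(shared)


def closest_by_width(
    target: Dict[str, int], candidates: Dict[str, Dict[str, int]]
) -> str:
    """Name of the candidate whose widths are closest to the target's."""
    return min(candidates, key=lambda name: width_distance(target, candidates[name]))
```

Even this toy version has to make arbitrary choices (which glyphs to compare, how to weight them), which is exactly the difficulty described above.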

Is there any way to prevent losing text when converting a PDF to a PNG when using <CFPDF>?

Using the following code to generate thumbnails from PDFs (ColdFusion 8):
<cfpdf
    action="thumbnail"
    source="#LOCAL.PathToMyPDF#"
    destination="#LOCAL.ImageDestination#"
    format="png"
    scale="100"
    resolution="high"
    overwrite="true"
    pages="1" />
Sometimes it works great and generates a beautiful PNG representation of the first page. However, many times, it ends up creating a PNG with none of the text that's in the PDF, or with the text mangled or background images out of arrangement.
Is there any way to prevent this? I'm open to using a non-commercial Java library, if necessary.
Without looking into this too deeply, I would think you are having a font problem.
Try running that bit of code with the parameter nofonts="true" (which removes font styling) and see whether you get your text (unstyled).
If that works, then you may need to register your fonts in ColdFusion (so ColdFusion has access to the fonts library). If you are not sure which fonts your PDF uses, open File > Properties and click the Fonts tab to see them.
Check this link for more explanation on ColdFusion and fonts.
Again, I am not sure about your server and font setup because it wasn't mentioned in your post, so this is my best guess for you...
:)

How to bold a text in PDF?

I'm developing a new function for "my" program. This function writes PDF files the simple way: it builds a plain text file containing some PDF-standard codes.
I'm still trying to understand how it works, but my first problem is how to apply bold to a line of my document.
I've already downloaded the PDF REFERENCES GUIDE, but I haven't found anything about it.
Any ideas?
PDF is not like HTML, where you can apply formatting tags for emphasis. As you've read in the PDF reference, all that you do in PDF is set up a graphics environment (colours used, fonts used, etc.) and then put text on the page.
If you want to have something show in bold, use a font that is bold. If you want to have something show in italic, use a font that is italic.
Older software used dirty tricks to create "bold-alike" text, but the good (and easy) way to do it is to make sure you select the correct font before you start drawing text.
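As a rough illustration of "select the correct font before you start drawing text", the sketch below builds a one-page PDF by hand. The page declares two font resources, /F1 (Helvetica) and /F2 (Helvetica-Bold), and the content stream switches between them with the Tf operator. The build_pdf function, the resource names, the page size and the sample strings are all my own choices for the example; both fonts are standard Type 1 fonts, so nothing is embedded.

```python
def build_pdf() -> bytes:
    """Return a one-page PDF with one regular line and one bold line."""
    content = (
        b"BT\n"
        b"/F1 14 Tf\n"                   # select the regular font at 14 pt
        b"72 720 Td\n"                   # move to the first line's position
        b"(This line is regular.) Tj\n"
        b"/F2 14 Tf\n"                   # switch to the bold font
        b"0 -20 Td\n"                    # move down 20 pt
        b"(This line is bold.) Tj\n"
        b"ET\n"
    )
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R /F2 5 0 R >> >> /Contents 6 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica-Bold >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"endstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))         # byte offset needed by the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"      # mandatory free entry for object 0
    for off in offsets:
        out += b"%010d 00000 n \n" % off # each xref entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


if __name__ == "__main__":
    with open("bold_demo.pdf", "wb") as fh:
        fh.write(build_pdf())
```

The only thing that makes the second line bold is the /F2 14 Tf operator selecting a different font resource; there is no "bold on" switch in the content stream.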

Add PDF font to JasperReport export

I am using iReport to create a series of reports. In iReport my default font is set to "SansSerif"; on my machine (Ubuntu Linux) this is actually DejaVu Sans. Ultimately the reports need to be rendered as PDF files. When a PDF is generated the text font is actually Helvetica and is causing formatting issues. Ideally the font in iReport would be the same as the PDF font. That is where my issue resides.
I have tried changing the net.sf.jasperreports.default.pdf.font.name setting to 'DejaVu Sans', but that throws an error about the font not being found. From what I understand, it is actually iText creating the PDF. Is that correct? Helvetica is embedded in the iText jar. Does the same thing need to be done for the other fonts? How does one go about that?
I have researched this and tried all kinds of things. Any ideas would be appreciated.
To install missing fonts in iReport, go to the following item in the menu bar:
Tools > Options > Fonts > Install Font
1. Add the font files, e.g. garamond.otf.
2. Add the font family details.
3. Select the locale of your country.
4. Manage the font mapping to avoid the missing font property in the OS.
5. After adding all required fonts, click Export as extension to save the extension as a jar.
6. Add this Jasperreport-font.x.x.x.jar to your project's libraries or classpath.