Adobe Reader doesn't show some texts generated by PDFKit - pdf

I have written a PDF generator with PDFKit. When I open the saved PDF file in Adobe Reader, some of the text is not shown. Other PDF viewers, including the viewers built into modern browsers, show all of the text correctly.
Do you know anything about this? Is it a known problem?
Note: All of the text uses the standard font 'Helvetica', so I don't think the font formatting is the problem. The text that is not shown is the text set as header and footer. Could the coordinates of that text be a problem for Adobe Reader?
My simple code to add these texts:
doc.text('My text', posX, posY);
Example PDF

Related

Problem showing a font with license restriction to pdf

I'm writing a program that converts a file to PDF on a Mac. The file contains
Chinese text in the font STFangsong, which has a license restriction and is not
embeddable. I tried many CMaps to encode it, but the root cause seems to be that
the PDF viewers (both Preview on the Mac and Acrobat Reader) do not recognize the
font: the PDF file properties show the Actual Font as Unknown, and a pop-up
message says the font cannot be found or created.
PDF 32000-1:2008, section 9.6.6.4, gives the guideline that the font program
should be embedded when a TrueType font is encoded, though without a specific
explanation. As I understand it, embedding guarantees that the PDF is readable
everywhere, but I do not need that: the font is licensed, and I only want the
text to be shown on my own computer.
So my question is: do these PDF viewers have limitations with CJK characters
when embedding is forbidden?
By the way, I wrote some text in this font in Microsoft Word and saved the
document as PDF, and there the font is shown as an embedded subset. Does that
mean Microsoft has bought the license?

Text changed to graphics, still selectable in PDF?

I have this PDF ebook with selectable text - the handwriting - but there is no such font embedded and the letters are all different, so it's not actually a font. How is this possible?
I've worked with CorelDraw and Adobe Acrobat, but I can't understand how this works.
The left side of the picture shows the document properties, the right side shows a page of the PDF file and I selected the last 3 rows. I can copy and paste that to a text file, no problem. How was this achieved?
There are a few possibilities, but the most likely is that the text was converted to outlines (paths, or vectors). Some software, such as Adobe InDesign and other print design apps, lets you 'flatten' font-based text into vector paths, so the original font does not need to be embedded or installed on the system. The original text data is, however, still present and can be copied into a text field or word processor.

Fill PDF form field with Hebrew text (RTL)

I tried to use iText 7 community to check if it supports filling PDF form fields in Hebrew. For some reason, I can't make it work.
Here is the code I'm using:
PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDoc, false);
form.setGenerateAppearance(true);
form.getField("test").setValue("\u05de\u05d9\u05db\u05d0\u05dc");
form.flattenFields();
pdfDoc.close();
The PDF is a blank PDF page including only one text field with the following properties:
Font Adobe Hebrew
Text direction RTL
I tried with and without flattening fields.
When fields are not flattened, after opening the resulting PDF using Acrobat Reader, I see my field but it is empty. Only after I click on the field, the content of the field appears correctly. When I view the PDF on Chrome, the field doesn't appear (or it may be there but no text inside).
When fields are flattened, after opening the resulting PDF using Acrobat Reader, the field doesn't appear at all.
I should add that I created the PDF using Acrobat DC.
Any idea what is going on here?
EDIT: The test PDF can be downloaded from here
Try creating a font explicitly (not all fonts support IDENTITY_H, but Arial does). On Windows this looks like this:
PdfFont f = PdfFontFactory.createFont("C:\\windows\\fonts\\arial.ttf", PdfEncodings.IDENTITY_H, true);
And then set the font on the field:
form.getField("test").setValue("\u05de\u05d9\u05db\u05d0\u05dc").setFont(f);
This worked for me
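Before looking at the font, it can help to confirm that the escaped value really consists of right-to-left Hebrew letters. The following is a minimal sketch that uses only the Java standard library and makes no iText calls; the class and method names are hypothetical and chosen for illustration.

```java
final class HebrewValueCheck {
    // True when every code point lies in the Unicode Hebrew block.
    static boolean isAllHebrew(String s) {
        if (s.isEmpty()) {
            return false;
        }
        return s.codePoints().allMatch(cp ->
                Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.HEBREW);
    }

    // True when every code point has strong right-to-left directionality.
    static boolean isAllRightToLeft(String s) {
        if (s.isEmpty()) {
            return false;
        }
        return s.codePoints().allMatch(cp ->
                Character.getDirectionality(cp) == Character.DIRECTIONALITY_RIGHT_TO_LEFT);
    }

    public static void main(String[] args) {
        // Same escaped value as in the question
        String value = "\u05de\u05d9\u05db\u05d0\u05dc";
        System.out.println("Hebrew block only: " + isAllHebrew(value));
        System.out.println("Right-to-left only: " + isAllRightToLeft(value));
    }
}
```

If both checks pass, the value itself is fine, and the remaining suspects are the font and encoding used to build the field appearance.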

Extract text from pdf using itextsharp returns empty string

I have a PDF file. Its text can be extracted in the Edge browser, or in Adobe Reader after installing some fonts. Please let me know how to extract the text with iTextSharp (latest 5.x version). I use the code below, but it returns empty text, although the file has 8 pages of text.
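// Requires: using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
var text = ""; // accumulator, declared here for completeness
var reader = new PdfReader(bytes);
var pages = reader.NumberOfPages;
for (int i = 1; i <= pages; i++)
{
    // Page numbers are 1-based
    var t = PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy());
    text += t;
}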
The PDF
The PDF at first glance appears to be OCR'ed by an OCR program that did not realize that the pages are rotated by 180°.
For example, the OCR program on the second page started in what a PDF viewer displays as bottom left corner:
and here recognized
epnq eoⅢ9時u ez `9P...
押印S ’句OP JuP9A...
eA I臥O9叩Od n^Z小no...
This is not that bad, e.g. epnq eoⅢ... is not really unlike the ...mce bude rotated by 180°.
The OCR software appears to have a certain affinity to CJK glyphs; this impression is reinforced by the fact that it uses fonts with an Adobe-Japan1-2 ROS and a 90ms-RKSJ-H encoding.
Text extraction
All the information above considered, though, I have some doubt that
The text can be extracted in Edge browser or in adobe reader after installing some fonts.
At least I doubt that anything similar to the actual text can be extracted, no matter how many fonts are installed. On the other hand both Adobe Reader and Edge out-of-the-box here extract the weird text recognized from the rotated letters.
iText
My observation with iText differs. While the OP reports that
Empty text is returning
I get a lot of CJK glyphs (I have added the Asian jar, though, which might make a difference). Unfortunately, though, not those found by inspection of the PDF.
As far as I remember, though, text extraction by Encoding + ROS has never been in focus during iText development up to version 5.5.x (inclusive); in particular, the mixed single-byte/double-byte encoding of 90ms-RKSJ-H might not be supported.
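To illustrate what "mixed single-byte/double-byte" means for an extractor, here is a minimal sketch that splits raw string bytes into character codes. It is not iText's implementation; the class name is hypothetical, and the lead-byte ranges are the usual Shift-JIS ones, which I assume match the codespace of 90ms-RKSJ-H.

```java
import java.util.ArrayList;
import java.util.List;

final class RksjCodeSplitter {
    // True when b starts a two-byte code (assumed Shift-JIS lead-byte ranges).
    static boolean isLeadByte(int b) {
        return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
    }

    // Splits raw string bytes into character codes, one int per code.
    static List<Integer> split(byte[] data) {
        List<Integer> codes = new ArrayList<>();
        int i = 0;
        while (i < data.length) {
            int first = data[i] & 0xFF;
            if (isLeadByte(first) && i + 1 < data.length) {
                // Two-byte code: combine lead byte and trail byte
                int second = data[i + 1] & 0xFF;
                codes.add((first << 8) | second);
                i += 2;
            } else {
                // Single-byte code (also used for a dangling lead byte at the end)
                codes.add(first);
                i += 1;
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        // One single-byte code, one two-byte code, one single-byte code
        byte[] sample = {0x41, (byte) 0x82, (byte) 0xA0, 0x42};
        for (int code : split(sample)) {
            System.out.println(Integer.toHexString(code));
        }
    }
}
```

An extractor that treats every byte as its own code, or every pair of bytes as one code, misreads such a string, which would be consistent with unexpected glyphs in the output.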

PDF cannot display Chinese fonts in table of contents

I made a PDF file from LaTeX (using TexMaker).
Acrobat Reader is able to display BOTH the text and the table of contents in Linux.
But Acrobat Reader is unable to display the table of contents in Windows XP (the Chinese characters came out as boxes). However, the text is displayed correctly.
I tried to embed the fonts into the PDF, but the various methods are not 100% successful, so I'm not sure whether the fonts are embedded correctly. In any case, the table of contents remains unreadable in Windows.
I wonder whether it is really a font embedding problem. Or do I need to install these "Adobe Reader X Font Packs":
https://www.adobe.com/support/downloads/detail.jsp?ftpID=4883
My concern is that I'd like my PDF to be readable in Windows, including the table of contents (and preferably without further installations). If this is possible...
I suspect you are talking about "bookmarks", rather than saying that part of the text in the document is OK and part is not. PDF bookmarks are part of the UI of the application and are not drawn with embedded fonts. Therefore, the system you are running on needs to know how to handle fonts in the language(s) of choice.
See https://forums.adobe.com/thread/1144972?start=0&tstart=0
Embedding the fonts will have no effect on the bookmarks.
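To illustrate why embedding does not matter here: a bookmark title is stored in the PDF as a plain text string (PDFDocEncoding, or UTF-16BE with a leading byte order mark), with no reference to any font in the file. The sketch below shows that encoding using only the Java standard library; the class name and the sample title are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class OutlineTitleBytes {
    // Encodes a title as a PDF text string: the byte order mark FE FF, then UTF-16BE code units.
    static byte[] encode(String title) {
        byte[] body = title.getBytes(StandardCharsets.UTF_16BE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFE;
        out[1] = (byte) 0xFF;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    // Formats bytes as upper-case hex, as they would appear in a PDF hex string.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical two-character Chinese bookmark title
        System.out.println(toHex(encode("\u76ee\u5f55")));
    }
}
```

The bytes carry only Unicode code points, so the viewer has to pick a font from the system (or from an installed font pack) to draw them.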