Print to a searchable, selectable PDF from an existing searchable, selectable PDF

I am trying to print a section of an existing PDF to a new PDF. The original is searchable and selectable, but the new PDF is neither. I am using Adobe Acrobat Reader DC and printing via "Microsoft Print to PDF". I am unsure whether any other information is relevant.

After searching for some time, I could not find an answer that allows a direct PDF-to-PDF print.
I did find a workaround, however.
I downloaded a free program called PrimoPDF. Once installed, PrimoPDF becomes a printer option within Adobe Acrobat Reader. I selected my desired pages and printed to PrimoPDF instead of Microsoft Print to PDF. This generated a .ps file. I then imported the .ps file into the PrimoPDF application and generated a .pdf from it. The newly generated PDF was searchable and selectable, which was exactly what I needed.
Hopefully someone else finds this useful in the future.

Generally, refrying (printing to PostScript and then converting back to PDF) is a bad idea. Microsoft Print to PDF produced a file that wasn't searchable because, when Adobe Reader detects that the target printer can't render the PDF correctly (for example, because it doesn't have the right fonts), Reader renders the PDF itself and sends an image to the printer. A simpler PDF probably would have worked just fine.
You are much better off getting a tool that will simply allow you to extract the pages you need to a new file rather than printing.
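
As a rough illustration of extracting pages instead of printing, here is a minimal Python sketch. It assumes the third-party pypdf library is installed; the function names `parse_page_ranges` and `extract_pages` and the file names in the usage note are made up for this example. Because the pages are copied rather than re-rendered, the text layer should normally stay searchable and selectable.

```python
def parse_page_ranges(spec):
    """Turn a 1-based page spec such as "3-5,8" into zero-based indices."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        first, _, last = part.partition("-")
        start = int(first)
        end = int(last) if last else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        indices.extend(range(start - 1, end))
    return indices


def extract_pages(src_path, dst_path, spec):
    """Copy the pages named in spec from src_path into a new PDF at dst_path."""
    # Assumption: pypdf is installed (pip install pypdf). It is imported here
    # so that parse_page_ranges can be used without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for index in parse_page_ranges(spec):
        # Pages are copied as-is, not rasterized.
        writer.add_page(reader.pages[index])
    with open(dst_path, "wb") as out:
        writer.write(out)
```

For example, `extract_pages("original.pdf", "section.pdf", "3-5,8")` would write pages 3, 4, 5 and 8 of a hypothetical `original.pdf` to `section.pdf`.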

Related

PDF with fillable form fields + "Save As" = PDF with fixed text - how?

I'm creating a PDF using iTextSharp, and it contains some fillable form fields. When those form fields are filled in and the resulting PDF is saved (in one of the commercially available PDF readers, such as Adobe's Reader), I need the form fields to become fixed text (no longer editable).
Is there any way to do this?
As a comment suggests, this sounds like "flattening the document".
The issue with that process is that it is not available in (Adobe) Reader; it would require Acrobat, or server-side help.
On the other hand, some mobile PDF viewers do actually offer flattening when saving.
The workaround for Reader is to set the fields to read-only when saving the document. You would do this in the willSave Document Action by looping through the fields and setting them to readonly.
Simply print your document as a PDF. This will flatten the file.
To do this:
1. Install Adobe PDF Printer, CutePDF, or a similar tool.
2. In your document, select File -> Print.
3. Select the tool you installed in step 1 as the printer.
4. Your document will now be a flattened, non-editable PDF.

itextsharp joined pdf size too large

I have an ASP.NET (C#) web application.
I joined some PDF files together on the server side with iTextSharp.
The resulting PDF is too large (>1 MB) to be downloaded in our environment.
I checked the font list in Adobe Reader, and the Verdana font is embedded 10 times. That could be the problem, but I don't know why it happens. I use iTextSharp's PdfCopy for merging.
If I capture the file with Fiddler and then print it with the Bullzip PDF printer, the size is halved (500 KB).
I can't figure out what the Bullzip PDF printer does to reduce the size.
All the joined PDFs are mostly text, with only a couple of small images.
Interestingly, I can copy and paste text from the original PDF, but not from the Bullzip-printed version (I get only rectangles when I paste). That is not a problem, because I don't need to edit or search the text; I only need to print it from the browser.
I need a .NET library that does the same thing to the PDF before I send it to the browser.
Can anyone help me?

Fix PDF with unreadable characters

Example PDF page: https://db.tt/qRcF000k
This is a sample page from a document in which copied text shows up as question marks, both in my favorite reader, SumatraPDF (MuPDF), and in Adobe Acrobat. My main problem is that, because of this, I can neither search the document nor index it.
OTOH, xpdf's pdftotext extracts correct text.
In Adobe Acrobat if I use "Copy as formatted text", correct text is written to clipboard, although I still can't search from Acrobat.
Also if I open the linked page in Firefox's built-in PDF reader I can correctly copy the text.
Can Ghostscript perhaps be instructed to correct this issue, which I can't describe other than as 'unreadable characters'?
The PDF file uses subset fonts with non-standard Encodings and no ToUnicode CMaps. So no, you can't have Ghostscript 'correct' this file.
In fact, I can't see how anything can possibly be extracting sensible text from this. My versions of Acrobat (Pro X and Reader XI) can't copy meaningful text and don't appear to have a 'Copy as formatted text' menu item. Can you tell me where to find it?
However, I notice that the PDF file was actually created by Ghostscript (version 9.14), so possibly you mean 'starting with a different input file, which I haven't given you, could I have generated a PDF file where the text could be copied?' To that I can only say 'I don't know'; it depends on what was in the original input file.

Is it not possible to print a pdf from a hyperlink?

I have looked for weeks and I keep hitting dead ends. I know you can create a text or image link and tell it to "print page" in a browser. But so far, I can't get it to print a document, specifically a pdf. I would like the print dialog to show after the link is clicked and yes, the pdf linked to has been printed.
Why does this seem to be such an impossible feat? I have seen it work in a Flash movie, but since I cannot access the native file I cannot see how it was done.
Any advice?
Thanks.
Many of today's printers support direct PDF printing. Lexmark, HP, and Xerox, to name a few, all have this on most of their 'business' printers. On these devices, simply sending the PDF file directly to the device over LPR, port 9100, or some other mechanism will result in a printed document. Some devices even support URLs. I do know that Lexmark had some devices to which a URL could be sent; as long as the printer had access to the URL, it would pull the document and print it. In that case it supported basic HTML, JPEG, TIF, and PDF.
Hope this helps.
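
To make the port 9100 idea concrete, here is a minimal Python sketch using only the standard library. It assumes the printer accepts raw PDF data on TCP port 9100 (not every device does); the function name `send_pdf_raw` and the address in the comment are made up for illustration.

```python
import socket


def send_pdf_raw(host, pdf_bytes, port=9100, timeout=10.0):
    """Send raw PDF bytes to a printer's raw TCP port and return the byte count.

    Assumption: the device interprets PDF natively. A printer that does not
    will typically print garbage or ignore the job.
    """
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("data does not look like a PDF")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(pdf_bytes)
        # Signal end of job by closing the sending side of the connection.
        conn.shutdown(socket.SHUT_WR)
    return len(pdf_bytes)


# Hypothetical usage (192.0.2.10 is a placeholder address):
#     with open("document.pdf", "rb") as f:
#         send_pdf_raw("192.0.2.10", f.read())
```

Note that this runs on a machine that can reach the printer, not inside a web page, so it does not by itself make a hyperlink in a browser print a PDF.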
A PDF must be rendered as an image before it can be printed. Usually, when you print a PDF file on your desktop, you can simply right-click the file and select Print. If Adobe Reader or an alternative application is set as your default PDF viewer, the selected PDF is opened automatically (at this stage the PDF is rendered as an image), and then the printing process begins.
But if there is no access to a PDF viewer that can render the PDF and then print it, then you won't be able to print the PDF. Usually if you have Adobe Reader, Foxit Reader, etc, installed then when you click on a URL to a PDF then the PDF will open within the PDF viewer within the browser and you will be able to print it.
Alternatively, you could find a PDF SDK that silently renders a PDF as an image and then sends that to the printer, without the need to have a PDF viewer installed on your machine.

difference between microsoft report viewer and adobe pdf reader tools?

I would like to display a PDF on my WinForm and am thinking of using one of those tools in my VB.NET application. Does anyone know the difference between the two?
Microsoft Report Viewer reads report definition files and displays the report. Adobe's PDF reader displays PDF files.
Report definition files != PDF files, so you would need to make sure that you use the right tool for the right job. If you need to read PDFs, use a PDF reader.
As for consuming a PDF on a WinForm, you could host a WebBrowser control and point to the PDF. Alternately, there are several WinForm control manufacturers that read and display a PDF file (though I've not used any of them so would not be able to recommend one over another). Examples would be:
http://www.tallcomponents.com/
http://www.skysof.com/