Show MigraDoc/PdfSharp Document on screen - pdf

I want to use MigraDoc/PdfSharp to create and store PDF documents.
Is there a way to show these documents in an application on-screen? I'd like to show the print in my program rather than starting Acrobat Reader with the document name.
I considered storing the print using XPS instead of PDF, but then I'd need a way to convert XPS to PDF for mailing it to customers. And I don't want to save the same print in two formats for space reasons.

MigraDoc can save files in its own format "MigraDoc DDL". You can preview MDDDL on the screen, create PDF or RTF from it or print it.
Disadvantage: images are not included in the MDDDL file (OTOH this can be an advantage as images can be shared between several documents).
You can ZIP document plus images for storage.
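As a rough illustration of the ZIP suggestion, here is a minimal Python sketch using only the standard library. The function name `bundle_document` and the file names are hypothetical, and the flat archive layout assumes the MDDDL file refers to its images by bare file name, which may not match your documents.

```python
import zipfile
from pathlib import Path


def bundle_document(ddl_path, image_paths, archive_path):
    """Store one MDDDL file and the images it uses in a single ZIP archive.

    Assumption: a flat archive (no subfolders) is acceptable, i.e. the
    document refers to its images by bare file name.
    """
    ddl_path = Path(ddl_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store every file under its bare name so the archive stays flat.
        archive.write(ddl_path, arcname=ddl_path.name)
        for image in map(Path, image_paths):
            archive.write(image, arcname=image.name)
    return Path(archive_path)
```

A call such as `bundle_document("invoice.mdddl", ["logo.png"], "invoice.zip")` would produce one archive per stored print. If images are shared between documents, you would keep them outside the archive instead.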
PDFsharp can create PDF files from XPS (but this is in a beta state and not fully operational).

Related

Print to pdf that is searchable and selectable from existing pdf that is selectable and searchable

I am trying to print a section of an existing PDF to a new PDF. The original is searchable and selectable, but the new PDF is neither. I am using Adobe Acrobat Reader DC and printing via "Microsoft Print to PDF". I'm unsure whether any other information is relevant.
After searching for a period of time I could not find an answer that allows for direct PDF to PDF print.
I did find a workaround however.
I downloaded a free program called PrimoPDF. Once installed, PrimoPDF becomes a printer option within Adobe Acrobat Reader. I then selected my desired pages and printed to PrimoPDF instead of Microsoft Print to PDF. This generated a .ps file. I then imported the .ps file into the PrimoPDF application and was able to generate a .pdf from it. The newly generated PDF was searchable and selectable, which was exactly what I needed.
Hopefully someone else finds this useful in the future.
Generally, refrying (printing to PostScript and then converting back to PDF) is a bad idea. Microsoft Print to PDF created a file that wasn't searchable because, when Adobe Reader detects that the target printer cannot render the PDF correctly (for any number of reasons, for example because it doesn't have the right fonts), it renders the PDF itself and sends an image to the printer. A simpler PDF probably would have worked just fine.
You are much better off getting a tool that will simply allow you to extract the pages you need to a new file rather than printing.

How to convert .inp (InPage Urdu) files to PDF

If you have an InPage Urdu file and you want to convert it to a pdf file for viewing then what should you do?
There are several ways to do this. The easiest one I found is described below:
You need MS Word or some other PDF maker that can combine multiple images into a single PDF.
I used MS Word.
First, open the file in InPage and go to File, Export Page. From this menu you can export all the pages as images. Then drag and drop these images into MS Word and save the document as PDF.
Alternatively, you can select the Print option from the File menu in InPage and send it to Adobe PDF if you have it installed.

compress pdf by c# and adobe printer

One of my friends scans a lot of document pages and saves them as a PDF.
The resulting PDF is 1 GB; when I reprint it using the Adobe PDF printer, the file shrinks to 80 MB.
I set up Adobe Acrobat X Pro to open pdfs and Adobe Acrobat X Pro sets up a virtual pdf printer for me.
The image quality in the second pdf is very good and the most important thing is the difference in filesize.
Now how can I do this in a C# program? I want to write a piece of C# code to do this automatically.
I have about 500 PDF files that are very large, and I want to reduce their size.
I need C# code that takes a file path, prints that file using the Adobe PDF printer, and returns the resulting PDF to me, or lets me set an export path for the output PDF. I tested some DLLs to do this,
for example iTextSharp, PDFSharp-MigraDocFoundation-1_32 and sharpPDF_2_0_Beta2_dll, among many others.
But these are not nice and working with them is not easy for me. I just want a method, a class, or a fast component that does this.
Please remember that we want to do this with Adobe Acrobat X Pro.
Thanks

PDF-file gets big after processing

I have a system that generates PDF files using iTextSharp and mails them to my users. The files grow in a way that is not OK.
1. I start off with a one-page Word document of 28 KB.
2. I print this one-page Word document using Adobe's printer and the PDF file comes to 73 KB.
3. I open the document in Adobe Acrobat X, insert my forms and save: 1055 KB.
4. I load the document in iTextSharp and set the 30 different values, and now my file is 2031 KB.
Are there any compression flags or tricks that can be set in iTextSharp or in Adobe to keep my file at ~73 KB? I don't add any images or other media, just text.
BR
Andreas
I've had similar fun with PDF file size growing. I've gotten form template generation down to a process where I shrink my template to the smallest size possible before I send it to iTextSharp to populate it.
In your step 3 you are adding the AcroFields for your form. After you've finished adding all your fields and saved the document, use the "Reduce File Size..." option to shrink the document a little. Then you can also use the PDF Optimizer to reduce the file size further. I personally use a workaround: I use the Create PDF from multiple files feature, add only the one document I'm working on, and then select the smaller file size option for lower-quality optimized PDFs.
Then in step 4 it depends on how you are populating and generating the PDF files. Our process takes the generated template PDFs and copies each page into a new document with the form fields filled. When copying, instead of the PdfCopy class we use the PdfSmartCopy class, which copies content to the new document but does not duplicate content that is identical. After switching to the smart copy class we saw a significant reduction in the size of the files generated by iTextSharp.
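To show why not duplicating identical content saves space, here is a toy Python sketch of content-based deduplication. This is not iTextSharp code and makes no claim about how PdfSmartCopy works internally; the class `DedupStore` is hypothetical and only illustrates the general idea of storing identical payloads once.

```python
import hashlib


class DedupStore:
    """Toy object store that keeps one copy of each distinct byte string."""

    def __init__(self):
        self._ids_by_digest = {}  # content digest -> object id
        self.objects = []         # stored payloads, indexed by object id

    def add(self, payload: bytes) -> int:
        """Return the id for payload, storing it only if it is new."""
        digest = hashlib.sha256(payload).digest()
        existing = self._ids_by_digest.get(digest)
        if existing is not None:
            return existing  # identical content already stored, reuse its id
        self.objects.append(payload)
        object_id = len(self.objects) - 1
        self._ids_by_digest[digest] = object_id
        return object_id

    def stored_bytes(self) -> int:
        """Total size of everything actually kept in the store."""
        return sum(len(p) for p in self.objects)
```

If every copied page brings along the same embedded resource, a naive copy stores it once per page, while a store like this keeps a single copy and hands back the same id each time.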
Hope this helps.

OCR within an x,y window of a pdf

I need to find an open source or Linux-based utility that allows me to set an x,y coordinate in a setup file. I would then like to open PDFs sequentially, look in the documents for first name, last name and account number, and save each file under a file name consisting of last name and file number.
You may want to read some of these answers first:
A Java Library for text extraction from PDF documents preserving empty spaces and lines
How to extract text from a PDF?
How-to extract text from a pdf doc within a specific rectangular region?
The answers above are not Linux specific.
Most PDF documents do not need to be OCR'ed as the text is contained within the PDF. The hard part is extracting it. The Java version of iText (http://itextpdf.com/) is probably the best toolkit under Linux to extract the PDF text strings. Another option may be http://pdfbox.apache.org/
If the text you need to extract is actually an image, then you will probably need to convert the whole PDF page to an image format such as TIFF and pass that into an OCR engine such as Google Tesseract OCR.
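If you go the image route, the x,y window from your setup file has to be translated into pixel coordinates of the rendered page before cropping. A minimal Python sketch of that arithmetic follows. It assumes the window is given in PDF points (1/72 inch, origin at the bottom-left of the page), that the page is not rotated and has no crop-box offset, and that the image has its origin at the top-left. The function name `pdf_rect_to_pixel_box` is made up for this example.

```python
def pdf_rect_to_pixel_box(rect_pt, page_height_pt, dpi):
    """Map (x0, y0, x1, y1) in PDF points (origin bottom-left) to
    (left, top, right, bottom) in pixels (origin top-left)."""
    x0, y0, x1, y1 = rect_pt

    def to_px(value_pt):
        # One PDF point is 1/72 inch, so pixels = points * dpi / 72.
        return round(value_pt * dpi / 72)

    left = to_px(min(x0, x1))
    right = to_px(max(x0, x1))
    # Flip the vertical axis: the top of the window is measured from the page top.
    top = to_px(page_height_pt - max(y0, y1))
    bottom = to_px(page_height_pt - min(y0, y1))
    return left, top, right, bottom


# US Letter page (612 x 792 pt) rendered at 300 dpi, window near the top-left.
print(pdf_rect_to_pixel_box((72, 648, 288, 720), 792, 300))  # (300, 300, 1200, 600)
```

The resulting box can then be used to crop the rendered page so that only that region is passed to the OCR engine.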