Creating Thumbnail from PDF without Adobe SDK - pdf

I've been looking for ways to generate thumbnails from a PDF, like the ones shown in Explorer. The problem is that the free Adobe version, unlike Adobe Pro, does not expose all the COM interfaces. Is there any other way? Please help.

Ghostscript (which is what ImageMagick uses) will generate images in a wide variety of image formats. If you need something really obscure, use the ImageMagick wrapper; otherwise, I prefer calling Ghostscript directly.
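As a rough illustration of the direct Ghostscript route, here is a minimal Python sketch that renders only the first page of a PDF to a small PNG. It assumes Ghostscript is installed and reachable as gs (on Windows the console binary is usually gswin64c). The function names, the file names and the 36 dpi default are placeholders I picked for the example, not anything from the question.

```python
import subprocess


def ghostscript_thumbnail_command(pdf_path, png_path, dpi=36, gs_binary="gs"):
    """Build the Ghostscript argument list that renders page 1 as a PNG.

    A low resolution (dpi) gives a small image suitable as a thumbnail.
    """
    return [
        gs_binary,
        "-q",                 # suppress startup messages
        "-dSAFER",            # restrict file access from within the PDF
        "-dBATCH",            # exit when the input has been processed
        "-dNOPAUSE",          # do not wait for a key press between pages
        "-sDEVICE=png16m",    # 24-bit colour PNG output device
        f"-r{dpi}",           # output resolution in dots per inch
        "-dFirstPage=1",      # render only the first page
        "-dLastPage=1",
        f"-sOutputFile={png_path}",
        pdf_path,
    ]


def make_thumbnail(pdf_path, png_path, dpi=36, gs_binary="gs"):
    """Run Ghostscript; raises CalledProcessError if the conversion fails."""
    command = ghostscript_thumbnail_command(pdf_path, png_path, dpi, gs_binary)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    make_thumbnail("document.pdf", "thumbnail.png")
```

Swapping -sDEVICE=png16m for -sDEVICE=jpeg should give a JPEG instead, if that suits the thumbnail cache better.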

If you can afford a commercial option, you could use Amyuni PDF Creator ActiveX for this task (or the .NET version, if that suits your needs better). With this product you can create JPG/PNG/BMP images from the first page of your PDF files at a specified resolution, and then use them as thumbnails.
Disclaimer: I am part of the development team of this product.
Here are other SO questions proposing other approaches (not involving COM):
Using ImageMagick from the command line
Thumbnail of a PDF page (Java)

Related

Include custom fonts in PDF

I have a question about generating PDFs with wkhtmltopdf. I know it's possible to use custom fonts in my HTML, but I think the operating system used to view the PDF must have these fonts installed. Correct?
My question is whether it's possible to include these fonts in the PDF itself, so that when the PDF is generated I can send it to a print office to print 50 copies, and they see the PDF exactly as I do without having these fonts installed.
This is certainly possible.
It's called "embedding a font" in PDF lingo.
Most PDF generation libraries should support this.
PDF comes in different flavors (standards). One of them, PDF/A, is meant for long-term storage (the A stands for archiving). The idea is that the document's look and feel should be preserved as much as possible. To achieve this without depending on the operating system (and whatever fonts it ships with), PDF/A requires that the fonts be embedded.
https://en.wikipedia.org/wiki/PDF/A
I don't know how to do this in the library you are using. But I do know it's possible with iText.
This is a great tutorial on it, which aside from giving you more information about iText, will also illustrate the problem with custom fonts in a very visual way.
https://developers.itextpdf.com/tutorial/using-fonts-pdf-and-itext
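If you want to verify that a generated PDF really carries its fonts, a quick check is possible. The following is only a sketch, in Python with the pypdf package (my choice for illustration, not something mentioned above). It relies on the fact that an embedded font program is referenced from the font descriptor under /FontFile, /FontFile2 or /FontFile3. It looks only at page-level font resources, so fonts used inside form XObjects are missed, and Type 3 fonts are not handled. The helper names and the file name are hypothetical.

```python
# Keys under which a font descriptor references an embedded font program.
EMBEDDED_KEYS = ("/FontFile", "/FontFile2", "/FontFile3")


def descriptor_has_font_program(descriptor):
    """True if the font descriptor dictionary points at an embedded font program."""
    return any(key in descriptor for key in EMBEDDED_KEYS)


def font_is_embedded(font):
    """Best-effort check for a single font dictionary."""
    # Composite (Type0) fonts keep the descriptor on their descendant font.
    if "/DescendantFonts" in font:
        return all(font_is_embedded(d.get_object()) for d in font["/DescendantFonts"])
    # No descriptor at all: typically one of the standard 14 fonts, not embedded.
    # (Type 3 fonts also land here; this sketch does not treat them specially.)
    if "/FontDescriptor" not in font:
        return False
    return descriptor_has_font_program(font["/FontDescriptor"])


def report_fonts(pdf_path):
    """Map each font name found in page resources to True (embedded) or False."""
    from pypdf import PdfReader  # third-party; assumed to be installed

    result = {}
    for page in PdfReader(pdf_path).pages:
        if "/Resources" not in page:
            continue
        resources = page["/Resources"]
        if "/Font" not in resources:
            continue
        fonts = resources["/Font"]
        for name in fonts:
            font = fonts[name]
            label = str(font["/BaseFont"]) if "/BaseFont" in font else str(name)
            result[label] = font_is_embedded(font)
    return result


if __name__ == "__main__":
    # Hypothetical file name, used only for illustration.
    for font_name, embedded in report_fonts("output.pdf").items():
        print(font_name, "embedded" if embedded else "NOT embedded")
```

Any font reported as not embedded is one the print office would need to have installed to see the same result.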

API for PDF Library

I need to build a small PDF library that will display many catalogs. The user should be able to view a document and page through it, but not download or share it in any way, somewhat like Google Books (here is an example).
I have in mind something like the Google Drive API or some kind of Scribd API, but I don't know whether either of those will work. I would like to know if there are more options for this kind of application, or whether the ones mentioned above will do the job.
Edit: Forgot to mention, all this done in a web browser.
In principle all you need would be the ability to render pages from a PDF file into an image. Your application (you didn't mention where you want to build this) is then responsible for displaying the images, scrolling, moving from page to page etc...
If this is correct there are multiple possible libraries that can do this:
- ImageMagick can convert PDF to images (http://www.imagemagick.org)
- Ghostscript has extensions for PDF and can convert PostScript or PDF into images and other formats (http://www.ghostscript.com)
- I'm sure there are many, many more...
There are also a number of commercial tools, for example those from Adobe (licensed through DataLogics, http://www.datalogics.com) and callas software (http://www.callassoftware.com - I'm affiliated with this company)

How to remove duplicate pages from a PDF?

I have a pdf of an ebook which has many pages duplicated, and I would like to remove them.
There are many PDF toolkits that could do this. You don't say which language or framework, so it's hard to recommend one.
You could also use Adobe Acrobat if you don't want to do it programmatically.
Disclaimer: I work for Atalasoft
Our DotImage Document Imaging Toolkit has PDF page manipulation classes. Tutorial here:
http://www.atalasoft.com/products/dotimage/white-papers/building-pdf-documents-with-dotimage
It shows adding pages, but we also support removing and reordering them.
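If you do want to do it programmatically, here is a minimal sketch in Python using the pypdf package (my choice for illustration; none of the toolkits above are involved). It assumes that "duplicate" means two pages whose extracted text is identical after whitespace normalization, which is only one possible definition. Pages with no extractable text (scans, blank pages) are always kept, because they cannot be compared this way. The function and file names are hypothetical.

```python
import hashlib


def page_fingerprint(text):
    """Hash of the page text with whitespace normalized; None if there is no text."""
    normalized = " ".join(text.split())
    if not normalized:
        return None
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def first_occurrence_indices(fingerprints):
    """Indices of pages to keep: the first page for each fingerprint.

    Pages whose fingerprint is None are always kept.
    """
    seen = set()
    keep = []
    for index, fingerprint in enumerate(fingerprints):
        if fingerprint is None:
            keep.append(index)
        elif fingerprint not in seen:
            seen.add(fingerprint)
            keep.append(index)
    return keep


def remove_duplicate_pages(source_path, target_path):
    """Write a copy of source_path without repeated pages; returns pages kept."""
    from pypdf import PdfReader, PdfWriter  # third-party; assumed to be installed

    reader = PdfReader(source_path)
    fingerprints = [page_fingerprint(page.extract_text() or "") for page in reader.pages]
    keep = first_occurrence_indices(fingerprints)

    writer = PdfWriter()
    for index in keep:
        writer.add_page(reader.pages[index])
    with open(target_path, "wb") as handle:
        writer.write(handle)
    return len(keep)


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    kept = remove_duplicate_pages("ebook.pdf", "ebook-deduplicated.pdf")
    print(f"kept {kept} pages")
```

For a scanned ebook you would need a different fingerprint, for example a hash of each page rendered to an image, since text extraction returns nothing there.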

Any way to automatically convert DWF to PDF?

Our eTendering solution, www.monaqasat.com, currently works exclusively with PDF documents for various reasons, some of them being security. We are being asked if we can support DWF documents. For this to happen, we would need to find a way to automatically convert DWF documents to PDF, using some kind of Unix application.
Does anybody know any such application, preferably using Rails or Java?
Thanks,
.Karim
http://www.autodwg.com/pdf/
http://www.dwgto.com/
http://www.aidecad.com/
http://en.wikipedia.org/wiki/List_of_PDF_software
http://www.cogniview.com/convert-pdf-to-excel/category/pdf/
My suggestion would be to install a software PDF printer, call its API to pass in the DWF file and get a PDF back, and then apply security as needed.
Autodesk has its DWF Toolkit available at
http://www.autodesk.com/dwftoolkit
It contains full source code in C++ to read & write DWF files, so it should be reasonably easy to make it run under Linux and to use a PDF library to write the output.

How to convert Word and Excel documents to PDF programmatically?

We are developing a little application that, given a directory of PDF files, creates a single PDF file containing all of them. This is a simple task using iTextSharp. The problem appears when the directory also contains other files, such as Word or Excel documents.
My question is: is there a way to convert Word and Excel documents into PDF programmatically? And even better, is this possible without having the Office suite installed on the computer running the application?
Office 2007 allows for this. I have found PDFCreator to be good (sample VBA is included with it), and I have heard that CutePDF is also good. PDFCreator and CutePDF are free.
To work without Office, you would need viewers, as far as I know:
http://www.microsoft.com/downloads/details.aspx?FamilyID=c8378bf4-996c-4569-b547-75edbd03aaf0&displaylang=EN
http://www.microsoft.com/downloads/details.aspx?familyid=95E24C87-8732-48D5-8689-AB826E7B8FDF&displaylang=en
I needed to do this myself, but managed to get it done with .Net and without 3rd party tools:
MSDN: Saving Word 2007 Documents to PDF and XPS Formats
It's pretty simple, about 50 lines of code. However, I think you will need Word 2007 installed on the machine, as well as its Save As PDF capability.
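For comparison, the same Word automation idea can be sketched in Python instead of .NET. This is not the MSDN sample, just a minimal sketch under some assumptions: Windows, Word 2007 or later with PDF export available, and the pywin32 package installed. The value 17 is Word's wdFormatPDF constant; the helper names and the file path are hypothetical.

```python
import os

WD_FORMAT_PDF = 17  # numeric value of Word's wdFormatPDF constant


def pdf_path_for(doc_path):
    """Absolute path of the PDF to write next to the given Word document."""
    root, _extension = os.path.splitext(os.path.abspath(doc_path))
    return root + ".pdf"


def word_to_pdf(doc_path):
    """Open the document in Word via COM and save it as PDF; returns the PDF path."""
    import win32com.client  # from pywin32; only available on Windows

    output_path = pdf_path_for(doc_path)
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Word needs absolute paths when driven through COM.
        document = word.Documents.Open(os.path.abspath(doc_path))
        try:
            document.SaveAs(output_path, FileFormat=WD_FORMAT_PDF)
        finally:
            document.Close(False)  # close without saving changes to the source
    finally:
        word.Quit()
    return output_path


if __name__ == "__main__":
    # Hypothetical file name, used only for illustration.
    print(word_to_pdf(r"C:\docs\report.docx"))
```

Like the .NET version, this still requires Word on the machine, so it does not answer the "without Office installed" part of the question.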
To convert Word documents to PDF, take a look at jWordConvert, a Java library that can do exactly that. It will not work with the Excel files, though, only with the Word files. The language is Java rather than C#, but you could switch to iText (which is Java) instead of iTextSharp.
You can also use a component like activePDF's DocConverter to convert a lot of formats to PDF.
Use the PDF maker that comes with Adobe 7-9.
I just used this code: Convert Doc to PDF
I'm surprised Aspose wasn't mentioned here; it's easy, simple, and reliable. The downside is that it is not free.
I've used iTextSharp in the past. It's really good and easy to install (one DLL, I believe). The merge takes a bit of tinkering, so it's not as easy to use as Aspose, but hey, it's free, and that is the best part.
TallPDF.NET (comes with a hefty price tag) allows you to serve dynamic PDF from any .NET application including ASP.NET pages and web services.
PDFEdit (free and open source) is an editor for manipulating PDF documents. It has a GUI version and a command-line interface. Scripting is used to a great extent in the editor and almost anything can be scripted. It is possible to create your own scripts or plugins.
The most common way to convert files to PDF is to print them to a PDF printer driver. There are a number of such drivers; one that I know will do the job is Black Ice.
Another is to use Adobe Acrobat's SDK. From memory, it's very expensive.
It's been a while since I have actually done any work with converting PDFs, and the landscape may have changed.