How to convert InDesign IDML to TIFF? - automation

I have a requirement to take IDML files provided by a client, tweak them to fill in some placeholders, and generate a TIFF file. This needs to happen automatically, and I have InDesign Server at my disposal.
I have the first part down. I have also found out how to connect to InDesign Server via SOAP and convert IDML files to hi-res PDF or low-res JPG (this implies a few other options).
I am at a bit of a loss as to how to take it the rest of the way and generate a TIFF file, and the Adobe forums have not been much help. It is my impression that this sort of thing is exactly why the IDML format was introduced, so I'm assuming there's decent support out there for it. The best I've been able to come up with so far is this chain: IDML to PDF (or SVG) via InDesign Server, then to PNG via the Inkscape command line, then to TIFF via System.Drawing. That seems horribly contrived and fault-prone, and I have no idea how I'm going to handle multiple pages.
Any ideas?

I don't believe there is a way to export to TIFF via InDesign Server. However, I did find a post on the Adobe Forums that suggests using Photoshop to render the TIFF after exporting a PDF from IDS. Maybe that would be an option? Otherwise, maybe you could use one of the formats that you can export to (i.e. JPG, PDF, EPS).
Hope this helps!

For reference, I ended up using Ghostscript to achieve the results.
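The exact Ghostscript invocation isn't given above, so here is a minimal sketch of one way to do the PDF-to-TIFF step, assuming the PDF has already been exported from InDesign Server. The function names, the 300 dpi default, the `page-%03d.tif` naming and the `gs` binary name are assumptions for illustration (on Windows the console binary is usually `gswin64c`). The `tiff24nc` device writes 24-bit RGB TIFFs, and the `%03d` in the output name makes Ghostscript write one file per page, which covers the multi-page case.

```python
import subprocess
from pathlib import Path


def build_tiff_command(pdf_path, out_dir, dpi=300, gs_binary="gs"):
    """Build a Ghostscript command that renders each PDF page to its own TIFF."""
    # Ghostscript expands %03d to the page number, so a multi-page PDF
    # produces page-001.tif, page-002.tif, and so on.
    out_pattern = str(Path(out_dir) / "page-%03d.tif")
    return [
        gs_binary,
        "-dSAFER",            # restrict file access while rendering
        "-dBATCH",            # exit when the input has been processed
        "-dNOPAUSE",          # do not wait for a keypress between pages
        "-sDEVICE=tiff24nc",  # 24-bit RGB TIFF output device
        f"-r{dpi}",           # rendering resolution in dpi
        f"-sOutputFile={out_pattern}",
        str(pdf_path),
    ]


def pdf_to_tiff(pdf_path, out_dir, dpi=300):
    """Run Ghostscript; raises CalledProcessError if the conversion fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_tiff_command(pdf_path, out_dir, dpi), check=True)
```

Calling `pdf_to_tiff("layout.pdf", "tiff_out")` would then leave one TIFF per page in `tiff_out`, provided Ghostscript is installed and on the PATH.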

Related

Using Extracttext from cfpdf to get text of a pdf in coldfusion

Currently, I have a PDF that is not searchable, and I am wondering what the best process is for preparing the file for ColdFusion so I can index it.
In particular, I am wondering whether a PDF file needs to be readable before using extracttext in cfpdf to pull the text from it.
I really appreciate the advice, and I hope it helps other people who are interested in indexing PDF files with ColdFusion.
I was considering extracting the text with Tesseract, as suggested here:
Performing Optical Character Recognition on PDF's from ColdFusion using a Java or .NET Library?
However, if there is a built-in feature in ColdFusion, I would much rather use that, and I think it would be helpful to other people to know whether ColdFusion can handle this task natively.

Creating Thumbnail from PDF without Adobe SDK

I've been looking for ways to generate thumbnails from PDFs, like those shown in Explorer. The problem is that without Adobe Pro, the free version does not expose all the COM interfaces. Is there any other way? Please help.
Ghostscript (which is what ImageMagick uses) will generate images in a wide variety of image formats. If you need something really obscure, use the ImageMagick wrapper; otherwise, I prefer the straight dope.
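As a rough sketch of the Ghostscript route for thumbnails: render only the first page, at a low resolution, to PNG. The function names, the 36 dpi default and the `gs` binary name are assumptions for illustration; `-dFirstPage`, `-dLastPage` and the `png16m` device are standard Ghostscript options.

```python
import subprocess


def build_thumbnail_command(pdf_path, png_path, dpi=36, gs_binary="gs"):
    """Build a Ghostscript command that renders page 1 of a PDF as a small PNG."""
    return [
        gs_binary,
        "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-dFirstPage=1", "-dLastPage=1",  # only the first page is rendered
        "-sDEVICE=png16m",                # 24-bit colour PNG output device
        f"-r{dpi}",                       # low resolution keeps the image small
        f"-sOutputFile={png_path}",
        str(pdf_path),
    ]


def make_thumbnail(pdf_path, png_path, dpi=36):
    """Run Ghostscript; raises CalledProcessError if rendering fails."""
    subprocess.run(build_thumbnail_command(pdf_path, png_path, dpi), check=True)
```

At 36 dpi a US Letter page comes out at roughly 306 x 396 pixels; lowering `dpi` gives a smaller thumbnail.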
If you can afford a commercial option, you could use Amyuni PDF Creator ActiveX for this task (or the .NET version if that suits your needs better). Using this product you can create JPG/PNG/BMP images from the first page of your PDF files at a specified resolution, and then use them as thumbnails.
Disclaimer: I am part of the development team of this product.
Here are other SO questions proposing other approaches (not involving COM):
Using ImageMagic in command line
Thumbnail of a PDF page (Java)

HowTo extract embedded OCR data from a PDF?

I have PDF files with embedded OCR data (I have already OCRed them), so they are searchable. Now I want to extract this OCR data because I want to put it in my tomcat6 search server. To do this, I need the plain OCR data.
So my question is: is it possible to extract this embedded OCR data from the PDF files?
It would be nice to get files with coordinates. But it would also be sufficient to get plaintext files.
You should be able to do this with iText or iTextSharp. iTextSharp has no documentation, however, and a good number of its functions are not equivalent to those found in iText.
PDFSharp does not support iref streams. Those are pretty much the only comprehensive open-source solutions. If you do not mind paying, vista solutions may have something for you; they mostly handle workflow, but they have some pretty extensive PDF libraries as well.

Can I use google API to convert a PDF into PNGs?

I have noticed that when you view PDFs in Google Docs, the PDF viewer renders the PDF file into PNG images.
I was wondering if you could use the Google Data API to upload a PDF and get the URLs of the rendered PNG files?
I have never used the Google API or really had the extra time to learn it, but if it helps me do this it will be well worth the extra time.
No, you can't do it.
Google explicitly does not allow it:
Downloading PDFs and arbitrary files
Native PDF files cannot be exported in a format other than .pdf.
I doubt it. I think it would be easier and more stable to use ImageMagick or some other library to handle this. Converting PDFs to images using ImageMagick is just one CLI command.
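For illustration, a minimal sketch of that one command wrapped in Python. It assumes ImageMagick 7, where the binary is `magick` (ImageMagick 6 uses `convert` instead), and that Ghostscript is installed, since ImageMagick delegates PDF rendering to it. The function names, the 150 dpi density and the output pattern are assumptions.

```python
import subprocess


def build_magick_command(pdf_path, out_pattern="page-%03d.png",
                         density=150, magick_binary="magick"):
    """Build an ImageMagick command that writes one PNG per PDF page."""
    # -density must come before the input file so that it sets the
    # resolution at which the PDF is rasterised.
    return [magick_binary, "-density", str(density), str(pdf_path), out_pattern]


def pdf_to_pngs(pdf_path, out_pattern="page-%03d.png", density=150):
    """Run ImageMagick; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_magick_command(pdf_path, out_pattern, density), check=True)
```

The equivalent shell command would be `magick -density 150 input.pdf page-%03d.png`.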

Any way to automatically convert DWF to PDF?

Our eTendering solution, www.monaqasat.com, currently works exclusively with PDF documents for various reasons, some of them being security. We are being asked if we can support DWF documents. For this to happen, we would need to find a way to automatically convert DWF documents to PDF, using some kind of Unix application.
Does anybody know any such application, preferably using Rails or Java?
Thanks,
.Karim
http://www.autodwg.com/pdf/
http://www.dwgto.com/
http://www.aidecad.com/
http://en.wikipedia.org/wiki/List_of_PDF_software
http://www.cogniview.com/convert-pdf-to-excel/category/pdf/
A suggestion would be to install a software printer, call its APIs, pass in the DWF and get back a PDF, and then apply security as needed.
Autodesk has its DWF Toolkit available at
http://www.autodesk.com/dwftoolkit
It contains full source code in C++ to read & write DWF files, so it should be reasonably easy to make it run under Linux and to use a PDF library to write the output.