Is there a tool or trick to store a large PDF image as a file (BMP/PNG/etc)? - pdf

This is not strictly a programming question, but it is related to a programming task: I need to do this in order to make an iPhone app.
I have a PDF file with a large image (say, a campus map) which I want to store as a PNG image to include as a resource in the app. The image itself is much larger than the screen area (about 4000x4000 px), so I cannot just take a single screenshot of the PDF and save it as a PNG. The only way I know to accomplish this is to take a number of screenshots of different parts of the image and manually stitch them together in an image editor. There will be 8-10 images to stitch together, if not more.
I wonder if anyone knows a more efficient way of doing this? Acrobat PDF reader does not allow this. Are there any tools or tricks in either Windows or MacOS I can use? Googling this did not bring anything that works.

It would also be an option to use the PDF directly. iOS has pretty good support for reading PDFs; see the ZoomingPDFViewer sample code from Apple for an example.
As for your actual question, I'm not sure if there are existing tools that do exactly what you want here (though I'd guess there are), but it would also be pretty easy to make a small Cocoa command-line tool that converts a PDF to a number of bitmap tiles using Core Graphics.

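You could use Ghostscript to convert your PDF to a PNG.
A command like
gs -sDEVICE=png16m -r600 -o my_Map.png my_Map.pdf
would produce a PNG from the PDF. The -r option sets the rendering resolution in dpi, so raise or lower it to get the pixel dimensions you need.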
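To pick a -r value for a target size such as the roughly 4000 px mentioned in the question, here is a minimal Python sketch. It assumes the page width is known in PDF points (72 per inch by default); the function names and the 612 pt page width are hypothetical, and only the Ghostscript options already shown above are used.

```python
import math
import shlex

POINTS_PER_INCH = 72  # default PDF user-space units per inch


def dpi_for_width(page_width_pt, target_width_px):
    """Smallest integer dpi that renders the page at least target_width_px wide."""
    # pixels = points / 72 * dpi, so dpi = pixels * 72 / points
    return math.ceil(target_width_px * POINTS_PER_INCH / page_width_pt)


def gs_command(pdf_path, png_path, dpi):
    """Build the Ghostscript argument list from the answer with a chosen -r value."""
    return ["gs", "-sDEVICE=png16m", f"-r{dpi}", "-o", png_path, pdf_path]


if __name__ == "__main__":
    # Hypothetical example: a page 612 pt wide rendered at least 4000 px wide.
    dpi = dpi_for_width(612, 4000)
    print(shlex.join(gs_command("my_Map.pdf", "my_Map.png", dpi)))
```

The list returned by gs_command can be passed to subprocess.run if Ghostscript is installed and on the PATH.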

Related

Inkscape objects lose transparency when saved as a PDF

I've seen a number of issues dealing with saving Inkscape drawings as anything other than an SVG, but I haven't seen a discussion specifically about transparent objects in PDFs. What's happening is that when I export a PNG, any transparent object looks fine, but if I save it as a PDF or EPS, the transparency is lost.
I've created an example which you can see at this link ( http://imgur.com/a/ieVuu )
I've looked at a lot of other posts and feel like the explanation to this is layered within the responses but I'm a beginner and can't read between the lines to understand it. I wanted to just ask why this is happening and what can be done about it directly?
Use the alpha channel to set the transparency instead of the Opacity option.
Typically this is an issue with the pdf printing program that you are using. Perhaps it's a configuration issue, perhaps it's just not advanced enough to handle it.
Try out several other programs that can print to pdf and see if you can find one that works for your use case.

Get text from a pdf in NSString

I am trying to make an iOS app which would extract plain text from a PDF file and display it in a UITextView. It is not simply a PDF reader for viewing a PDF file; I would later like to perform certain operations on that text.
I have already googled a lot but have not been able to find an exact solution.
I already tried using https://github.com/zachron/pdfiphone
but the files use the ARMv6 architecture, which seems to be obsolete with Xcode 4.5.
If anyone can suggest some exact and non-confusing code using the Quartz 2D framework of iOS, that would be great.
Here is some sample code for extracting text from a PDF; I hope it helps.
https://github.com/zachron/pdfiphone
This is a library to get the text out of a PDF on the iPhone.
There is another demo that uses OCR technology; see the link below.
https://github.com/nolanbrown/Tesseract-iPhone-Demo
Also check this page of the Quartz 2D Programming Guide; it covers everything you need to open and parse a PDF file in iOS. Note that it is not a simple task, since there is no method that extracts the full text in one line. You have to work with the data as an input stream, using a CGPDFScanner.
Two other libraries:
https://github.com/KurtCode/PDFKitten/
https://github.com/mobfarm/FastPdfKit
This question comes up all the time. It is very hard to extract text from PDF in general. The PDF specification was not designed with text extraction in mind. There are many libraries that try to do the job, essentially by reconstructing the text from the geometric placement of the individual glyphs. These libraries have varying degrees of success, but all of them will fail on certain PDF documents. In fact, some PDF documents have glyphs but no way to associate a glyph with a character. For these documents it is simply not possible to extract text, short of using some kind of OCR approach.
PDF is designed as a read-only format that is portable in the sense that a PDF document will be rendered identically on any platform. That is what it is best at, and what it should be used for.
If text is to be edited, do not use PDF.
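To illustrate what "reconstructing the text from the geometric placement of the individual glyphs" involves, here is a rough Python sketch. It assumes each glyph has already been mapped to a character and given an (x, y, width) position, which is exactly the step that fails for some PDFs; the tuple layout, the function name, and the tolerance values are hypothetical and not taken from any of the libraries mentioned.

```python
from typing import List, Tuple

# (x, y, width, character); y grows upward, as in PDF user space
Glyph = Tuple[float, float, float, str]


def reconstruct_text(glyphs: List[Glyph], line_tol: float = 2.0,
                     gap_factor: float = 0.3) -> str:
    """Group glyphs into lines by y, order them by x, add spaces at wide gaps."""
    lines: List[List[Glyph]] = []
    # Visit glyphs top to bottom, then left to right.
    for g in sorted(glyphs, key=lambda g: (-g[1], g[0])):
        if lines and abs(lines[-1][0][1] - g[1]) <= line_tol:
            lines[-1].append(g)  # same baseline, within tolerance
        else:
            lines.append([g])    # start a new line
    out = []
    for line in lines:
        line.sort(key=lambda g: g[0])
        text = line[0][3]
        for prev, cur in zip(line, line[1:]):
            gap = cur[0] - (prev[0] + prev[2])
            if gap > gap_factor * prev[2]:
                text += " "      # a wide gap is guessed to be a word break
            text += cur[3]
        out.append(text)
    return "\n".join(out)
```

Even this toy version has to guess at line grouping and word breaks, which is why real extractors give different results on different documents.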
Here (Extracting text from pdf using objective-c) I found an answer to your question, and it works, but not as well as I need it to :(
It can extract only ASCII.
It returns only one paragraph.
Good luck.

Script to Cut Adobe Illustrator File into Tiles

I'm creating a Custom Google Map based on an image in an Adobe Illustrator file. I need to cut the file into 256px x 256px PNGs to feed into the Google Maps API.
You can write scripts to automate tasks in Illustrator using ExtendScript, a modified version of JavaScript. I found one example of a script for Photoshop that makes tiles for Google Maps (Hack #68 in this book) but I haven't figured out how to port this over to Illustrator.
The main problem is I can't figure out how to tell Illustrator to isolate 256px x 256px portions of an image. The Photoshop script does this by selecting portions of the image of that size and copying them into a new file, but as far as I know you can't do that in Illustrator.
Any ideas?
I've got no experience writing scripts for Adobe products, but since Illustrator handles vector data, the tiling algorithm is slightly different. There is a Python script for MS VisualEarth that tiles a set of GPS points (demo), maybe you can take some ideas from it.
Another choice may be to (programmatically?) render the .AI files to .PNG or something similar and then cut the result into 256x256 px tiles using that PS hack you referenced.
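Once the artwork has been rendered to a bitmap, the tiling itself is simple arithmetic. Here is a small Python sketch that only computes the crop rectangles; the function name is hypothetical, and the boxes use the (left, upper, right, lower) order that Pillow's Image.crop expects, on the assumption that a library like that would do the actual cropping.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower)


def tile_boxes(width: int, height: int, tile: int = 256) -> Iterator[Tuple[int, int, Box]]:
    """Yield (column, row, box) for every tile covering a width x height image."""
    for row, top in enumerate(range(0, height, tile)):
        for col, left in enumerate(range(0, width, tile)):
            # Edge tiles are clipped to the image; pad them afterwards if
            # every tile must be exactly tile x tile pixels.
            yield col, row, (left, top, min(left + tile, width), min(top + tile, height))
```

For example, a 4000x4000 px image yields 16 columns by 16 rows, with the last column and row narrower than 256 px.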

Is there a programmatic way to convert PDFs to images using Adobe's PDF renderer?

We are trying to set up automated regression testing of our generated PDFs by converting them to images and then using the Python Imaging Library to test the difference, pixel by pixel, between new and old versions. Right now, the only step that isn't automated is converting the PDFs to images. I know there are ways to convert PDFs to images with other rendering engines (e.g. postscript), but since we're doing precise pixel-by-pixel comparisons we want to make sure that we are using Adobe's PDF renderer to generate the image. Is there a way to do this with Adobe's renderer?
Have a look at Ghostscript - http://www.ghostscript.com/
Also have a look at the PDF tools from TallComponents - http://www.tallcomponents.com/
You can use Acrobat programmatically; however, this may be against their licensing, and as far as I remember it was much slower than Ghostscript.
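Whichever renderer produces the images, the pixel-by-pixel comparison described in the question can be sketched in plain Python as below. This is only an illustration, assuming both renderings are available as raw, equally sized RGB byte buffers (for instance from an imaging library's raw-bytes export); the function name and the tolerance parameter are hypothetical.

```python
from typing import List, Tuple


def diff_pixels(old: bytes, new: bytes, width: int, height: int,
                channels: int = 3, tolerance: int = 0) -> List[Tuple[int, int]]:
    """Return (x, y) of every pixel whose channels differ by more than tolerance."""
    expected = width * height * channels
    if len(old) != expected or len(new) != expected:
        raise ValueError("buffers must be width * height * channels bytes long")
    changed = []
    for i in range(0, expected, channels):
        if any(abs(old[i + c] - new[i + c]) > tolerance for c in range(channels)):
            pixel = i // channels
            changed.append((pixel % width, pixel // width))
    return changed
```

A non-zero tolerance may help if small anti-aliasing differences between runs should not count as regressions.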

Best way to generate a custom document?

I am working on generating a document for printing. It should use a specific TTF font and everything must be printed with vector graphics (for quality). Some of the text should be replaced automatically (e.g. current time). Also it should include a custom-generated EPS image with a chart.
Ideally I would like to have some kind of document template where the text could be replaced easily, and it would be nice if it could import the image by path. But I am not sure which format would be good for this. The best I can think of is LaTeX, but I don't like that it's a lot of manual work to use it with TTF... any other ideas?
By the way, I am using OS X...
The memoir package is very flexible for special layouts.
XeTeX can use your system fonts (it is installed together with TeX Live).
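As a rough illustration of the template idea with memoir and XeTeX, here is a Python sketch that fills a small LaTeX template with the current time, a font name, and a chart path. The placeholder names, the font, and the file name are hypothetical; \setmainfont comes from the fontspec package, and the result is assumed to be compiled with xelatex (whether the EPS chart is included cleanly depends on the TeX installation).

```python
from datetime import datetime
from string import Template

# $font, $timestamp and $chart are placeholder names chosen for this sketch.
TEMPLATE = Template(r"""\documentclass{memoir}
\usepackage{fontspec}
\usepackage{graphicx}
\setmainfont{$font}
\begin{document}
Generated at $timestamp.

\includegraphics[width=\linewidth]{$chart}
\end{document}
""")


def render(font: str, chart: str, now: datetime) -> str:
    """Fill the template with a font name, a chart path, and a timestamp."""
    return TEMPLATE.substitute(
        font=font,
        chart=chart,
        timestamp=now.strftime("%Y-%m-%d %H:%M"),
    )


if __name__ == "__main__":
    # Hypothetical font and file name; write the output to a .tex file and run xelatex on it.
    print(render("Gill Sans", "chart.eps", datetime.now()))
```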
You could blend most of those elements into an EPS using ImageMagick or GIMP Script-Fu.
There are several products out there that will build a PDF for you programmatically. I've only used the ColdFusion Report Builder myself, and that may not be practical or affordable for your application. If your budget allows, I'd look into a commercial reporting product. I know Adobe has several that will generate Flash, FlashPaper or PDF output.