Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
Can we use the iText library to convert a multi-page PDF to a multi-page TIFF?
You may convert PDF to images using one of the following tools:
Ghostscript (cross platform, see this answer)
ImageMagick (cross platform, based on Ghostscript; command line: convert -density 300 image.pdf image.tif)
xPDFRasterizer (cross platform, commercial, from authors of xPDF)
ByteScout PDF Renderer SDK (Windows only, commercial - Disclosure: I am related to ByteScout)
Also, consider the following when converting PDF to TIFF images:
PDF rendering is a heavy task that requires a lot of memory, especially if you are rendering high-quality output (300 DPI or higher).
Generating TIFF files is also quite a heavy task, so consider converting each page of the PDF file into a separate file and then merging all pages into the final TIFF file.
You may want to keep the original PDF files in addition to the generated TIFF images: PDF is the actual replacement for the TIFF format and is better because it is searchable, may contain both an image representation and the original text on top, gives a much smaller file size and, by using embedded fonts, provides scalable high-quality text printing.
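As a rough illustration of the Ghostscript route, the sketch below builds the command line from Python. The function names, the default device (tiff24nc) and the file names are assumptions made for this example, and the executable is assumed to be on the PATH as gs (on Windows it is usually gswin64c). As far as I know, the TIFF devices write one file per page when the output name contains a %d placeholder and a single multi-page TIFF when it does not.

```python
import subprocess


def build_gs_tiff_command(pdf_path, out_pattern, dpi=300, device="tiff24nc"):
    """Return a Ghostscript argument list that rasterizes a PDF to TIFF.

    out_pattern such as "page-%03d.tif" gives one file per page;
    a plain name such as "all.tif" gives one multi-page TIFF.
    """
    return [
        "gs",                       # assumed executable name
        "-dNOPAUSE", "-dBATCH",     # run non-interactively and exit when done
        "-dSAFER",
        f"-sDEVICE={device}",       # tiff24nc = 24-bit colour, uncompressed
        f"-r{dpi}",                 # resolution in DPI
        f"-sOutputFile={out_pattern}",
        str(pdf_path),
    ]


def convert_pdf_to_tiff(pdf_path, out_pattern, dpi=300, device="tiff24nc"):
    """Run Ghostscript; raises CalledProcessError if the conversion fails."""
    cmd = build_gs_tiff_command(pdf_path, out_pattern, dpi, device)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_gs_tiff_command("input.pdf", "page-%03d.tif")))
```

Rendering page by page keeps peak memory lower, at the cost of a separate merge step afterwards.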
Closed. This question is not about programming or software development. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 11 hours ago.
I need help with PDF export in Adobe Illustrator. I have many small PNG files saved from lost vector line drawings.
The problem is that any image file (JPEG or PNG alike) is scaled up when saved to PDF in Adobe Illustrator. Although these PNG images are as wide as 1920 pixels, they are only about 50 KB each. But when I export a PDF file, it becomes almost 9 MB. Is there a way to configure Illustrator's PDF export so that it does not alter the image type or size, but embeds the image as is? I also don't want the image linked to an outside file, but embedded as is, without growing to 9 MB.
I looked through the PDF settings, but there is no option to embed images without scaling them up. Even setting the resolution to 72 ppi doesn't make a big difference.
Closed 2 years ago.
I'm looking for a tool to display code examples in a PDF file. I mean that I would like to colorize and indent code in my PDF (it's for lessons).
I'm not able to find anything on the web or on Stack Overflow. The results are full of tutorials on using code to make PDFs, not on displaying code in a PDF. When I search for 'display', I get results on how to display PDFs in web pages or applications.
Sorry to disappoint you, but:
There is no such thing as you are looking for!
If you want code samples on a PDF page to be syntax highlighted, you must look for a tool that does this within the source document from which the PDF file was generated.
There is no tool in the world, neither Free and Open Source Software, nor commercial payware, that lets you edit a PDF and convert the source samples on its pages into properly syntax highlighted parts. (The only thing you can possibly do on this level is adding specific comments -- here you have to manually highlight specific words or sentences with a background color of your choice.)
If you are looking for a toolchain that makes it easy to generate PDFs from scratch containing syntax-highlighted code samples, look at:
Markdown: a very lean text markup language to write the document in (use any text editor you like)
Pandoc: a powerful Markdown-to-anything converter. It's a command line tool available for all major OS platforms. Its output may be PDF, HTML, EPUB, LaTeX (all of the previous with syntax highlighting), as well as ODT, DOCX, DocBook (no syntax highlighting supported so far for the last few) and a few more...
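To make the Markdown plus Pandoc toolchain concrete, here is a minimal sketch driven from Python. The lesson text, file names and function names are made up for this example. It assumes pandoc is on the PATH and that a LaTeX engine is installed, which Pandoc needs for PDF output by default. The code sample uses a tilde fence with a language name, which Pandoc highlights in its output.

```python
import subprocess
from pathlib import Path

# Hypothetical lesson content. The ~~~python fence marks the block's language.
LESSON_MD = """\
# Lesson 1: Loops

~~~python
for i in range(3):
    print(i)
~~~
"""


def build_pandoc_command(md_path, pdf_path):
    """Return the pandoc argument list: Markdown in, PDF out."""
    return ["pandoc", str(md_path), "-o", str(pdf_path)]


def render_lesson(md_path="lesson.md", pdf_path="lesson.pdf", text=LESSON_MD):
    """Write the Markdown source, then ask pandoc to turn it into a PDF."""
    Path(md_path).write_text(text, encoding="utf-8")
    subprocess.run(build_pandoc_command(md_path, pdf_path), check=True)


if __name__ == "__main__":
    print(" ".join(build_pandoc_command("lesson.md", "lesson.pdf")))
```

The same Markdown source can be converted to HTML or EPUB by changing only the output file extension.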
Closed 7 years ago.
I'd like to embed a PDF file viewer in a window of my planned-to-be open-source application. I don't want to release my application under the GPL though, and most open-source PDF libraries are under the GPL (Poppler, Ghostscript, MuPDF).
Is there a PDF viewer library under a non-viral open-source license?
Thanks,
It seems that there is a new BSD-licensed contender: PDFium.
IANAL. Blah blah blah.
Using GhostScript by shelling out to a command line will not require you to change your licensing in any way. Batch files used to call GhostScript aren't automagically GPL'ed.
With GPL, I'd always understood that it boiled down to "Separate process? Separate license!".
So you just have GS render a relatively high-DPI version of the PDF page in question, and let the user pan and zoom around in that. Because GS is in a separate process, you could fire off additional page requests in the background so the user won't perceive a delay when paging back and forth. GS takes a page range as one of its conversion parameters.
What you couldn't do is generate an image of a small part of an individual PDF page at high DPI/zoom. IIRC, you have to render the whole page.
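A minimal sketch of that separate-process approach, assuming the Ghostscript executable is on the PATH as gs. The function names, the PNG output template and the default resolution are choices made for this example. The page range is passed with -dFirstPage and -dLastPage, and each neighbouring page is rendered in its own background process.

```python
import subprocess


def build_gs_page_command(pdf_path, page, out_png, dpi=200):
    """Return a Ghostscript argument list that renders one page to a PNG."""
    return [
        "gs",                       # assumed executable name
        "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=png16m",          # 24-bit colour PNG
        f"-r{dpi}",
        f"-dFirstPage={page}",      # page range: a single page here
        f"-dLastPage={page}",
        f"-sOutputFile={out_png}",
        str(pdf_path),
    ]


def prefetch_pages(pdf_path, pages, out_template="page-{:04d}.png", dpi=200):
    """Start one Ghostscript process per page without waiting for any of them.

    Returns a dict mapping page number to its Popen handle, so the viewer
    can call wait() or poll() on a page when the user navigates to it.
    """
    procs = {}
    for page in pages:
        cmd = build_gs_page_command(pdf_path, page, out_template.format(page), dpi)
        procs[page] = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
    return procs


if __name__ == "__main__":
    # Hypothetical input file, for illustration only.
    print(" ".join(build_gs_page_command("doc.pdf", 3, "page-0003.png")))
```

A viewer showing page 3 might call prefetch_pages("doc.pdf", [2, 4]) so the adjacent pages are ready before the user asks for them.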
If your application is open source and free, then you should consider hosting the Adobe Reader ActiveX control (which requires Adobe Reader to be installed). It behaves the same way as Adobe Reader embedded in the Internet Explorer or Firefox browsers.
A lot of users already have Adobe Reader or Foxit Reader installed on their computers.
Closed 7 years ago.
We have a bunch of documents in our organization that were inadvertently saved as Adobe PDF packages (also known as PDF 1.7 "collections").
We would like to convert these to normal PDFs (most of these "packages" contain one bog-standard PDF file), but given the number of files, doing it manually is not feasible.
Does any Adobe expert know:
Is there an open-source or free library that handles the PDF package format that I can write a script around?
Does Adobe Pro 9 have a relevant scriptable interface that would allow me to extract the relevant file from each package?
Alternatively, I'm looking at a macro-based approach, but I'd rather not go this route until investigating other options.
Thanks!
After a bunch of digging around, I found pdftk, which is distributed as source and binary on many platforms.
It does almost all of what we need to do, and we can now iterate through our documents and recursively call pdftk on each (some are multi-level attachment chains).
Note pdftk will only burst pages of the visible document into individual documents. The hidden documents remain hidden.
The option you need to use is unpack_files.
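For the multi-level attachment chains mentioned above, a recursive wrapper around pdftk's unpack_files operation could look roughly like this. It assumes pdftk is on the PATH. The function names, the "_files" directory naming and the depth limit are made up for this example, and the output directory is assumed to be empty beforehand, since everything found in it is treated as extracted.

```python
import subprocess
from pathlib import Path


def build_unpack_command(pdf_path, out_dir):
    """Return the pdftk argument list that extracts attachments into out_dir."""
    return ["pdftk", str(pdf_path), "unpack_files", "output", str(out_dir)]


def unpack_recursively(pdf_path, out_dir, run=subprocess.run, max_depth=5):
    """Unpack attachments, then unpack any extracted PDFs in turn.

    run is injectable so the logic can be exercised without pdftk installed.
    max_depth guards against unexpectedly deep or cyclic chains.
    Returns the paths extracted at the top level.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    run(build_unpack_command(pdf_path, out_dir), check=True)

    extracted = sorted(out_dir.iterdir())
    if max_depth > 1:
        for item in extracted:
            if item.is_file() and item.suffix.lower() == ".pdf":
                # Each nested PDF gets its own subdirectory for its attachments.
                nested_dir = out_dir / (item.stem + "_files")
                unpack_recursively(item, nested_dir, run, max_depth - 1)
    return extracted


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_unpack_command("package.pdf", "unpacked")))
```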
Yet another unwanted obfuscation format to hinder interoperability therefore classified as malware.
Using Adobe Acrobat Professional, combine everything into one PDF and then split by bookmark level.
I understand this thread is a few years old, but if anyone is looking for a free utility to extract files from PDF packages (especially from large collections), then check the free utility ByteScout PDF Multitool: it was tested against 500+ MB package files to extract hundreds of multilevel chained attachments.
Disclaimer: I'm affiliated with ByteScout
Closed 8 years ago.
I want to write a tool that helps me search PDF/CHM/DjVu files on Linux. Any pointers on how to go about it?
The major problem is reading/importing data from all these files. Can this be done with C and shell scripting?
Tracker ships with Ubuntu 8.04 -- it was a significant switch from Beagle which users believed was too resource (CPU) intensive and didn't yield good enough results. It indexes both pdf and chm and according to this bug report it also indexes djvu.
Note that DjVu is an image compression format (optimized to compress 'pictures of text', typically the results of scanning). As such, you won't be able to search for text, except in the metadata (this is what the link sent by cdleary refers to), or if you first use OCR on the document to convert it into text.
The same is true for PDFs whose content is scanned articles or books.
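For files that do carry a text layer, one way to approach the reading/importing problem is to shell out to existing extractors and search their output. The sketch below assumes pdftotext (from Poppler or Xpdf) and djvutxt (from DjVuLibre) are installed. The function names and the extension table are made up for this example, and CHM is left out. A scanned document without an OCR text layer simply yields no text, so nothing matches.

```python
import subprocess
from pathlib import Path

# Maps a file extension to a function that builds the extractor command.
# Both tools print the extracted text to standard output.
EXTRACTORS = {
    ".pdf": lambda p: ["pdftotext", str(p), "-"],   # "-" means write to stdout
    ".djvu": lambda p: ["djvutxt", str(p)],
}


def extractor_command(path):
    """Return the command that extracts text from path, or None if unsupported."""
    builder = EXTRACTORS.get(Path(path).suffix.lower())
    return builder(path) if builder else None


def file_contains(path, needle):
    """Return True if the file's extractable text contains needle (case-insensitive).

    Raises FileNotFoundError if the extractor tool is not installed.
    """
    cmd = extractor_command(path)
    if cmd is None:
        return False
    result = subprocess.run(cmd, capture_output=True, text=True, errors="replace")
    return result.returncode == 0 and needle.lower() in result.stdout.lower()


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(extractor_command("book.pdf"))
```

Extracting once and feeding the text to an indexer is likely to scale better than re-running the extractors for every query.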
How about a plugin for Beagle?
It already searches PDFs, and you can add other file types.
Here is the relevant Wikipedia page: http://en.wikipedia.org/wiki/Beagle_(software)