Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
I have a PDF file that I want to make responsive so that it can be viewed on desktops as well as mobiles. By responsive I mean that it should not only fit the page to the device size; the content inside the PDF (text, images) should also adapt when viewed on a mobile. Just like the image shown below, the PDF content should be aligned based on the device. Is there any API or library to achieve this?
Thanks in advance. Please help me to achieve this.
As indicated by other answers, PDF's primary function is to be a visual representation of content, and that visual representation should typically be identical across different platforms, readers and devices. That was the goal of the file format, and it is diametrically opposed to file formats such as XML that are all about structure.
However, in recent years PDF has gained additional functionality that may help with this. PDF files now support tagging, whose purpose is to add structure to the file. A properly tagged PDF file knows where the paragraphs of text are and which elements are headers, lists and so on. In theory, that information can be used to support (limited) responsiveness.
For example, see the link here (https://helpx.adobe.com/acrobat/using/reading-pdfs-reflow-accessibility-features.html) where Adobe explains how the reflow view in Acrobat Pro works. It states that Acrobat can use the tagging structure inside a PDF file (or even automatically create some semblance of tagging on the fly for documents that are not tagged) to give you a view of the PDF file that adjusts itself to the available display size.
Whether this works depends mostly on the reader technology you use on your mobile device. You should certainly not confuse it with full responsiveness, where content is hidden, replaced, adjusted or repositioned, as you can accomplish with HTML and CSS on web sites.
But it is a start.
It cannot be done. PDF is a final layout. Unlike the web page, where you are never sure what you're getting, the whole purpose of a PDF is to look the same no matter what device, or even medium, you're accessing it from. It basically says, "there will be the phrase 'Hello, World' in this font, this point size, at these x and y coordinates". You might as well try to reflow a hardcover book to fit into your pocket better.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 8 years ago.
I recently bought an Epson scanner so I can start digitizing a mountain of documents I've accumulated over the years. I've already learned how to scan documents into PDFs. However, I want to make sure my PDFs have searchable text. I think the technical term is OCR, but I'm thoroughly confused.
I can scan files into PDFs using my scanner alone. But if I understand correctly, I can't make them OCR searchable unless I make Adobe Acrobat and/or ABBYY Fine Reader part of the workflow. (I'm using a Mac running Mavericks, by the way.)
I guess the first thing I need to ask is this: What software do I need for creating a PDF that's OCR searchable? Like I said, I already have the Epson scanner software installed, but it looks like I also need Acrobat and/or ABBYY Fine Reader.
I guess a second question I should ask is how do I know if a PDF has searchable text? Could I simply search for a word or phrase on a PDF page with a standard program like Dreamweaver or Apple's Spotlight? Thanks.
The scanner produces an image and saves it either in an image format or as a PDF. You then open the result in OCR software, such as ABBYY Fine Reader. You can also open it in Acrobat, which has OCR components built in. If you use Acrobat, you end up with a searchable document, unless Acrobat was unable to locate any readable characters. Other OCR software may save a PDF or another file format.
Another product has been mentioned in another answer; I don't know it, but it might be worthwhile having a look at it.
For the second question:
a) There is an Acrobat JavaScript Doc object method, getPageNumWords(); if this method returns a number greater than 0, the page you passed as the argument has searchable text. You can find more information about this method in the Acrobat JavaScript documentation, which is part of the Acrobat SDK, downloadable from the Adobe website.
b) There is a preflight check that finds out whether the page or document has text objects. If so, it has searchable text. You will need Acrobat Pro for this, however.
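As a rough sketch of option a), the check could be wrapped in a small helper that loops over the pages. Only numPages and getPageNumWords() come from the Acrobat JavaScript Doc object; the helper name and the stand-in fakeDoc object are made up for illustration, so the logic can be tried outside Acrobat.

```javascript
// Sketch: list the pages of a document that contain at least one word.
// `doc` is assumed to be an Acrobat JavaScript Doc object (for example
// `this` in the Acrobat JavaScript console); pages are numbered from 0.
function pagesWithSearchableText(doc) {
  var pages = [];
  for (var page = 0; page < doc.numPages; page++) {
    // getPageNumWords returns the number of words found on the page.
    if (doc.getPageNumWords(page) > 0) {
      pages.push(page);
    }
  }
  return pages;
}

// Hypothetical stand-in for a three-page document whose second page
// is a scan without any text layer.
var fakeDoc = {
  numPages: 3,
  wordsPerPage: [120, 0, 45],
  getPageNumWords: function (page) { return this.wordsPerPage[page]; }
};

console.log(pagesWithSearchableText(fakeDoc)); // logs the page indices 0 and 2
```

Inside Acrobat you would pass the real document instead of fakeDoc. An empty result means no page has searchable text.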
You can scan to a multiple-page TIFF image and let Tesseract 3.03 create a searchable PDF for you.
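A minimal sketch of what that Tesseract call might look like when driven from a script. The trailing "pdf" argument selects Tesseract's PDF output; the function name and the file names are assumptions for illustration, and the tesseract binary is assumed to be on the PATH.

```javascript
// Build the argument list for: tesseract <input> <outputBase> -l <lang> pdf
// The trailing "pdf" asks Tesseract for a searchable PDF; Tesseract
// appends ".pdf" to the output base name itself.
function tesseractPdfArgs(inputTiff, outputBase, lang = "eng") {
  return [inputTiff, outputBase, "-l", lang, "pdf"];
}

// Hypothetical file names. The resulting array can be handed to Node's
// child_process.execFile("tesseract", args) to actually run the OCR.
const args = tesseractPdfArgs("scan.tif", "scan-searchable");
console.log(["tesseract", ...args].join(" "));
// prints: tesseract scan.tif scan-searchable -l eng pdf
```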
Most solutions are to use the scanner to generate an image file (like a nonsearchable PDF), then to move your body from your scanner over to your computer, log in, run some unwieldy outrageously priced software called ABBSGDS or something, click a ton of menu buttons, respond to a ton of dialogue boxes, twiddle your thumbs as you watch the OCR progress bar, and voila--a searchable PDF.
Or, you can get a Canon scanner (e.g. DR-M160) and use their free CaptureOnTouch software. In that case, you put a document in the scanner, choose a number on the scanner, and press scan. A few seconds later (even on a slow computer) a fully OCR'd searchable PDF will be in the directory programmed to the number you selected. You never even have to touch your computer (although it must be on, of course).
Anything else is, in my opinion, utterly worthless for a busy office environment where you are scanning dozens of multi-page documents per day. I, for example, stand by my scanner dropping in document after document in rapid succession. I never go to my computer, and all of my documents are searchable PDFs just about as fast as I can drop them in.
If anyone knows of a software solution with that kind of workflow that works with general scanners, please let me know. I just made the mistake of buying a Lexmark multifunction that, since it came with ABBYY-whatever software, is effectively a unifunction.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 5 years ago.
I am working on a project that currently takes a .tiff and compares the document in question against a defined template document. We are moving away from the .tiff format for a variety of reasons, but mainly because the new files will arrive as PDFs.
I see two potential solutions to the issue. First, convert the PDF to a .tiff and use the existing code.
Or second, use a PDF library that will compare the template PDF to the PDF that is received.
Because the received PDF will come from an outside source, we won't know for sure whether it is text-based or image-based, so the library or tool will have to be able to compare both.
Any suggestions on tools/libraries you have found helpful would be great!
Thank you in advance!
dj
How about i-net PDFC - it does a full content comparison - text, images, lines, header/footer-detection and so on. You can use it either on command line or with a GUI (2.0, currently in public beta-phase) or via API (I think we have an internal version being a .NET library).
Disclaimer: Yep, I work for the company who made this - so feedback highly appreciated.
What we ended up doing was using the Aspose.Pdf library.
I ended up learning there are two types of PDFs:
Image based and
Text based
I did not have any issues comparing the text-based PDFs. However, when an image-based PDF was received, we converted it to a .tiff so that we could use Microsoft's MODI to compare it against our specified template, and the .tiff came out as a blank image rather than the actual content of the PDF. The Aspose.Pdf library did cost some money; however, in the end it did exactly what we needed and allowed us to meet our client's needs.
I think your method of comparing TIFFs is the way to go, perhaps using ImageMagick or another library.
Converting PDF to images can also be done via ImageMagick with the help of Ghostscript.
http://www.imagemagick.org/script/compare.php
I have a C# wrapper for Ghostscript that may help; send me a mail (address on my profile) and I can send it to you.
As far as I can see from your question, you want visual comparison of 2 PDFs, not structural comparison. (Because I can create you a thousand different PDF pages, which will have different internal structures and PDF source code, but will render identically on screen or on paper.)
In this case any comparison software will have to transform the 2 PDFs into raster images and compare those.
But since you have your own code already to do that for TIFFs, you can as well re-use it for PDFs (like you are considering already) which you convert to TIFFs.
Unless you find another, external tool that is better, faster, more precise, more funky, less resource-hungry... than your own solution! -- But that one will not be able to avoid converting the PDF pages to some sort of raster image before it can start the real visual comparison. (This may happen internally and unnoticeable for the user, but nevertheless it will have to take place...)
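To make the "compare the raster images" step concrete, here is a minimal sketch of a pixel-by-pixel comparison. It assumes both pages have already been rendered to RGBA byte arrays of identical dimensions; the function name, the tolerance parameter and the tiny sample images are made up for illustration and are not taken from any particular tool.

```javascript
// Count pixels that differ between two rasterized pages.
// Both inputs are assumed to be RGBA byte arrays of the same dimensions,
// for example pages rendered at the same resolution.
function countDifferentPixels(a, b, tolerance = 0) {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("Images must have the same size and 4 bytes per pixel");
  }
  let different = 0;
  for (let i = 0; i < a.length; i += 4) {
    // A pixel differs if any channel is off by more than the tolerance.
    const differs =
      Math.abs(a[i] - b[i]) > tolerance ||
      Math.abs(a[i + 1] - b[i + 1]) > tolerance ||
      Math.abs(a[i + 2] - b[i + 2]) > tolerance ||
      Math.abs(a[i + 3] - b[i + 3]) > tolerance;
    if (differs) different++;
  }
  return different;
}

// Two 2x1 images: the second pixel differs slightly in the red channel.
const template = Uint8Array.from([255, 255, 255, 255, 0, 0, 0, 255]);
const received = Uint8Array.from([255, 255, 255, 255, 10, 0, 0, 255]);
console.log(countDifferentPixels(template, received));     // 1
console.log(countDifferentPixels(template, received, 16)); // 0
```

A small tolerance is useful when one of the PDFs is a scan, since compression noise would otherwise flag nearly every pixel.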
Docotic.Pdf library can compare PDF documents for you.
Please have a look at the "Check that two PDF documents are equal" sample.
We use this feature for regression testing of the library itself (yes, I am part of the library's dev team).
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 2 years ago.
I need to show an image that is 4000x6000 px. What are your experiences with displaying large images online?
My initial idea was to use the GMap Cutter and the Google Maps API to show the image. GMap Cutter takes an image and cuts it appropriately for use as a Google map. My problem with this approach is that the image will be changing often, so I'll need to re-cut the image often. GMap Cutter doesn't have a command-line version, so I can't write a cron job for this; I'd need to do it manually every hour or so. Is there a better option for doing this?
Or any other solutions I can consider?
I'd recommend slicing it up with ImageMagick. It allows cropping via a viewport and is fairly configurable. In addition, it works fairly well from a Linux/Windows command prompt (or from within Perl, PHP or Python using exec).
http://www.imagemagick.org/Usage/
Here's the documentation for the actual tile crop functionality.
http://www.imagemagick.org/Usage/crop/#crop_tile
With options to crop using gravity, the ability to adjust the size/quality of the output image, and support for virtually every common image format, ImageMagick should get the job done.
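As a rough illustration of the tile crop, the sketch below works out how many tiles a 4000x6000 image yields and builds the arguments for an ImageMagick call of the form convert input -crop WxH +repage +adjoin pattern. The tile size of 256, the function names and the file names are assumptions for illustration.

```javascript
// Number of tiles a -crop <T>x<T> pass produces for an image, counting
// the smaller leftover tiles on the right and bottom edges.
function tileGrid(imageWidth, imageHeight, tileSize) {
  const columns = Math.ceil(imageWidth / tileSize);
  const rows = Math.ceil(imageHeight / tileSize);
  return { columns, rows, total: columns * rows };
}

// Arguments for: convert <input> -crop <T>x<T> +repage +adjoin <pattern>
// +repage resets the virtual canvas of each tile and +adjoin forces one
// output file per tile.
function tileCropArgs(input, tileSize, outputPattern) {
  return [input, "-crop", `${tileSize}x${tileSize}`, "+repage", "+adjoin", outputPattern];
}

console.log(tileGrid(4000, 6000, 256));
// { columns: 16, rows: 24, total: 384 }
console.log(["convert", ...tileCropArgs("big.png", 256, "tile_%04d.png")].join(" "));
// prints: convert big.png -crop 256x256 +repage +adjoin tile_%04d.png
```

Because this is an ordinary command line, it can be run from a cron job whenever the source image changes.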
The GDAL2Tiles utility, which is a part of GDAL, can be easily automated via command-line.
There is also a GUI for it, called MapTiler. They have some info on running GDAL2Tiles here.
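A minimal sketch of how such an automated run could be parameterised. The -p raster and -z options are real gdal2tiles options; the zoom formula is just a common tile-pyramid convention (halve the image until its longer side fits into one tile), not necessarily what GDAL2Tiles computes internally, and the function and file names are made up for illustration.

```javascript
// Highest zoom level of a tile pyramid: the number of times the image
// must be halved before its longer side fits into a single tile.
function maxZoomLevel(imageWidth, imageHeight, tileSize = 256) {
  const tilesAcross = Math.max(imageWidth, imageHeight) / tileSize;
  return Math.max(0, Math.ceil(Math.log2(tilesAcross)));
}

// Arguments for: gdal2tiles.py -p raster -z 0-<max> <input> <outputDir>
function gdal2tilesArgs(input, outputDir, maxZoom) {
  return ["-p", "raster", "-z", `0-${maxZoom}`, input, outputDir];
}

const zoom = maxZoomLevel(4000, 6000);
console.log(zoom); // 5
console.log(["gdal2tiles.py", ...gdal2tilesArgs("map.png", "tiles", zoom)].join(" "));
// prints: gdal2tiles.py -p raster -z 0-5 map.png tiles
```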
zoom.it is a relatively straightforward way to display large images online.
Check out https://github.com/Murtnowski/GMap-JSlicer
// Attach the slicer to the element with id "map" and point it at the image.
const slicer = new JSlicer(document.getElementById('map'), 'myImage.png');
slicer.init();
It's super simple.
You can see it in the wild at http://gvgmaps.com/ffvi/
If you need to update the map just change myImage.png then reload the page.
Scrollable div/iframe/pop up window?
Could use something like Mootools Scrollable to scroll the image around easily?
Or are you more worried about downloading an image that size in one go?
How about using something like this:
http://www.zoomify.com/
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
We would like to prepare the software usage documentation for a web application. Most pages will mainly contain screenshots (along with relevant documentation). We would also like to have top menu links that let us jump to the corresponding pages.
Please suggest tools that would be useful for fulfilling the above requirements.
Dr Explain is pretty nice.
http://www.drexplain.com/
It will analyze your page and create a list of controls, buttons, etc. that need callouts.
If you write your documentation with a particular documentation tool in mind, outputting to HTML is relatively simple. Candidates include LaTeX, Markdown (with Pandoc in mind), reStructuredText (with Pandoc or Sphinx in mind), AsciiDoc (with DocBook tools in mind) and DocBook (with DocBook tools in mind; see also Pandoc).
All of those formats would allow you to easily organize your documentation and then export it to HTML however you deem appropriate (probably split by major heading, with a simple wrapper built around the files). Sphinx can also output web-based documentation directly (see Python.org's documentation).
For screenshots I suggest using a standalone app on your platform of choice, ideally one that lets you annotate within the program: Skitch for Mac, Jing for Windows, Shutter (shutter-project.org) or Jing on Linux.
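As one possible illustration of the Markdown-to-HTML route, the sketch below builds a Pandoc command line that produces a standalone HTML page with a table of contents, which can serve as a simple jump menu. The --standalone, --toc and --output options are real Pandoc options; the function name and file names are assumptions for illustration.

```javascript
// Arguments for: pandoc --standalone --toc --output <output> <input>
// --standalone emits a complete HTML page and --toc adds a linked
// table of contents built from the document headings.
function pandocHtmlArgs(input, output) {
  return ["--standalone", "--toc", "--output", output, input];
}

console.log(["pandoc", ...pandocHtmlArgs("manual.md", "manual.html")].join(" "));
// prints: pandoc --standalone --toc --output manual.html manual.md
```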
Finally I would suggest also doing screencasts as they can be especially helpful to show off the interestingness/power of a web-app.
This may be overkill for your project, but I've favored preparing documentation in DocBook (XML), since it's fantastically portable/convertible.
To simplify the document creation, you could turn to http://www.oxygenxml.com/, but you can also do the same work in just about any other xml (or even text) editor.
Once your document is prepared, it's trivial to generate html (multi-page, or single page), and pdf versions.
I don't know what kind of language you are writing your code in, but in the case of Java, you can use Maven.
With Maven you can use many plugins, such as JavaDoc and site, which create a site with a lot of information about your API/software and include the top menu that you want.
This is a screenshot of the site that maven generates: link
I hope these could help!
Cheers
I'm sure there's some awesome tool out there that integrates everything needed for usage documentation but I'll tell you what I use!
I use Wink to grab screenshots of the application in use. I tend to fire it up and just grab loads of screenshots as I walk through the application, or even just a part of the application. Next, I edit the project in Wink to remove redundant screen captures, re-order them and position the mouse on each frame. I then add highlighting, which is usually just a nice box around the part of the screen I am demonstrating. Wink allows you to overlay the images with informational boxes and arrows. I then export the project as HTML and use the numbered, exported PNG images as the base for my documentation.
I tend to drag them into OpenOffice Writer (or whatever you are using for typesetting) and supplement them with more information, i.e. a few paragraphs to explain what the user is doing and why.
We use Acrobat to output this documentation and, provided your table of contents is done properly, it can insert bookmarks in the PDF to enable jumping to relevant sections.
The main benefit we get from wink is that it is very easy to re-grab shots when things change and it can output to flash to provide nice, snazzy demos of small pieces of functionality for posting on the web.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 6 years ago.
Can anyone recommend a good Windows Forms component for displaying PDF documents and allowing users to add real annotations (by which I mean identical to those created by Adobe Reader)?
Update: I've tried the AxAcroPDF component which Adobe installs alongside Reader, but this doesn't support annotation. I basically want AxAcroPDF combined with Reader's "Comment & Markup Toolbar". It seems that the Foxit SDK ActiveX supports this, so I'm going to try that. I just thought that there would be some more alternatives to choose from.
There's also http://a.nnotate.com which you can use as a PDF / Word annotation component in web applications - just uses AJAX / JS / HTML and displays the pdfs properly in the browser without needing adobe reader. (see http://a.nnotate.com/embed-guide.html for a working demo)
For editing the documents I have worked with Syncfusion's Essential PDF and it worked quite well.
The free version of Foxit Reader does this: go to Tools->Commenting Tools->Note, then click anywhere on the page of the PDF to place a little note icon which has text inside. Then just save the PDF. Later, if someone views the PDF in Acrobat or Foxit, they can hover the mouse over or click on the little note icons on the page to view the comments.
If anyone's interested, it looks like we'll end up using jPDFNotes, from Qoppa Software.
To quote from the web site:
jPDFNotes is a Java™ bean that integrates into your application to display PDF documents and forms and allow your users to annotate the documents and fill the forms. After editing documents, the library can save them to a local file or the host application can override the save function to save the file to any location locally or on a network. jPDFNotes is built on top of Qoppa's proprietary PDF technology so your users do not have to install Acrobat Reader or any other third party software or drivers. jPDFNotes is 100% Java so it is completely platform independent and so can run on Windows, Linux, Unix, Mac OSX and any other platform that supports the Java runtime environment.
It's not what we started looking for, but it seems to be exactly what we need. They seem a nice bunch of people too.