How does Google make those awesome PDF reports in Analytics and when you print a Google Doc etc.? [closed] - pdf

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
When you print from Google Docs (using the "print" link, not File/Print) you end up printing a nicely formatted PDF file instead of relying on the print engine of the browser. The same is true for some of the reports in Google Analytics: the printed PDF reports are beautiful. How do they do that? I can't imagine they use something like Adobe Acrobat to facilitate it, but maybe they do. I've seen some expensive HTML-to-PDF converters online from time to time but have never tried one. Any thoughts?

If you are specifically asking how Google does it: the PDF Properties page shows that they use Prince 6.0 (see princexml.com).
There are lots of other PDF generators out there. I've had great success with PDFlib for tricky jobs.

iTextSharp and iText are free, open-source PDF generation libraries for .NET and Java respectively.
I've used them to generate report PDFs before and was quite happy with the results.
http://itextsharp.sourceforge.net/
http://www.lowagie.com/iText/

Great free alternative to PrinceXML: wkhtmltopdf. There are plenty of wrapper libraries for various languages, but I've only used the Ruby ones. The product itself is on par with PrinceXML, IMHO.
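For readers without a wrapper library, here is a minimal Python sketch that calls the wkhtmltopdf command-line tool directly. It assumes the `wkhtmltopdf` binary is installed and on PATH; the function names `build_command` and `html_to_pdf` are made up for this illustration and are not part of any wrapper library.

```python
import shutil
import subprocess


def build_command(source, output_pdf, page_size="A4"):
    """Return the argument list for one wkhtmltopdf run.

    `source` may be a URL or a path to a local HTML file.
    """
    return ["wkhtmltopdf", "--page-size", page_size, source, output_pdf]


def html_to_pdf(source, output_pdf, page_size="A4"):
    """Convert `source` to a PDF file; requires wkhtmltopdf on PATH."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf is not installed or not on PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_command(source, output_pdf, page_size), check=True)
```

Calling `html_to_pdf("report.html", "report.pdf")` would then write the PDF next to the script, provided the tool is available.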

I have had success with pd4ml. It has a tag library, so you can turn any existing HTML into PDF by wrapping it in the transform tag:
<pd4ml:transform>
<!-- Your HTML is here -->
<c:import url="/page.html" />
</pd4ml:transform>

Well, I doubt it's as easy as generating HTML. First of all, PDF is not a human-readable format and, unlike SVG, it's not plain text. In fact, I would compare an SVG file to a PDF file in that both give you precise control over the layout on a printed page. But SVG is different in that it's XML (and also in that it's not completely supported in the browser; I'm still looking into SVG too). Come to think of it, SVG will probably be my next question.
I know Google doesn't use .NET and I doubt they use Java, so there must be some other libraries they use for generating the PDF files. More importantly, how do they create the PDFs without having to rewrite everything as a PDF instead of as HTML? There has to be some shared code between generating the HTML view and generating the PDF view. Come to think of it, maybe the PDF view and the HTML view are completely separate and they just have two views, which is why the MVC development style seems to be the way to go.

Rendering a PDF is a hard, complex problem. Generating one, however, is not: simply make up some entities and write them out. It's about the same problem domain as generating HTML for a web page vs. displaying (rendering) it.
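To make "make up some entities, and generate" concrete, here is a minimal Python sketch that writes a one-page PDF by hand. It is only an illustration of the file structure (objects, cross-reference table, trailer), not how Prince or any of the libraries mentioned here work internally. The function name `build_pdf`, the US Letter page size and the text position are arbitrary choices made for this example.

```python
def build_pdf(text):
    """Build a one-page PDF showing `text` in 24 pt Helvetica.

    Only Latin-1 text is supported in this sketch.
    """
    # Escape the characters that are special inside a PDF string literal.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    stream = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")

    # The "entities": catalog, page tree, page, content stream, font.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

Writing the result with `open("hello.pdf", "wb").write(build_pdf("Hello, World"))` should give a file that a PDF viewer can open. Real reports need fonts, images and pagination, which is where the libraries come in.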

Related

How to make a PDF responsive [closed]

I have a PDF file that I want to make responsive so that it can be viewed on desktops as well as mobiles. Responsive in the sense that it should not only fit the page based on the device size, but the content inside the PDF, i.e. text and images, should also be responsive when viewed on a mobile. Just like the image shown below, PDF content should be aligned based on the device. Is there any API or library to achieve this?
Thanks in advance. Please help me to achieve this.
As indicated by other answers, PDF's primary function was to be a visual representation of content, and a visual representation should typically be identical across different platforms / readers / devices. That was the goal of the file format and it's diametrically opposed to file formats such as XML that are all about structure.
However, in recent years PDF did get additional functionality that may help with this. PDF files now support tagging and the purpose of tagging is to add structure to the file. A PDF file that is properly tagged does know where paragraphs of text are, what are headers, what are lists etc... And that information in theory can be used to support (limited) responsiveness.
For example, see the link here (https://helpx.adobe.com/acrobat/using/reading-pdfs-reflow-accessibility-features.html) where Adobe explains how the reflow view in Acrobat Pro works. It states that Acrobat can use the tagging structure inside a PDF file (or even automatically create some semblance of tagging on the fly for documents that are not tagged) to give you a view of the PDF file that adjusts itself to the available display size.
Whether or not this is going to work depends mostly on the reader technology you will be using on your mobile device and you should certainly not confuse the possibilities of this with full responsiveness where content is hidden, replaced, adjusted, repositioned etc... such as what you can accomplish with HTML and CSS on web sites.
But it is a start.
It cannot be done. PDF is a final layout. Unlike the web page, where you are never sure what you're getting, the whole purpose of a PDF is to look the same no matter what device, or even medium, you're accessing it from. It basically says, "there will be the phrase 'Hello, World' in this font, this point size, at these x and y coordinates". You might as well try to reflow a hardcover book to fit into your pocket better.

.Net Tool or Library to compare one PDF to another PDF [closed]

I am working on a project that currently takes a .tiff and compares the document in question to a defined template document. We are moving away from the .tiff format for a variety of reasons, but mainly because the new files will be coming in as PDFs.
I see two potential solutions to the issue. First convert the PDF to a tiff and use the existing code.
Or second, use a PDF library that will compare the template PDF to the PDF that is received.
Because the PDF that is received will basically come from an outside source we won’t know for sure if it is text based or image based so the library or tool will have to be able to compare both.
Any suggestions on tools/libraries you have found helpful would be great!
Thank you in advance!
dj
How about i-net PDFC - it does a full content comparison - text, images, lines, header/footer-detection and so on. You can use it either on command line or with a GUI (2.0, currently in public beta-phase) or via API (I think we have an internal version being a .NET library).
Disclaimer: Yep, I work for the company who made this - so feedback highly appreciated.
What we ended up doing was using the Aspose.Pdf library.
I ended up learning there are two types of PDFs:
Image based and
Text based
I did not have any issues comparing the text-based PDFs. However, when an image-based PDF was received, we converted it to a .tiff so that we could use Microsoft's MODI to compare it against our specified template, and the .tiff would come out as a blank image rather than the actual content of the PDF. The Aspose.Pdf library did cost some money; however, in the end the library did exactly what we needed it to and it allowed us to meet our client's needs.
I think your method of comparing tiffs is the way to go, using ImageMagick or other library?
Converting PDF to images can also be done via ImageMagick with the help of Ghostscript.
http://www.imagemagick.org/script/compare.php
I have a C# wrapper for Ghostscript that may help; send me a mail (address on my profile) and I can send it to you.
As far as I can see from your question, you want visual comparison of 2 PDFs, not structural comparison. (Because I can create you a thousand different PDF pages, which will have different internal structures and PDF source code, but will render identically on screen or on paper.)
In this case any comparison software will have to transform the 2 PDFs into raster images and compare those.
But since you have your own code already to do that for TIFFs, you can as well re-use it for PDFs (like you are considering already) which you convert to TIFFs.
Unless you find another, external tool that is better, faster, more precise, more funky, less resource-hungry... than your own solution! -- But that one will not be able to avoid converting the PDF pages to some sort of raster image before it can start the real visual comparison. (This may happen internally and unnoticeable for the user, but nevertheless it will have to take place...)
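The "rasterize, then compare" approach described above can be sketched in Python as follows. This is an assumption-laden illustration, not the asker's existing TIFF code: `rasterize_command` only builds a Ghostscript command line (the executable is `gs` on Unix-like systems and named differently on Windows), and `differing_fraction` works on pages already decoded into rows of 0..255 gray values. Decoding the TIFF files into such rows is left out and would need an imaging library.

```python
def rasterize_command(pdf_path, out_pattern, dpi=150):
    """Return a Ghostscript command that renders each page to a gray TIFF.

    `out_pattern` should contain a page counter such as "page-%03d.tif".
    """
    return [
        "gs", "-dBATCH", "-dNOPAUSE", "-dSAFER",
        "-sDEVICE=tiffgray", f"-r{dpi}",
        f"-sOutputFile={out_pattern}", pdf_path,
    ]


def differing_fraction(page_a, page_b, tolerance=0):
    """Return the fraction of pixels whose gray values differ by more than `tolerance`."""
    same_shape = len(page_a) == len(page_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(page_a, page_b)
    )
    if not same_shape:
        raise ValueError("pages must have the same dimensions")
    total = sum(len(row) for row in page_a)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_a, row_b in zip(page_a, page_b)
        for a, b in zip(row_a, row_b)
        if abs(a - b) > tolerance
    )
    return changed / total
```

A small tolerance helps when the template and the received document were rendered or scanned with slightly different settings; both pages must be rasterized at the same resolution for the comparison to be meaningful.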
Docotic.Pdf library can compare PDF documents for you.
Please have a look at the "Check that two PDF documents are equal" sample.
We use this feature for regression testing of the library itself (yes, I am part of the library's dev team).

Need to choose a suitable language to write documentation in [closed]

Currently the documentation where I work is in a bit of a state. There isn't anywhere near enough of it, and the documentation that does exist is spread out over many Word documents, making it hard to find anything.
I'm trying to take some initiative and get it improved, and I figure the first thing is to find a better format to write the documentation in:
My thoughts are that the documentation should be structured in a series of short articles (MSDN / Html Help style) and structured in a suitable tree:
It would be good to be able to produce a standalone Html-Help style package to be shipped with the application
As well as being able to produce a MSDN-style website as a reference for those who are too lazy to look at the CD.
Search is of course a must-have
It needs to be at least reasonably easy to update - if there is a 17 step process to update the published documentation then it makes it seem like too much work to do simple changes, and nobody can ever be bothered to update it.
The documentation is technical in nature, so ideally it would be nice to be able to include generated documentation from things like the XML documentation embedded in C# code. This is, however, definitely a side requirement: currently very little useful XML documentation exists, it's just that in the future I plan to fix that.
For the same reason it is often good to be able to handle things like attachments (code samples etc.). I'm not expecting anything fancy, but this is something I need to bear in mind to make sure that it's at least not handled badly.
Are there any projects or languages that are suited to this sort of documentation?
I've had good results with doxygen on my C and C++ projects, although it supports many other languages as well. You put the documentation in comments in the code, and those comments can contain simple or complex HTML markup. It is very easy to update as it is part of the code. You can make building the documents part of your build process. Additional topics that are not strictly API related can be added as separate HTML documents. The version I'm using doesn't support search, so you would have to add another product to search these pages. Because it is HTML you can add in code samples, diagrams, etc.
If you use LaTeX you can get all your documentation in great looking PDFs and printed copies, as well as being able to generate html (via latex2html). TeX has the advantage of being all plaintext, too, so you can track/merge it reliably with your favourite revision control system.
We use confluence as our documentation repository. It is fairly easy to have public and private sections, and has a nice WYSIWYG editor. It can handle attachments and can be saved off as PDF documents if you like.
I've used RoboHelp with good results. It is plain HTML, but has a generation process that keeps everything looking consistent. It can be packaged as a .hlp file with the app, or published to the website. Check it out, it is simple so you can get back to doing your job :)
A clean way is to use DocBook. It is easy to write and understand. It is also easy to parse, as XML parsers are standard, and other forms of documentation (e.g. the embedded documentation in comments) can easily be transformed to this format.
It is straightforward to generate PDF, HTML or other formats from the DocBook source (tools exist for this purpose).
I've started using DokuWiki. It's not exactly what I was originally looking for (I think I was really looking for a CMS), but it does the job and in some respects it's better than what I originally had in mind (in particular, it's a wiki). I've not yet gotten as far as publishing this to our customers, however, so I'm not sure how well that's going to work out.
I'm using the IndexMenu plugin and the Arctic template to get a navigation tree on the left, and if I publish the wiki itself I'll use the discussion plugin to allow users to post feedback.
Currently my method of handling generated content is to use XSLT templates to produce DokuWiki syntax, and write that output directly to files / folders in the "data/pages" folder.
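The answer uses XSLT for this step. As a rough illustration of the same idea in Python instead, the sketch below turns a C# XML documentation file into DokuWiki syntax. The function name `members_to_dokuwiki`, the page title and the fallback text are assumptions made for this example; the author's actual templates are not shown in the post.

```python
import xml.etree.ElementTree as ET


def members_to_dokuwiki(xml_text):
    """Turn the <member> entries of a C# XML doc file into DokuWiki markup."""
    root = ET.fromstring(xml_text)
    lines = ["====== API Reference ======", ""]  # level 1 headline
    for member in root.iter("member"):
        name = member.get("name", "(unnamed)")
        summary = member.find("summary")
        if summary is not None:
            # Join nested text and collapse the indentation whitespace.
            text = " ".join("".join(summary.itertext()).split())
        else:
            text = ""
        lines.append(f"===== {name} =====")  # level 2 headline per member
        lines.append("")
        lines.append(text or "No summary available.")
        lines.append("")
    return "\n".join(lines)
```

The returned string could then be saved as a page file under "data/pages", as described above, so that DokuWiki picks it up like any hand-written page.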

Software usage documentation tools for Web applications [closed]

We would like to prepare the software usage documentation for a web application. Most of its pages will mainly contain screenshots along with the relevant documentation. We would also like to have top menu links that let us jump to the corresponding pages.
Please suggest the tools which can be useful to fulfil the above requirements.
Dr Explain is pretty nice.
http://www.drexplain.com/
It will analyze your page and create a list of controls, buttons, etc. that need callouts.
If you write your documentation with a particular documentation tool in mind, outputting to HTML is relatively simple. Options include LaTeX, Markdown (with Pandoc in mind), reStructuredText (with Pandoc or Sphinx in mind), AsciiDoc (with DocBook tools in mind) and DocBook (with DocBook tools in mind; see also Pandoc).
All of those formats would allow you to easily organize your documentation then export them into HTML however you deem appropriate (probably by major heading, then build a simplistic wrapper around the files). Sphinx can also just output web-based documentation (see Python.org's documentation).
For screenshots I suggest using a standalone app on your platform of choice, ideally one that lets you do annotation within the program: Skitch for Mac, Jing for Windows, Shutter (shutter-project.org) or Jing on Linux.
Finally I would suggest also doing screencasts as they can be especially helpful to show off the interestingness/power of a web-app.
This may be overkill for your project, but I've favored preparing documentation in docbook (xml), since it's fantastically portable/convertable.
To simplify the document creation, you could turn to http://www.oxygenxml.com/, but you can also do the same work in just about any other xml (or even text) editor.
Once your document is prepared, it's trivial to generate html (multi-page, or single page), and pdf versions.
I don't know what kind of language you are writing your code in, but in the case of Java, you can use Maven.
With Maven you can use many plugins, such as JavaDoc and site, which creates a site with a lot of information about your API/software and contains the top menu that you want.
This is a screenshot of the site that maven generates: link
I hope these could help!
Cheers
I'm sure there's some awesome tool out there that integrates everything needed for usage documentation but I'll tell you what I use!
I use Wink to grab screenshots of the application in use. I tend to fire it up and just grab loads of screenshots as I walk through the application, or even just a part of the application. Next, I edit the project in Wink to remove redundant screen captures, re-order them and position the mouse on each frame. I then add highlighting, which is usually just a nice box around the part of the screen I am demonstrating. Wink allows you to overlay the images with informational boxes and arrows. I then export the project as HTML and use the numbered, exported PNG images as the base for my documentation.
I tend to drag them into OpenOffice Writer (or whatever you are using for typesetting) and supplement them with more information, i.e. a few paragraphs to explain what the user is doing and why.
We use acrobat to output this documentation and providing your table of contents is done properly, it can insert bookmarks in the pdf to enable jumping to relevant sections.
The main benefit we get from wink is that it is very easy to re-grab shots when things change and it can output to flash to provide nice, snazzy demos of small pieces of functionality for posting on the web.

Component to view and annotate PDF documents [closed]

Can anyone recommend a good Windows form component for displaying PDF documents and allowing users to add real annotation (by which I mean identical to that created by Adobe Reader).
Update: I've tried the AxAcroPDF component which Adobe installs alongside Reader, but this doesn't support annotation. I basically want AxAcroPDF combined with Reader's "Comment & Markup Toolbar". It seems that the Foxit SDK ActiveX supports this, so I'm going to try that. I just thought that there would be some more alternatives to choose from.
There's also http://a.nnotate.com which you can use as a PDF / Word annotation component in web applications - just uses AJAX / JS / HTML and displays the pdfs properly in the browser without needing adobe reader. (see http://a.nnotate.com/embed-guide.html for a working demo)
For editing the documents I have worked with Syncfusion's Essential PDF and it worked quite well.
The free version of Foxit Reader does this: go to Tools->Commenting Tools->Note, then click anywhere on the page of the PDF to place a little note icon which has text inside. Then just save the PDF. Later, if someone views the PDF in Acrobat or Foxit, they can hover the mouse over or click on the little note icons on the page to view the comments.
If anyone's interested, it looks like we'll end up using jPDFNotes, from Qoppa Software.
To quote from the web site:
"jPDFNotes is a Java™ bean that integrates into your application to display PDF documents and forms and allow your users to annotate the documents and fill the forms. After editing documents, the library can save them to a local file or the host application can override the save function to save the file to any location locally or on a network. jPDFNotes is built on top of Qoppa's proprietary PDF technology so your users do not have to install Acrobat Reader or any other third party software or drivers. jPDFNotes is 100% Java so it is completely platform independent and so can run on Windows, Linux, Unix, Mac OSX and any other platform that supports the Java runtime environment."
It's not what we started looking for, but it seems to be exactly what we need. They seem a nice bunch of people too.