How to read a PDF file using HTML5?

I want to create an HTML file that can read any PDF file, given the source of that PDF file. How can I do this using only HTML5?
For example, I want to read a PDF file that is on my C drive, so src="http://virdir/mypdf.pdf".
I want something like this.

You want to use the HTML5 File API, which is still under development. Mozilla has a good explanation, and you can also refer directly to the spec.
Since PDF is a binary format, you will probably want to use FileReader.readAsArrayBuffer(); the older FileReader.readAsBinaryString() still works but is deprecated.
Parsing a PDF and rendering it (e.g. to a canvas) in JavaScript is possible, but it would be very challenging.
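As a rough illustration of the File API step, here is a minimal sketch that reads a user-selected file and checks whether it starts with the "%PDF-" header. It uses readAsArrayBuffer() rather than readAsBinaryString(). The helper name looksLikePdf and the element id "pdf-input" are assumptions made for this example, not part of any API, and the check says nothing about whether the rest of the file is valid.

```javascript
// Sketch: check whether bytes read from a file look like a PDF.
// `looksLikePdf` is a hypothetical helper name, not part of the File API.
function looksLikePdf(buffer) {
  const bytes = new Uint8Array(buffer);
  const magic = [0x25, 0x50, 0x44, 0x46, 0x2d]; // the ASCII bytes of "%PDF-"
  if (bytes.length < magic.length) return false;
  return magic.every((b, i) => bytes[i] === b);
}

// Browser-only wiring. Assumes the page contains
// <input type="file" id="pdf-input"> (a hypothetical id).
if (typeof document !== "undefined") {
  const input = document.getElementById("pdf-input");
  if (input) {
    input.addEventListener("change", () => {
      const file = input.files[0];
      if (!file) return;
      const reader = new FileReader();
      // reader.result is an ArrayBuffer once loading has finished.
      reader.onload = () => console.log("PDF?", looksLikePdf(reader.result));
      reader.readAsArrayBuffer(file);
    });
  }
}
```

Note that the browser only gives the page access to files the user picks through the input; a script cannot open an arbitrary path on the C drive by itself.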

Here is an open source PDF reader written in JavaScript:
https://github.com/mozilla/pdf.js
It exposes APIs you can experiment with. It comes built into the Firefox browser and is well supported by the Mozilla community.

Related

Android camera, take picture(s) and save as multipage PDF, then upload to server via <input type="file" />

I have a web form and want to open it on a smartphone, then take pictures of some documents that need to be merged into one PDF, and in the end this file needs to be uploaded to the server.
My solution is to upload the PDF (scan) to Google Drive and then somehow download the file from Google Drive to the server via some sort of widget installed on the website (any links appreciated).
Maybe someone has a better idea?
I know it's late, but my answer might help others. I faced the same challenge and implemented a custom solution based on JavaScript. Since you are using a web form, this solution should fit your needs well.
You have to use the jsPDF JavaScript library. jsPDF gives you a PDF object in the browser that you can upload or download, and there are many other things to play with.
First, initialize the jsPDF object as per your requirements. I am creating a PDF with a page size of 500pt by 500pt.
pdf = new jsPDF("l", "pt", [500,500]);
When you take a picture with the camera, you get each picture as a base64 string, and you insert that base64 data into the jsPDF object:
pdf.addImage(imgData, 'JPEG', 0, 0);
You can repeat the above call to add as many camera pictures as you want, but call pdf.addPage() before each additional image; otherwise the images are drawn on top of each other on the same page. jsPDF then builds a PDF document in which each page holds one image, in sequence.
Once you are done, you can get the PDF as a base64 data URI string using the code below, and upload it to any server.
pdf.output('datauristring')
The above covers only the PDF part; a complete working example, including the camera part, is available here: Javascript Component to Scan Document
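Putting the jsPDF calls from this answer together, a minimal sketch might look like the following. The function name imagesToPdfDataUri is made up for this example, and it assumes `pdf` is a jsPDF instance (for example new jsPDF("l", "pt", [500, 500])) and that every entry in `images` is a base64 JPEG string. It calls addPage() between images, because a new jsPDF document starts with a single page.

```javascript
// Sketch: put one image on each page and return the PDF as a data URI.
// `imagesToPdfDataUri` is a hypothetical helper; `pdf` is expected to be a
// jsPDF instance, e.g. new jsPDF("l", "pt", [500, 500]).
function imagesToPdfDataUri(pdf, images) {
  images.forEach((imgData, index) => {
    // A new jsPDF document already has page 1, so add a page only
    // before the second and later images.
    if (index > 0) pdf.addPage();
    pdf.addImage(imgData, "JPEG", 0, 0);
  });
  // Same output call as in the answer above.
  return pdf.output("datauristring");
}
```

The returned string can then be posted to the server, for instance in a hidden form field; how the server decodes it is outside the scope of this sketch.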

how to use pdf.js or viewerJS to display pdf in browser

I need a tutorial on using pdf.js or ViewerJS to view a PDF in the browser. I found viewerjs.org, but it doesn't help.
Any help is appreciated; thanks in advance.
Try out pdf.js, as it is very simple. Download the latest version here.
To show a PDF file, navigate to web/viewer.html and it should load its default PDF.
As for how to show your own PDFs, use: viewer.html?file=relative/path/to/your/pdf.
For example, say that inside the web folder of your pdf.js copy (the folder that contains viewer.html) you create a directory named pdfFiles and add a PDF named mypdf.pdf to it. To display it, use: viewer.html?file=pdfFiles/mypdf.pdf.
Almost all browsers are supported. Look here to find out which browsers are currently supported.
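If the PDF path contains spaces or other special characters, it is safer to encode it before putting it in the file parameter of the viewer URL. A minimal sketch, where viewerUrl is a hypothetical helper name and the default "viewer.html" assumes the link is built relative to the web folder:

```javascript
// Sketch: build a link to the pdf.js viewer for a given PDF path.
// `viewerUrl` is a hypothetical helper, not part of pdf.js.
function viewerUrl(pdfPath, viewerPath = "viewer.html") {
  // encodeURIComponent keeps characters such as spaces, "&" and "#"
  // from being misread as part of the viewer's own query string.
  return viewerPath + "?file=" + encodeURIComponent(pdfPath);
}

console.log(viewerUrl("pdfFiles/mypdf.pdf"));
// prints "viewer.html?file=pdfFiles%2Fmypdf.pdf"
```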

HTML PDF Viewer

Is there any alternative way to view PDF files on the web instead of using Acrobat Reader? I need to control the viewer to programmatically trigger the printing of the document.
The PDF should come from a web service URL or an ASPX page.
The easiest way, I think, is to use the Google Docs Viewer:
<iframe src="http://docs.google.com/viewer?url=**PathToMyPdfFile.pdf**&embedded=true" width="600" height="780" style="border: none;"></iframe>
You need to host your PDF files somewhere online, for example on your public website (it needs to be publicly accessible), and put the link to the PDF file in place of "PathToMyPdfFile.pdf" in the iframe above. Then set the width and height you need.
Google even generates this code for you here:
https://docs.google.com/viewer
Then simply put this iframe anywhere in the body of your page where you want to display your PDF. The viewer supports many other file formats too.
There are quite a few document viewers available online, some open source and others proprietary. Personally, I've had good experiences with FlexPaper. It lets you embed the document view in your website, and there are developer resources that will help you integrate it with the functionality you're looking for.
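Because the PDF link is passed inside another URL, it should be URL-encoded. A minimal sketch for building the iframe src, where googleViewerSrc is a hypothetical helper name and using https instead of http is an assumption made here:

```javascript
// Sketch: build the src attribute for the viewer iframe shown above.
// `googleViewerSrc` is a hypothetical helper name.
function googleViewerSrc(pdfUrl) {
  // The PDF URL must be publicly reachable, and it must be encoded so its
  // own "?" and "&" characters do not collide with the viewer's parameters.
  return "https://docs.google.com/viewer?url=" +
    encodeURIComponent(pdfUrl) + "&embedded=true";
}

console.log(googleViewerSrc("http://example.com/docs/report.pdf"));
// prints "https://docs.google.com/viewer?url=http%3A%2F%2Fexample.com%2Fdocs%2Freport.pdf&embedded=true"
```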
For demos, see here: http://flexpaper.devaldi.com/demo/
You can use the Foxit PDF viewer. It's free and programmable.

OBML file format

The Opera Mini browser can save HTML pages in OBML (Opera Binary Markup Language) format for offline browsing. I am wondering if I can convert an HTML file to OBML format and save it on my phone for later viewing.
To do so, I need details about the OBML format, which seems to be undocumented. Do you know more about this format?
Thanks for your time.
I reverse-engineered several OBML format versions and wrote a Python OBML to HTML converter along with format documentation. Generating OBML files might be more difficult, however, as it requires an actual HTML layout engine (all OBML elements are pixel-positioned rects).
A working way to save/open OBML files on your PC
http://java4me.blogspot.com/2008/01/opera-mini-as-pc-browser-big-screen.html
http://my.opera.com/opera.mini/forums/topic.dml?id=234030
Opera MicroEmulator
Nothing yet about OBML.
If the HTML is small, you can simply upload it to a file host or web space, open it in Opera Mini, and then save it as OBML.

How to get number of pages in a PDF document regardless of version ? Some scripting language

How can I get the number of pages in a PDF document? The document can contain images too, and text in different font sizes. It should work with different PDF versions.
The answer can be in any scripting language; I will port it to Ruby later.
Using pyPdf:
from pyPdf import PdfFileReader
pdf = PdfFileReader(open("document.pdf", "rb"))  # open() instead of the Python 2-only file()
print(pdf.getNumPages())
I think there must be a similar library with similar functionality for Ruby.
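When no PDF library is at hand, a very rough fallback is to count the "/Type /Page" markers in the raw bytes. The sketch below is a heuristic, not a parser: it only works when the page objects are stored uncompressed, so it can under-count on files that pack objects into compressed object streams (possible from PDF 1.5 onward), and a real library remains the reliable choice. The name estimatePageCount is made up for this example.

```javascript
// Rough heuristic, NOT a PDF parser: count "/Type /Page" markers.
// `estimatePageCount` is a hypothetical helper name.
function estimatePageCount(bytes) {
  // Decode one byte per character so binary data cannot break the scan.
  let text = "";
  for (const b of bytes) text += String.fromCharCode(b);
  // Match "/Type /Page" but not "/Type /Pages" (the page-tree node).
  const matches = text.match(/\/Type\s*\/Page(?![A-Za-z])/g);
  return matches ? matches.length : 0;
}
```

It accepts any iterable of byte values, such as a Uint8Array built from a FileReader result in the browser or a Buffer read from disk in Node.js.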
I can think of a band-aid solution which might just work. I am going to assume that you are developing a web application or web page that needs this information. In that case, let the Adobe Reader browser plugin load the PDF document. Then use the plugin to attach and execute some 'JavaScript for PDF' on the loaded document, which will return the number of pages. The API reference for that function call can be found here:
http://www.adobe.com/devnet/acrobat/pdfs/js%5Fapi%5Freference.pdf
You must also collect this information and pass it back to your page. You may find this guide helpful as well:
http://www.adobe.com/devnet/acrobat/pdfs/Acro6JSGuide.pdf