Displaying the contents of a PDF file on the page using ColdFusion

I have a page that is dedicated to the Standard Operating Procedures (SOP). I want this page to show the SOP in the page with a download button above it (and, for admins, an upload button). Basically I want the user to be able to read the SOP without having to download it. I have the buttons sorted and I almost have the display set, but the format is off.
The admin can upload a PDF of the current SOP. That file then gets stored and overwrites the last upload. I tried reading it with cffile, but the output was unreadable no matter what charset I used. Currently I am taking the file and extracting it as a .txt, then using cffile to read it into a variable that I then output to the screen. It sort of works, but the formatting is all wrong.
I know I can use cfcontent and just have the page be the PDF, but I'd rather not have to mess with adding a new page just for admins to upload new SOP files. (The way the site is built it would have to be a new page)
<cfpdf
action="extracttext"
source="D:\file_path\SOP.pdf"
overwrite="true"
honourspaces="true"
type="string"
useStructure="true"
destination="D:\file_path\SOP.txt">
<cffile
action="read"
file="D:\file_path\SOP.txt"
variable="dcnSOP">
...
<cfoutput>#dcnSOP#</cfoutput>
Basically I'm getting a block of unformatted text (no spacing and no paragraph breaks). It's the text I want, and it's on the page where I want it, but it looks terrible. It seems to be stripping all the newline characters and presenting the text as a blob. Is there a better way of doing this without just having the whole page be the PDF using cfcontent?

Thanks to @Miguel-F and @Ageax for the suggestions and for leading me to a question I missed on here when I was searching for the answer.
<embed src="\file_path\SOP.pdf" width="800px" height="2100px"/>
This works with every browser but Chrome (our clients will not be using mobile browsers). I know you can use Google's PDF reader to get around this. If anyone is interested, here is an example of that given by @Script47:
<embed src="https://drive.google.com/viewerng/viewer?embedded=true&url=http://example.com/the.pdf" width="500" height="375">
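If the PDF address itself contains a query string, it may need to be URL-encoded before it is passed to the viewer. A minimal sketch, assuming the viewer endpoint quoted above and a publicly reachable PDF; the helper name googleViewerUrl is hypothetical:

```javascript
// Build the Google viewer URL for a publicly reachable PDF.
// The endpoint is the one quoted in the answer; the helper name is made up.
function googleViewerUrl(pdfUrl) {
  const base = "https://drive.google.com/viewerng/viewer";
  // encodeURIComponent keeps "?", "&" and "=" inside pdfUrl from being
  // read as parameters of the viewer URL itself.
  return base + "?embedded=true&url=" + encodeURIComponent(pdfUrl);
}

console.log(googleViewerUrl("http://example.com/the.pdf"));
```

The returned string can then be used as the src of the embed tag.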

Related

Android camera, take picture(s) and save as multipage PDF, then upload to server via <input type="file" />

I have a web form that I want to open on a smartphone, then take pictures of some documents that need to be merged into one PDF, and at the end upload this file to the server.
My current idea is to use Google Drive: upload the PDF (scan) to Drive and then somehow download the file from Drive to the server via some sort of widget installed on the website (any links appreciated).
Maybe someone has a better idea?
I know it's late, but my answer might help others. I faced the same challenge and implemented a custom solution based on JavaScript. Since you are using a web form, this solution should fit your needs.
You can use the jsPDF JavaScript library. jsPDF gives you a PDF object in the browser that you can upload or download, and there are many other things to play with.
First, initialize the jsPDF object as per your requirements. Here I am creating a landscape PDF with a page size of 500 by 500 points.
pdf = new jsPDF("l", "pt", [500,500]);
When you take a picture with the camera, you get each picture as a base64 string, which you insert into the jsPDF object:
pdf.addImage(imgData, 'JPEG', 0, 0);
You can repeat the call above to add as many pictures from the camera as you want. Call pdf.addPage() before each additional image, otherwise every image is drawn on the same page. Behind the scenes the images are compiled into a PDF document in which each page holds one image, in sequence.
Once you are done, you can get the PDF as a base64 data URI string using the code below, and upload it to any server.
pdf.output('datauristring')
The above covers only the PDF part; you can find a complete working example, including the camera part, here: Javascript Component to Scan Document
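Put together, the steps above might look like the following sketch. It assumes the images are already available as base64 JPEG data URIs; the function name imagesToPdfDataUri and the capturedImages array are hypothetical. It calls addPage between images, because repeated addImage calls at position 0, 0 would otherwise all draw on the current page.

```javascript
// Add one image per page to a jsPDF-like document and return it as a data URI.
// `pdf` must expose addImage, addPage and output, as a jsPDF instance does;
// `images` is an array of base64 JPEG data URIs from the camera.
function imagesToPdfDataUri(pdf, images) {
  images.forEach((imgData, index) => {
    // A new document already has one page, so only add pages for later images.
    if (index > 0) pdf.addPage();
    pdf.addImage(imgData, "JPEG", 0, 0);
  });
  return pdf.output("datauristring");
}

// In the browser, with jsPDF loaded (capturedImages is a hypothetical array):
// const dataUri = imagesToPdfDataUri(new jsPDF("l", "pt", [500, 500]), capturedImages);
```

Passing the document in as a parameter keeps the helper independent of how jsPDF is loaded on the page.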

Apache FOP - Scrolling in PDF possible?

I'm using Apache FOP to generate a PDF through XML and XSL-FO. I have a cell in my generated PDF that I need to be able to scroll through if the content overflows it. XSL-FO has an overflow="scroll" feature, but based on my research on the topic it seems that Apache FOP does not support this option.
For example, here is a scrollable region in a PDF used by a large CAD company that I need to replicate:
Is there any way to enable this feature in Apache FOP? Is it possible to enable it in the source code (I haven't been able to find a way to do so)? Any other ways to tackle this issue?
No, it isn't possible.
From the FO perspective:
In the XSL-FO Recommendation the scroll value for the property overflow comes from the corresponding CSS2 definition, which includes this clarification:
When this value is specified and the target medium is "print", overflowing content should be printed.
As the PDF output is a print-oriented medium, I read this as a confirmation that FOP is correct in printing the overflowing content.
From the PDF perspective:
In the PDF Reference 6th edition, a search for the word "scroll" returns results referring either to the scrolling bars in the user interface or in interactive forms (text fields, list boxes, combo boxes).
There is not, or at least I could not find, a "static text object, but with scroll bars" feature (which is probably sensible for a print-oriented format), so FOP cannot create it in the PDF output file, not even by modifying the source code.
A second look at your comment and the screenshot you included made me think it could be an example of the 3D Artwork feature of the PDF format, a feature I didn't know of before (and I still know nothing besides its name). According to the reference:
Specific views of 3D artwork can be specified, including a default view that is displayed initially and other views that can be selected. Views can have names that can be presented in a user interface.
So, I think your screenshot shows the different views associated to a 3D object; it is not a general-purpose feature that could be used to provide scrollable text.
Well, it could be possible ...
It is possible but as far as I know not with Apache FOP. Without seeing the PDF in question and guessing from the screen shot, it looks like a Flash widget inserted into the PDF. This in PDF terms is a RichMedia annotation (requires PDF version 1.7 with extensions) in which you can insert the Flash widget as well as other controlling files (like XML, other images to display, etc.) and relate them together.
AFAIK, only RenderX XEP (whom I work for) supports such RichMedia annotations inserted into PDF via XSL FO through the rx:rich-media-object extension documented here: http://www.renderx.com/reference.html#Rich Media
I believe the only viewer that supports PDFs with RichMedia annotations is Adobe Reader, so it is required to view such a file. Here is a sample that includes a few interactive Flash widgets and some interactive charts, all within a few-page PDF that was generated long ago. NOTE: I am sure some of the links in the document do not go anywhere; it was for a trade show many years ago. Remember, you would need to download this file, view it in Adobe Reader, and have Flash Player installed to see it function.
http://www.cloudformatter.com/Resources/Samples/RichMedia.pdf
You cannot use common PDF browser-based viewers like Chrome or Firefox as they do not support this type of annotation.
A screenshot of page one here shows an interactive, scrolling widget. Page 4 contains a widget similar to what you show in your example.
Page 4 scrolling widget very similar to your request:
The widget on the last page is created using a scroller SWF that takes as parameters the images and the setup/configuration files, which are XML. The RenderX extension object takes these as parameters and embeds all of them in the document for the interactive Flash widget, so that it is totally self-contained in the PDF. The XSL FO to do this is:
<rx:rich-media-object name="Sample HTML Widget" scaling="non-uniform" width="611.92pt"
height="74.99pt" content-width="scale-to-fit" src="url('rx-scroller\dockmenu.swf')"
transparency="true" activate-condition="page_visible">
<rx:flash-var name="setupXML" value="rx-dock-settings.xml"/>
<rx:flash-var name="contentXML" value="rx-dock-contents.xml"/>
<rx:rich-media-resource name="rx-dock-settings.xml"
src="url('rx-scroller\rx-dock-settings.xml')"/>
<rx:rich-media-resource name="rx-dock-contents.xml"
src="url('rx-scroller\rx-dock-contents.xml')"/>
<rx:rich-media-resource name="style.css" src="url('rx-scroller\css\style.css')"/>
<rx:rich-media-resource name="customer1.png" src="url('rx-scroller\images\customer1.png')"/>
<rx:rich-media-resource name="customer2.png" src="url('rx-scroller\images\customer2.png')"/>
<rx:rich-media-resource name="customer3.png" src="url('rx-scroller\images\customer3.png')"/>
<rx:rich-media-resource name="customer4.png" src="url('rx-scroller\images\customer4.png')"/>
<rx:rich-media-resource name="customer5.png" src="url('rx-scroller\images\customer5.png')"/>
<rx:rich-media-resource name="customer6.png" src="url('rx-scroller\images\customer6.png')"/>
</rx:rich-media-object>
And note that many things that are in the flash would work, like links and such. It is just a pure, interactive flash inserted into PDF as the container.
Indeed it looks like this is not possible to achieve through FOP.
Continuing to dig around for a few days, however, I did find a clever post-processing alternative that is also free, essentially embedding a PDF inside of another PDF using the LaTeX animate package.
A drawback to this method is that it is not possible to embed links inside of the scrollable region, which is a major issue for me. But the method does enable inserting a scrollable region inside of an existing PDF and got me very close to what I was trying to achieve.

XPage - Open scans in browser

I need to display uploaded scans (JPG, PNG, TIFF, PDF, etc.) in the browser window instead of downloading them to a local PC and using external apps like Acrobat Reader.
I did some research on the web on this issue but wasn't really successful.
Does anyone have hints or code snippets showing how to achieve that?
EDIT :
Since I am not looking for a solution which supports viewing scans in a typical browser like Chrome, FireFox, etc. but supports viewing scans in an XPage view within Notes I need to ask my question again.
What is the best (recommended) way to view different types of scans, uploaded as PDF, JPG, TIFF, PNG, etc., in Notes within an XPage view ?
Take a look here, XPages: Embed PDF and possibly Office files
Here is some code that I have in an app for PDFs.
I tried using Bumpbox and pdf.js, and while I could get them working, iframes with normal Domino attachment URLs seemed to work best for me in XPages.
I am not sure if this solution is right or not, but it works well for an app I have that only has PDFs. It does work on mobile too, at least on iOS.
<iframe
src="#{javascript:
var url = 'https://app.nsf/';
var doc = sessionScope.docID;
var atname = @RightBack(sessionScope.aname,'Body');
var end = '/$file'+atname;
return url+doc+end}"
width="800" height="1000">
</iframe>
If you are looking at using different file types you need to use a renderer, give it the attachment URL, and then display what the renderer returns with. I haven't looked at this in a while so things might have changed. Look for a lightbox clone that can display pdf. I think Orangebox was one, bumpbox looks to not be updated but I was able to get that working for me.
This method will display everything inline. I would love to see some type of renderer like pdf.js for xpages.
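The URL assembly inside the iframe above can be pulled into a small helper. This is only a sketch in plain JavaScript, assuming the same URL shape as the snippet (database address, then document ID, then /$file, then the attachment name); the function and parameter names are hypothetical, and a real Domino URL may need extra path segments that the snippet leaves out.

```javascript
// Build an attachment URL of the form <base><docId>/$file/<fileName>.
// Function and parameter names are hypothetical.
function attachmentUrl(baseUrl, docId, fileName) {
  // Make sure there is exactly one slash between the base and the document ID.
  const base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
  // Encode the file name so spaces and other special characters survive.
  return base + docId + "/$file/" + encodeURIComponent(fileName);
}

console.log(attachmentUrl("https://app.nsf/", "ABC123", "scan 01.pdf"));
```

Encoding the file name matters mostly for scans whose names contain spaces or non-ASCII characters.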

In any website, images are always downloaded in the background, right?

Just to confirm: are images always downloaded in a separate thread from the one that loads the page text?
I put an image in my page that refers to a file on the internet, and all the text always shows up first.
What do you think?
I think the HTML file contains all the prose and refers to the pictures, so whatever threads are used, the text is downloaded first. Whether it is rendered before the pictures are downloaded is up to the user agent (UA), and user agents may or may not behave the same in this respect.
It depends on the browser and the website. In most cases the browser loads the "main HTML", which contains references to the pictures and other resources.
If the website loads most of its text content via AJAX, it could be kind of the other way round.
But in most cases you are right.

Screen Scraping with HTTP Headers Issue - I Think

I've been trying to figure this one out for about a week now and just can't come up with a good solution. So, I figured I would see if anyone could help me out. Here's one of the links that I'm trying to scrape:
http://content.lib.washington.edu/cdm4/item_viewer.php?CISOROOT=/alaskawcanada&CISOPTR=491&CISOBOX=1&REC=4
I right-clicked to copy image location.
This is the link that is copied:
(Can't paste this as a link because I'm new)
http:// content (dot) lib (dot) washington (dot) edu/cgi-bin/getimage.exe?CISOROOT=/alaskawcanada&CISOPTR=491&DMSCALE=100.00000&DMWIDTH=802&DMHEIGHT=657.890625&DMX=0&DMY=0&DMTEXT=%20NA3050%20%09AWC0644%20AWC0388%20AWC0074%20AWC0575&REC=4&DMTHUMB=0&DMROTATE=0
There is no clear image URL being displayed, obviously because the image is hidden behind some type of script. Through trial and error I found that I can put ".jpg" after "CISOPTR=491" and the link becomes an image URL. The problem is that this is not the high-resolution version of the image; to get to the high-resolution version I have to change the URL even more. I found a lot of articles on Stack Overflow that mention building a script using cURL and PHP, and I have even tried a few of them with no luck. "491" is the image number, and I can change that number to find other images in the same directory, so scraping a sequence of numbers should be pretty easy. But I'm still a noob at scraping and this one is kicking my butt. Here's what I've tried:
Get remote image using cURL then resample
also tried this.
http://psung.blogspot.com/2008/06/using-wget-or-curl-to-download-web.html
I also have OutWit Hub and SiteSucker, but they don't recognize the URL as an image file, so they just pass right over it. I ran SiteSucker overnight and it downloaded 40,000 files, of which only 60 were JPEGs, and none of those were the ones I wanted.
The other thing I keep running into is that, for the files I have been able to download manually, the filename is always either getfile.exe or showfile.exe; if I manually add ".jpg" as the extension I can view the image locally.
How can I reach the original high-res image file and automate the download process so that I can scrape a couple hundred of these images?
I right-clicked to copy image location. This is the link that is copied:
You noticed the title has ".exe" in there. Look at the stuff in the query string:
DMSCALE=100.00000
DMWIDTH=802
DMHEIGHT=657.890625
DMX=0
DMY=0
DMTEXT=%20NA3050%20%09AWC0644%20AWC0388%20AWC0074%20AWC0575
REC=4
DMTHUMB=0
DMROTATE=0
This strongly implies that the original source of this image is in a database or something similar and that it is being passed through a server-side filter (not sure if that is what you meant by "some kind of script"). I.e., this is dynamically generated content, not static, and the same caveats apply as would to dynamic text content: you have to figure out what instructions to give the server to get it to cough up what you want. Which you pretty much have in front of you... If SiteSucker or whatever won't deal with it properly, scrape the address yourself using an HTML parser.
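As a starting point for generating the addresses yourself, here is a minimal sketch that rebuilds the getimage.exe URL for a range of item numbers. The endpoint and parameter names are taken from the copied link; the function names are hypothetical, DMTEXT and REC are left out for brevity, and which parameter values actually return the high-resolution image is an assumption you would have to check by hand.

```javascript
// Build getimage.exe URLs for a range of item numbers (CISOPTR values).
// Endpoint and parameter names come from the copied link; which values
// yield the full-resolution image is an assumption to verify manually.
const ENDPOINT = "http://content.lib.washington.edu/cgi-bin/getimage.exe";

function buildImageUrl(itemNumber, overrides = {}) {
  const params = new URLSearchParams({
    CISOROOT: "/alaskawcanada",
    CISOPTR: String(itemNumber),
    DMSCALE: "100.00000",
    DMWIDTH: "802",
    DMHEIGHT: "657.890625",
    DMX: "0",
    DMY: "0",
    DMTHUMB: "0",
    DMROTATE: "0",
    ...overrides, // e.g. { DMWIDTH: "1600" } to experiment with other sizes
  });
  // URLSearchParams percent-encodes values, so "/" is sent as "%2F".
  return ENDPOINT + "?" + params.toString();
}

function buildImageUrls(first, last, overrides = {}) {
  const urls = [];
  for (let n = first; n <= last; n++) urls.push(buildImageUrl(n, overrides));
  return urls;
}

console.log(buildImageUrls(491, 493));
```

Each generated address can then be fetched with whatever download tool you prefer and saved with a ".jpg" extension.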