How to force Google CSE to search a just-uploaded PDF file

I am using the free Google Custom Search Engine on my website to search mostly PDF and HTML files. It works well, but it doesn't find PDF files that have just been uploaded. Is that because Googlebot needs some time to crawl them? Is there any way to make them searchable instantly?
The website is : http://benchmarkinc.com.au/

Do you need it to automatically index newly uploaded PDFs or are you able to manually add those PDFs to the CSE after they've been uploaded?
You can add specific files to your index from the CSE control panel: go to Admin -> Indexing tab and add the URLs of the PDFs you want Google to index.
Apparently, it takes up to 24 hours for Google to index the PDF.

Related

Fetch a certain page of a pdf as an image from google drive

I have some 10,000 pages of hand-written scanned documents in Google Drive, spread across roughly 70 PDF documents.
I am making a spreadsheet index of these, with one row per page, where I note what is on each page by actually viewing it, reading it, and even fully typing it out if required.
I need a link that I can put in the spreadsheet which, when clicked, opens a certain page of the PDF as an image only, not the entire PDF. The PDF is in Google Drive. Is something like this possible in Google Drive? Or should I manually download all the PDFs, split them into images, and then re-upload and use those?
(example - java -jar pdfbox-app.jar PDFToImage -format jpg -quality 0.75 pdffile.pdf ; and then upload all this)
I have a feeling it must be possible, because when a PDF is opened in the browser the pages load one by one. It takes time, but the viewer shows them in some custom image+text format, so each page must be rendered separately somewhere. I also know that every Google Slides slide has an image version with a stable link, so I was thinking there might be something similar for PDFs.
There isn’t a parameter or feature for linking to a specific PDF page in the Google Drive file viewer.
As mentioned, you can link to a specific slide in Google Slides, but that is because native Google file types have additional features that PDFs lack.
A workaround I can think of is to create a comment on each page; each comment has its own ID.
After creating a comment, click the three vertical dots icon and choose Link to this comment.
Alternatively, you can send feedback to Google (on the file viewer page, click the three vertical dots icon and then Send feedback to Google), making sure to describe the proposed feature.

Load The Pages From The PDF File One By One

I have a PDF file (an e-book) of 19 MB containing 52 pages. I have already embedded it in my web page so that it is displayed there:
<embed src="file_name.pdf" width="800px" height="2100px" />
When I load my web page, it loads the whole PDF file (19 MB), which takes too long.
I want only the first page to load when the website loads. When the visitor goes to the 2nd page, that page should then be downloaded and shown to the visitor.
Is this possible, or is there any other method that would meet my need?
Thank You!
Yes, it is possible. However, all three of the following requirements must be met:
1. The PDF file needs to be saved in Linearized mode (also known as "Fast Web View").
2. The web server hosting the PDF file must support HTTP range request headers.
3. The user must be using a PDF viewer that supports reading Linearized PDF files.
If (1) is not true, then you need a tool or library that can convert non-linearized PDF files to linearized ones.
As for (3), unless you control which viewer the user uses, this is out of your hands. For example, at least on Windows 10, Firefox, Chrome and the new Chromium-based Edge all support reading Linearized PDF files.
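Requirements (1) and (2) can be checked before relying on page-by-page loading. The following is only a rough sketch, not a definitive test: it assumes that a linearized file carries a /Linearized entry within its first 1024 bytes and that a range-capable server advertises this with an Accept-Ranges: bytes response header. The function names are made up for illustration.

```python
import re


def is_linearized(pdf_bytes: bytes) -> bool:
    """Heuristic check: look for a /Linearized entry near the start of the file.

    Assumption: the linearization dictionary sits within the first 1024 bytes.
    """
    return re.search(rb"/Linearized\s+\d", pdf_bytes[:1024]) is not None


def supports_byte_ranges(headers: dict) -> bool:
    """Return True if the response headers advertise byte-range support."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "accept-ranges":
            return "bytes" in value.lower()
    return False
```

The headers could come from a HEAD request for the PDF URL, and the bytes from the first kilobyte of the file. A server may honour range requests without advertising them, so a False result from the second function is a hint rather than proof.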

Can I use Piwik to track which pages or articles of PDF users opened?

I would like Piwik to record which pages of an 8-page PDF are read by users. Is it possible?
You can only track that a PDF file has been downloaded; you cannot inject HTTP tracking requests into the PDF file itself. You might consider putting links inside the PDF, which you would then be able to track as soon as they are clicked.
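One way to make such links distinguishable per page is to tag each one with Piwik's campaign parameters. The sketch below is an illustration only: it assumes the links point to a page on your own site that already runs the Piwik tracker and that the default pk_campaign and pk_kwd parameter names are in use. The function name and the example URL are hypothetical.

```python
from urllib.parse import urlencode


def tracked_page_links(landing_url: str, pdf_name: str, page_count: int) -> list:
    """Build one campaign-tagged link per PDF page.

    Each link records the PDF name as the campaign and the page as the keyword.
    """
    links = []
    for page in range(1, page_count + 1):
        query = urlencode({"pk_campaign": pdf_name, "pk_kwd": f"page-{page}"})
        links.append(f"{landing_url}?{query}")
    return links


# Hypothetical landing page and PDF name for an 8-page document.
links = tracked_page_links("https://example.com/more", "brochure", 8)
```

A click on the link placed on page 3 would then appear in the campaign report under the keyword page-3. This shows which pages produced clicks, not which pages were actually read.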

Google Script for PDF image/screenshot import into Google Doc

I'm trying to write a script that takes images from a PDF file and puts them into a Google Doc Template. The PDF has a bunch of images in it, one image per page. What I want to do is grab the images one at a time (or take a screenshot of the entire page) and paste it into a new document. How can I access a single image at a time to import it into the google doc?
Thanks.
There is probably no way to work with the PDF file directly, but you could first convert the PDFs to Google Docs and then use the Document class, which can retrieve the images inside a doc quite easily.

Viewing large PDF files online as thumbnails

I need to upload a PDF file to my website so that users can view or download it, but the file is 30 MB, so when a user clicks the download link it takes a long time to download. Is there another solution that lets users at least view and read the contents of the file without downloading it, and download it only if required?
One idea: clicking the download link opens a new page where the user can view the individual pages as thumbnails, clicking a thumbnail zooms in, and a download link sits below (as in Google eBooks). How can this be done? Is it possible using jQuery/AJAX? Any thoughts or input would be highly appreciated.
You should be able to upload your PDF at http://view.samurajdata.se and then include the viewer for your PDF in an iframe.
If you like, you can even take the viewer link (something like http://view.samurajdata.se/psview.php?id=YOUR_PDF_ID), and get URLs for the individual pages as GIFs like http://view.samurajdata.se/rsc/YOUR_PDF_ID/tmp1.gif for page 1, tmp2.gif for page 2, etc. Then you can embed those GIFs how you want and customize the look and feel of your viewer.
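As a small sketch, the per-page GIF URLs can be generated from the pattern described above. This assumes the service still serves pages as tmp1.gif, tmp2.gif and so on; YOUR_PDF_ID remains a placeholder and the function name is made up for illustration.

```python
def page_gif_urls(pdf_id: str, page_count: int) -> list:
    """Return the GIF URL for each page, following the tmpN.gif pattern above."""
    base = "http://view.samurajdata.se/rsc"
    # Pages are numbered from 1, matching tmp1.gif for page 1.
    return [f"{base}/{pdf_id}/tmp{n}.gif" for n in range(1, page_count + 1)]


urls = page_gif_urls("YOUR_PDF_ID", 3)
```

Each URL can then be used as the src of an img element in your own thumbnail layout.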