Downloading PDFs without the link or .pdf extension specified in the href tag - pdf

I want to download PDFs from the page https://www.mca.gov.in/content/mca/global/en/data-and-reports/reports/monthly-information-bulletin.html, which does not clearly show the link or a .pdf extension in the href attribute. I have located the tag that holds the link, but I am unable to figure out the code:
    for link in soup.select("a[href$='.pdf']"):
        # try printing all PDF URLs from the page
        print(link)
How can I download these PDFs by changing this code? Please suggest different code if there is a better approach. Thank you.
I tried using the # that appears in the href attribute with this code:
    for link in soup.select("a[href$='.pdf']"):
        # try printing all PDF URLs from the page
        print(link)
But the PDFs were not downloaded.
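One possible approach, offered as a minimal sketch rather than a tested solution for this particular site: the selector `a[href$='.pdf']` only matches hrefs that literally end in `.pdf`, so it finds nothing when the links lack that extension. Instead, you can collect every anchor, resolve it against the page URL, and decide whether a target is a PDF from the response itself (the `%PDF-` signature at the start of the body, or the `Content-Type` header). The sketch uses only the standard library in place of BeautifulSoup; the names `AnchorCollector`, `candidate_links`, `looks_like_pdf`, and `download_pdfs`, the `User-Agent` header, and the output file naming are all assumptions for illustration. It fetches every linked resource in full, which is wasteful on a large page.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import Request, urlopen


class AnchorCollector(HTMLParser):
    """Collect the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def candidate_links(html, base_url):
    """Return absolute, de-duplicated targets, skipping '#', javascript: and mailto: hrefs."""
    parser = AnchorCollector()
    parser.feed(html)
    seen, result = set(), []
    for href in parser.hrefs:
        href = href.strip()
        if href.startswith("#") or href.lower().startswith(("javascript:", "mailto:")):
            continue
        # Resolve relative links and drop any #fragment part.
        url, _fragment = urldefrag(urljoin(base_url, href))
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


def looks_like_pdf(first_bytes, content_type=""):
    """PDF files start with the bytes %PDF-; the Content-Type header is a weaker hint."""
    return first_bytes.startswith(b"%PDF-") or "application/pdf" in content_type.lower()


def download_pdfs(page_url, out_dir="."):
    """Fetch the page, then save every linked resource that turns out to be a PDF."""
    headers = {"User-Agent": "Mozilla/5.0"}  # assumption: the site may reject the default agent
    with urlopen(Request(page_url, headers=headers)) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    for i, url in enumerate(candidate_links(html, page_url)):
        try:
            with urlopen(Request(url, headers=headers)) as resp:
                data = resp.read()
                ctype = resp.headers.get("Content-Type", "")
        except (OSError, ValueError):
            continue  # unreachable or unsupported link; skip it
        if looks_like_pdf(data[:5], ctype):
            with open(f"{out_dir}/file_{i}.pdf", "wb") as fh:
                fh.write(data)
```

Calling `download_pdfs("https://www.mca.gov.in/content/mca/global/en/data-and-reports/reports/monthly-information-bulletin.html")` would try this on the page from the question. If the hrefs there are only `#` placeholders that JavaScript fills in later, the real URLs are not in the static HTML and this sketch will not find them; in that case you would need to look at the network requests the page makes, or drive a real browser.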

Related

Embedding a external pdf link to my webpage without uploading the pdf file to server

I have an external link to a PDF (http://du.ac.in/du/uploads/Admissions/Cut-off/2016/First/290620161st_Cut_Off_DU_1.pdf). I want to show this PDF on my webpage without uploading the file to my server. Basically, I want embed code for a PDF viewer that also keeps the downloadable link to the PDF file.
I'm quoting part of an answer to another question; it should work here too.
I recommend checking out PDFObject, which is a JavaScript library to embed PDFs in HTML files. It handles browser compatibility pretty well and will most likely work on IE8.
In your HTML, you could set up a div to display the PDFs:
    <div id="pdfRenderer"></div>
Then, you can have JavaScript code to embed a PDF in that div:
    var pdf = new PDFObject({
        url: "http://du.ac.in/du/uploads/Admissions/Cut-off/2016/First/290620161st_Cut_Off_DU_1.pdf",
        id: "pdfRendered",
        pdfOpenParams: {
            view: "FitH"
        }
    }).embed("pdfRenderer");

How do I render a PDF from HTML with working named anchors?

Is there a way to keep a set of named anchors in a large HTML file clickable within a PDF generated by PhantomJS?
For example, say I have a table of contents or a list of FAQ questions. In the HTML, clicking a question or title takes me to its answer or content within the same file, which is great. But when the same HTML is rendered into a PDF, each named anchor becomes an absolute URL (e.g. http://example.com/render.html#anchor_1), so clicking it opens a browser at that URL instead of jumping to the content within the PDF file.
So, basically, is it possible (and how?) for a markup like this - https://fiddle.jshell.net/jyjuaaog/ to work within the generated PDF?
BTW, this works great when printing to a PDF file in Google Chrome, but the links end up broken when rendered in PhantomJS, so there must be something I'm missing that I can't seem to find in the docs.
Any ideas?
Thanks!
Apparently there's a bug in PhantomJS that prevents this. As suggested by PhantomJsCloud, a quick-and-dirty workaround is to replace the links with page links.

WordPress: Open all PDF links as iframes

My WordPress posts contain PDF links like this:
To show Doc place click here :-
http://domain.com/2012-04/item-1335086631.pdf
I need to convert each such link into an embed so the file is shown on my site.
How can I do that?

How do you provide download links for attached PDFs in Zotonic?

I would like to let my content authors upload PDFs and provide download links. Unfortunately, they get a page with a preview of the first page of the PDF instead of the PDF itself when they link to it.
How do you provide download links for attached PDFs in Zotonic?
The following snippet (given the RSC id of the media as pdf_rsc) will provide a download link:
Download {{ m.rsc[pdf_rsc].title }}

How to download file from inside Seam PDF

In our project we create a PDF using Seam PDF and store that PDF in the database.
The user can then search for the PDF and view it in their PDF viewer. This is a small portion of the code that is rendered to PDF:
    <p:html>
        <a:repeat var="file" value="#{attachment.files}" rowKeyVar="row">
            <s:link action="#{fileHandler.downloadById()}" value="#{file.name}" >
                <f:param name="fileId" value="#{file.id}"/>
            </s:link>
        </a:repeat>
When the pdf is rendered, a link is generated that points to:
/project/skjenkebevilling/status/status_pdf.seam?fileId=42&actionMethod=skjenkebevilling%2Fstatus%2Fstatus_pdf.xhtml%3AfileHandler.downloadById()&cid=16
As you can see, this link doesn't say much, and the servlet path seems to be missing.
If I replace /project with the servlet path:
localhost:8080/saksapp/skjenkebevilling/status/status_pdf.seam?fileId=42&actionMethod=skjenkebevilling%2Fstatus%2Fstatus_pdf.xhtml%3AfileHandler.downloadById%28%29&cid=16
then the download file dialog appears. So my question is: does anyone know how I can make it output the correct link? And why doesn't this s:link seem to work?
If I cannot do that, I will need to somehow search, replace, and edit the PDF, but that seems like a bit of a hack.
(This is running under JBoss)
Thank you for your time....
I found a workaround for this problem.
It seems I have to use s:link together with a normal a href tag.
Using only the href tag doesn't work for some reason.
    <s:link action="#{fileHandler.downloadById()}" value="#{file.name}" propagation="none">
        <f:param name="fileId" value="#{file.id}"/>
    </s:link>
    <a href="#{servletPath.path}?fileId=#{file.id}&actionMethod=#{path.replace('/','')}%2Fstatus%2Fstatus_pdf.xhtml%3AfileHandler.downloadById()&">
        download
    </a>
servletPath.path returns the servlet path, i.e. http://mydomain.com/download.seam
You can put login-required=true on download.seam if you want users to log in before downloading a file.
    @Observer("org.jboss.seam.security.loginSuccessful")
    public void servletPath() {
        HttpServletRequest request = (HttpServletRequest) FacesContext.getCurrentInstance().getExternalContext().getRequest();
        this.path = request.getRequestURL().toString().replace("login.seam", "download.seam");
    }
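To make the shape of the working link easier to see, here is a small sketch outside Seam that composes the same kind of query string and decodes it again. It is only an illustration of the URL structure from the question, not part of the Seam workaround; the function name `build_download_url` and its parameters are hypothetical, and the `cid` parameter is left out.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_download_url(servlet_path, file_id, view_id, action):
    """Compose a link of the form <servlet path>?fileId=...&actionMethod=<view>:<action>."""
    # urlencode percent-encodes '/', ':' and the parentheses, as in the working link.
    query = urlencode({"fileId": file_id, "actionMethod": f"{view_id}:{action}"})
    return f"{servlet_path}?{query}"


url = build_download_url(
    "http://mydomain.com/download.seam",
    42,
    "skjenkebevilling/status/status_pdf.xhtml",
    "fileHandler.downloadById()",
)
print(url)

# Decoding the query shows what the opaque-looking link actually carries.
params = parse_qs(urlsplit(url).query)
print(params["actionMethod"][0])
```

The decoded `actionMethod` value is the view id followed by the action expression, which is why the link only works when the path in front of it points at a servlet that can resolve that view.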