How to crawl a page whose link is hidden until it is clicked - import.io

I want to get data from a page whose link has to be clicked first.
I have tried capturing the link field with a crawler and an extractor, with column validation set to link and HTML, but it does not return the actual link.
Only after I click the link does a pop-up open, and that pop-up is where the data I want is.
Each landing page has around 50 such links, and I want to crawl each of them.
I tried this with a connector, but things get complex because there are around 90k queries. Additionally, the connector does not return the URL of the page, which would be helpful.

Extracting this data really depends on the website. Import is not able to extract data from popups. But, extracting the link path may be possible, depending on the structure of the website. If you are not able to extract the data with the tool, I would suggest using an xpath to obtain the link path.
To do this, navigate to the page you want this data from, right-click, and select "inspect element." Select where the link path is on the page, right-click again, and select "select xpath." Go back to your Extractor, select the "advanced settings" icon, and paste in your xpath. Again, this may not work, since it depends on how the website is structured, but it is still worth a try.
Thanks,
Meg
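
For readers who want to try the xpath idea outside the tool, here is a minimal Python sketch. It is not part of import.io. The sample markup, the `details` class name, and the `window.open(...)` pattern are all assumptions made up for illustration; a real page will differ, and real-world HTML is often not well-formed enough for the standard-library XML parser used here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for one landing page.
SAMPLE_PAGE = """
<div id="results">
  <a class="details" href="#" onclick="window.open('/item?id=101')">Item 101</a>
  <a class="details" href="/item?id=102">Item 102</a>
  <a class="other" href="/about">About</a>
</div>
"""


def extract_link_paths(markup, xpath=".//a[@class='details']"):
    """Return the link path of every anchor matched by the xpath.

    If the href is only a placeholder, fall back to a path found in the
    onclick handler (assumed here to call window.open with a quoted path).
    """
    root = ET.fromstring(markup)
    paths = []
    for anchor in root.findall(xpath):
        href = anchor.get("href", "")
        if href and href != "#":
            paths.append(href)
            continue
        match = re.search(r"window\.open\('([^']+)'", anchor.get("onclick", ""))
        if match:
            paths.append(match.group(1))
    return paths


print(extract_link_paths(SAMPLE_PAGE))
```

If the link path really is present in the page source like this, the extracted paths could then be fed to a crawler one by one. If the pop-up URL is built entirely by script at click time, this approach will not find it.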

Related

PDF file link question - can I link to a different page within the same file?

This question is about PDF format files, not JS or HTML. Inside a PDF file, I'd like to create a link to another page within the same file. This is useful for a Table of Contents type page that needs to link to other pages. But the "Add Link" item in the PDF editor in Acrobat doesn't seem to have this as an option, only the opening of web links or "documents" (external files, not the current one), etc. Any pointers are welcome.
Of course, that's possible.
In the Link tool, you first set the active area (that's where you click to go to the destination).
In a first dialog, Acrobat asks for the properties of the active area. In the Link Action area, select "Go to a page view". After clicking Next, you get a second dialog, directing you to navigate to the target view (page and zoom factor). Confirm, and you have set up your link.

How do I make a link from a PDF?

I'm doing a project and I need the table of contents to have links that take me to different parts of the same document. I know how to make a link in the program I'm using (Google Docs), but I'm not sure where to find the link to another page in the PDF. I know how to do this in Adobe, but I don't have access to the Pro portion of Adobe. Any help would be great!
Select the word or phrase you want to turn into a hyperlink.
Click the insert link button, enter the destination URL, and click OK.

Display the pdf through a link in database

I am currently working on the ADF side and I am stuck on a couple of issues.
I have a page where I have to display PDF files. The PDF files are on another site and the links are stored in a database column. But when I try to access those links, the files download rather than display. I need to display those PDF files in my inline frame instead of downloading them.
I have heard suggestions such as writing a bean, putting the file in the session, and displaying it in the page, but I am not clear on how to do that.
So please help me with this.
I also have a checkbox at the end of the page, and it should be enabled only when the displayed PDF has been scrolled to the end.
Please help me solve these issues.
When you create a link to a PDF there is only so much you can do to make it display in the browser. The most important thing you must do on the server that delivers the PDF is to make sure it is presented with the correct MIME-type and without a content-disposition header value of attachment.
After that, it's up to the browser to either show it in a browser tab or to download the file. I know Chrome will show the PDF in the browser when it's linked to, not sure if it also does that when it's linked in an iframe.
I don't think there's a reliable way to make it work the way you want, simply because it's very browser dependent.
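
To make the header advice concrete, here is a rough sketch of a server response that asks the browser to display a PDF rather than download it. It is written in Python as a small WSGI app purely for illustration, not as ADF code; `make_pdf_app`, `pdf_inline_headers`, and the default file name are hypothetical names. In an ADF application the same two headers would be set wherever the PDF bytes are streamed back.

```python
def pdf_inline_headers(filename, size):
    """Headers that describe the body as a PDF to be shown inline."""
    return [
        # Correct MIME type, so the browser knows it is a PDF.
        ("Content-Type", "application/pdf"),
        # "inline" rather than "attachment", so no download is requested.
        ("Content-Disposition", f'inline; filename="{filename}"'),
        ("Content-Length", str(size)),
    ]


def make_pdf_app(pdf_bytes, filename="document.pdf"):
    """Build a WSGI app that always answers with the given PDF bytes."""

    def app(environ, start_response):
        start_response("200 OK", pdf_inline_headers(filename, len(pdf_bytes)))
        return [pdf_bytes]

    return app


# To try it locally, one option is the standard library's wsgiref:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, make_pdf_app(open("some.pdf", "rb").read())).serve_forever()
```

Even with these headers in place, whether the PDF renders inside an iframe is still up to the browser, as noted above.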

How to download a CSV for all links in google's index?

If a website is indexed in Google, what is the best way to find all of its indexed URLs? Any tool recommendations would be welcome.
Open Webmaster Tools.
Click Search Traffic, then choose Links to Your Site.
A "Who links the most" section appears on the left side; click "More" under it.
A longer list of backlinks now appears.
In the top right corner, a "Download latest links" button appears.
Click this button to download the links in CSV format.
If you want to check which pages are indexed in Google, follow this step:
In the Google search bar, just type site:yoursite.com
Thanks,
Anandhan.P
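
Once the CSV has been downloaded, a few lines of Python can turn it into a de-duplicated list of URLs. This is only a sketch: the sample data and the `Links` column name are assumptions, so check the header row of the real export and adjust the `column` argument accordingly.

```python
import csv
import io

# Hypothetical export; the real file's column names may differ.
SAMPLE_CSV = """Links,First discovered
http://example.org/post-1,2015-01-02
http://blog.example.net/a,2015-01-05
http://example.org/post-1,2015-02-01
"""


def unique_links(csv_text, column="Links"):
    """Return the URLs from one CSV column, de-duplicated, in file order."""
    seen = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = (row.get(column) or "").strip()
        if url:
            seen.setdefault(url, None)  # dict keys keep first-seen order
    return list(seen)


for url in unique_links(SAMPLE_CSV):
    print(url)
```

For a real file, read its contents with `open(path, newline="", encoding="utf-8").read()` and pass the text to `unique_links`.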

Is there a way to display a PDF in an asp.net webpage without frames?

I have a PDF document that needs to be pulled up in the browser, edited, and saved. I can save via the embedded Adobe toolbar, along with all the other Acrobat functions. What I am trying to find out is whether there is a way to display the PDF in a webpage alongside web controls.
For example, in the top part of the webpage I have a dropdown list of PDFs. I select one, and the PDF opens in the bottom part of the webpage.
Thanks.
Are you looking for something like Scribd's iPaper viewer?
You can embed it on your site or host with them.
This is typically done with an iframe.
Sorry, you'll have to use either frames or iframes. Perhaps you can also do it with an <object> tag, but that might get browser-specific.
I would contact the people at ceTe (makers of DynamicPDF). Their product permits you to dynamically replace your page output with a PDF file but this involves changing the entire page (the mime-type will be pdf). Is it possible to output the page to a panel instead? I don't think so, but they would be the people I would turn to.