Download all secret links from a map design website - selenium

There is a website that shows links on a map (the map layer currently can't be displayed, but the links can be shown as points).
To reach that page, follow these steps (Pictures 1, 2 and 3 also show the way):
Firstly, open this website: 'http://svtbilgi.dsi.gov.tr/Sorgu.aspx',
Secondly, choose '15. Kizilirmak Havzasi' from the 'Havza' tab,
Finally, click the 'sorgula' button.
After the final step, you should see the page ('http://svtbilgi.dsi.gov.tr/HaritaNew.aspx') where the points are shown on a map.
Normally I can use Selenium to download web pages, or I can grab all links using different libraries. However, these methods can't obtain these links because they are embedded in an almost hidden way.
I would like to download all the links that these points have.
For example, this script doesn't continue past the 'parent_handle = driver.current_window_handle' line, and I don't know why:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Firefox(executable_path=r'D:\geckodriver.exe')
driver.get("http://svtbilgi.dsi.gov.tr/Sorgu.aspx")
driver.find_element_by_id("ctl00_hld1_cbHavza").click()
Select(driver.find_element_by_id("ctl00_hld1_cbHavza")).select_by_visible_text("15. Kizilirmak Havzasi")
driver.find_element_by_id("ctl00_hld1_cbHavza").click()
driver.find_element_by_id("ctl00_hld1_btnListele").click()
parent_handle = driver.current_window_handle
all_urls = []
all_images = driver.find_elements_by_xpath("//div[contains(@id,'OL_Icon')]/img")
for image in all_images:
    image.click()
    for handle in driver.window_handles:
        if handle != parent_handle:
            driver.switch_to.window(handle)
            WebDriverWait(driver, 5).until(lambda d: d.execute_script('return document.readyState') == 'complete')
            all_urls.append(driver.current_url)
            driver.close()
    driver.switch_to.window(parent_handle)

Why not click them one by one and then get the URL of the opened window, using driver.current_url?
In the below code, I first wait for all the images and then perform the click action using the ActionChains class, since the normal Selenium click() wasn't working.
Complete code in Python -
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome(executable_path=r'D:\Test automation\chromedriver.exe')
driver.get("http://svtbilgi.dsi.gov.tr/Sorgu.aspx")
driver.find_element_by_id("ctl00_hld1_cbHavza").click()
Select(driver.find_element_by_id("ctl00_hld1_cbHavza")).select_by_visible_text("15. Kizilirmak Havzasi")
driver.find_element_by_id("ctl00_hld1_btnListele").click()
parent_handle = driver.current_window_handle
driver.maximize_window()
all_urls = []
all_images = WebDriverWait(driver, 15).until(EC.presence_of_all_elements_located((By.XPATH, "//div[contains(@id,'OL_Icon')]/img")))
print(len(all_images))
for image in all_images:
    webdriver.ActionChains(driver).move_to_element(image).click(image).perform()
    for handle in driver.window_handles:
        if handle != parent_handle:
            driver.switch_to.window(handle)
            WebDriverWait(driver, 15).until(lambda d: d.execute_script('return document.readyState') == 'complete')
            all_urls.append(driver.current_url)
            driver.close()
    driver.switch_to.window(parent_handle)
print(all_urls)
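If you are on Selenium 4 or newer, the find_elements_by_* helpers and switch_to_window used above have been removed. A minimal sketch of the same click-and-collect loop with the current API, assuming the page structure is unchanged:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select, WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Selenium 4.6+ can resolve the browser driver itself (Selenium Manager)
driver = webdriver.Chrome()
driver.get("http://svtbilgi.dsi.gov.tr/Sorgu.aspx")
Select(driver.find_element(By.ID, "ctl00_hld1_cbHavza")).select_by_visible_text("15. Kizilirmak Havzasi")
driver.find_element(By.ID, "ctl00_hld1_btnListele").click()

parent_handle = driver.current_window_handle
all_urls = []
all_images = WebDriverWait(driver, 15).until(
    EC.presence_of_all_elements_located((By.XPATH, "//div[contains(@id,'OL_Icon')]/img")))
for image in all_images:
    webdriver.ActionChains(driver).move_to_element(image).click(image).perform()
    for handle in driver.window_handles:
        if handle != parent_handle:
            driver.switch_to.window(handle)   # switch_to_window() no longer exists in Selenium 4
            all_urls.append(driver.current_url)
            driver.close()
    driver.switch_to.window(parent_handle)    # back to the map window before the next click
print(all_urls)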

Related

Scraping web-page with button-multitems click

I'm trying to scrape this web page: https://whalewisdom.com/filer/fisher-asset-management-llc#tabholdings_tab_link
I would like to set up the Python Selenium code so that it correctly sets the per-page dropdown to "50" rows per page.
But my code clicks on the wrong button. Where is the error in my code?
options = webdriver.FirefoxOptions()
options.binary_location = r'C:/Users/Mozilla Firefox/firefox.exe'
driver = selenium.webdriver.Firefox(executable_path='C:/geckodriver.exe' , options=options)
driver.execute("get", {'url': 'https://whalewisdom.com/filer/fisher-asset-management-llc#tabholdings_tab_link'})
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//label[@id='qtr-1-label']"))))
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//*[@class='btn btn-default dropdown-toggle']"))).click()
Thank you for your help.
Your code clicked on the wrong button because there are multiple elements with exactly the same class, and you are fetching the first one and clicking on it.
Also, I see that the page sometimes shows a popup, which may make other elements non-interactable. So we want the popup to be closed first (if it appeared) and then move ahead.
Using Chrome driver
Setup and Imports
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
# REPLACE YOUR CHROME PATH HERE
chrome_path = r"C:\Users\hpoddar\Desktop\Tools\chromedriver_win32\chromedriver.exe"
s = Service(chrome_path)
driver = webdriver.Chrome(service=s)
Fetch the page
driver.get('https://whalewisdom.com/filer/fisher-asset-management-llc#tabholdings_tab_link')
Close the popup(if appeared)
try:
    popup = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//a[@id='dfwid-close-184302']")))
    popup.click()
except TimeoutException:
    print("No Popup appeared on the page")
Click on dropdown and the menu item 50
dropdown = driver.find_element(By.CSS_SELECTOR, '.btn-group.dropdown')
dropdown.click()
ele50 = driver.find_element(By.XPATH, '//li[#role="menuitem"]/a[contains(text(), "50")]')
ele50.click()
Output
The above code clicks on item 50
Using Firefox driver
The imports would be the same as above, and the following code would also remain the same, with just a minor change.
# REPLACE YOUR FIREFOX DRIVER PATH HERE
firefoxpath = r'C:\Users\hpoddar\Desktop\Tools\firefoxdriver\geckodriver.exe'
s = Service(firefoxpath)
driver = webdriver.Firefox(service=s)
driver.get('https://whalewisdom.com/filer/fisher-asset-management-llc#tabholdings_tab_link')
try:
    popup = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//a[@id='dfwid-close-184302']")))
    popup.click()
except TimeoutException:
    print("No Popup appeared on the page")
dropdown = driver.find_element(By.CSS_SELECTOR, '.btn-group.dropdown')
dropdown.click()
ele50 = driver.find_element(By.XPATH, '//li[#role="menuitem"]/a[contains(text(), "50")]')
ele50.click()
Output
which similarly clicks on the desired element
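After the page size changes to 50, you would typically wait for the holdings table to re-render before scraping it. A hedged sketch, assuming the rows are ordinary tr elements inside a table (the selector is an assumption, not verified against the live markup):
# Wait for the holdings rows to be present after switching to 50 per page, then read them
rows = WebDriverWait(driver, 20).until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, "table tbody tr")))
print(len(rows))
for row in rows[:5]:  # print the first few rows as a sanity check
    print(row.text)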

Can't determine the element on Spotify bar search

Path = r"C:\WebDriverEdge\chromedriver.exe"
service = Service(Path)
options = Options()
options.add_argument('--user-data-dir=C:\\Users\\Admin\\AppData\\Local\\Google\\Chrome\\User Data\\')
options.add_argument("--profile-directory=Profile 1")
#connect to driver
driver = webdriver.Chrome(service = service, options = options)
driver.get("https://open.spotify.com/search")
x_path = '//*[@id="main"]/div/div[2]/div[1]/header/div[3]/div/div/form/input'
search = driver.find_element(By.XPATH, x_path)
action = webdriver.ActionChains(driver)
action.move_to_element(search).send_keys("Let me").perform()
I'm trying to click on the search bar at Spotify and use it to search. My problem is that when I am already logged in, my code gets the error "unable to find element", but without signing in I can fill the search bar easily.
I don't know why. Has anyone run into this before? Thanks in advance.
P.S.: the XPath is still the same.
I'd rather use the form[role='search'] input CSS selector with an explicit wait, like below:
driver.get("https://open.spotify.com/search")
driver.maximize_window()
wait = WebDriverWait(driver, 20)
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "form[role='search'] input"))).send_keys('Let me')
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
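If you also want to submit the query instead of relying on Spotify's live filtering, a small follow-up sketch (assuming the same search input) that presses ENTER after typing:
from selenium.webdriver.common.keys import Keys

search_box = wait.until(EC.visibility_of_element_located(
    (By.CSS_SELECTOR, "form[role='search'] input")))
search_box.send_keys("Let me", Keys.RETURN)  # type the query, then press ENTER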

Get text inside the href link inside the span marker using Selenium

How can I extract the text that is displayed as part of the link inside the span tag?
<span class="pull-left w-100 font30 medium_blue_type mb10"><a href='/XLY'>XLY</a></span> <span class="w-100">Largest Allocation</span>
Output:
XLY
I've tried several approaches, among them
elems = driver.find_elements_by_class_name("span.pull-left.w-100.font30.medium_blue_type.mb10")
elems = driver.find_element_by_xpath('.//span[@class = "pull-left w-100 font30 medium_blue_type mb10"]')
but can't get it working. The website is https://www.etf.com/stock/TSLA.
EDIT:
Is it possible to do it without opening the window in the browser, e.g. using "headless" option?
op = webdriver.ChromeOptions()
op.add_argument('headless')
driver = webdriver.Chrome(CHROME_DRIVER_PATH, options=op)
If you prefer to have text-based locators, you can use the below:
//span[text()='Largest Allocation']/../span
You should click on the cookie consent I Understand button first.
Make use of explicit waits.
So your effective code would be:
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get("https://www.etf.com/stock/TSLA")
try:
    wait.until(EC.element_to_be_clickable((By.LINK_TEXT, "I Understand"))).click()
    print("Clicked on I understand button")
except:
    pass
txt = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[text()='Largest Allocation']/../span"))).text
print(txt)
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Output:
Clicked on I understand button
XLY
Process finished with exit code 0
If you are looking for locators not based on text, use the below line of code:
txt = wait.until(EC.visibility_of_element_located((By.XPATH, "(//span[contains(@class,'medium_blue_type')]//a)[2]"))).text
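Regarding the EDIT about headless mode: the same waits and locators generally work without a visible browser window. A minimal sketch, assuming a reasonably recent Chrome (newer versions use --headless=new, older ones accept plain --headless); you may still need the cookie-consent click from above if the banner appears in headless mode too:
from selenium import webdriver

op = webdriver.ChromeOptions()
op.add_argument("--headless=new")            # or "--headless" on older Chrome versions
op.add_argument("--window-size=1920,1080")   # give the headless browser a real viewport
driver = webdriver.Chrome(options=op)
driver.get("https://www.etf.com/stock/TSLA")
wait = WebDriverWait(driver, 30)
txt = wait.until(EC.visibility_of_element_located(
    (By.XPATH, "//span[text()='Largest Allocation']/../span"))).text
print(txt)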
There are several possible problems here:
Maybe you are missing a delay.
The locator you are using may not be unique.
I can't see where you are extracting the text or attribute value from the returned web element.
The web element can be inside an iframe, etc.
Based on the currently shared information, you can try adding a wait and extracting the web element value as follows:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 20)
href = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[@class='pull-left w-100 font30 medium_blue_type mb10']/a"))).get_attribute("href")
Use the following xpath to identify the href link.
//div[./span[text()='Largest Allocation']]//a
You need to induce some delay to get the element.
Use WebDriverWait() and wait for visibility of the element.
To get the text:
print(WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH, "//div[./span[text()='Largest Allocation']]//a"))).text)
To get the href:
print(WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH, "//div[./span[text()='Largest Allocation']]//a"))).get_attribute("href"))
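Note that get_attribute("href") usually returns the fully resolved URL (e.g. https://www.etf.com/XLY) rather than the literal /XLY that appears in the markup. If you need the raw attribute value and are on Selenium 4+, a hedged alternative is get_dom_attribute:
link = WebDriverWait(driver, 10).until(EC.visibility_of_element_located(
    (By.XPATH, "//div[./span[text()='Largest Allocation']]//a")))
print(link.get_attribute("href"))       # resolved URL, e.g. https://www.etf.com/XLY
print(link.get_dom_attribute("href"))   # Selenium 4+: literal attribute value, e.g. /XLY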
You need to import the below libraries:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Error: 'list' object has no attribute 'click' - Selenium Webdriver [duplicate]

I'd like to click the button 'Annual' on a page that is by default set to 'Quarterly'. There are two links that are basically called the same, except that one has data-ptype="Annual", so I tried to copy the XPath to click the button (I also tried other options, but none of them worked).
However, I get the AttributeError: 'list' object has no attribute 'click'. I read a lot of similar posts but wasn't able to fix my problem, so I assume that the JavaScript event must be called/clicked/performed somehow differently. I'm stuck.
from selenium import webdriver
link = 'https://www.investing.com/equities/apple-computer-inc-balance-sheet'
driver = webdriver.Firefox()
driver.get(link)
elm = driver.find_elements_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]").click()
The HTML is the following:
<a class="newBtn toggleButton LightGray" href="javascript:void(0);" data-type="rf-type-button" data-ptype="Annual" data-pid="6408" data-rtype="BAL">..</a>
You need to use find_element_by_xpath, not find_elements_by_xpath, which returns a list:
driver.find_element_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]").click()
Also, I think it is better to use waits, for example:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.add_argument("--window-size=1920,1080")
driver = webdriver.Firefox(firefox_options=options)
driver.get('https://www.investing.com/equities/apple-computer-inc-balance-sheet')
path = "/html/body/div[5]/section/div[8]/div[1]/a[1]"
try:
    element = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.XPATH, path)))
    element.click()
finally:
    driver.quit()
I would still suggest you go with link text over XPath. The reason is that this XPath: /html/body/div[5]/section/div[8]/div[1]/a[1] is absolute and can fail if one more div is added to or removed from the HTML, whereas the chances of the link text changing are very minimal.
So, Instead of this code :
elm = driver.find_elements_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]").click()
try this code :
annual_link = driver.find_element_by_link_text('Annual')
annual_link.click()
And yes, @Druta is right: use find_element for one web element and find_elements for a list of web elements. It is always good to have an explicit wait.
Create instance of explicit wait like this :
wait = WebDriverWait(driver,20)
and use the wait reference like this :
wait.until(EC.element_to_be_clickable((By.LINK_TEXT, 'Annual')))
UPDATE:
from selenium import webdriver
link = 'https://www.investing.com/equities/apple-computer-inc-balance-sheet'
driver = webdriver.Firefox()
driver.maximize_window()
wait = WebDriverWait(driver,40)
driver.get(link)
driver.execute_script("window.scrollTo(0, 200)")
wait.until(EC.element_to_be_clickable((By.LINK_TEXT, 'Annual')))
annual_link = driver.find_element_by_link_text('Annual')
annual_link.click()
print(annual_link.text)
make sure to import these :
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
As per the documentation, find_elements_by_xpath(xpath) returns a list with the matching elements if any were found, or else an empty list. Python's list has no click() method associated with it. Instead, the element returned by find_element_by_xpath(xpath) has the click() method associated with it. So you have to use the find_element_by_xpath(xpath) method, inducing a wait through WebDriverWait in conjunction with expected_conditions set to element_to_be_clickable(locator), as follows:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='newBtn toggleButton LightGray' and @data-type='rf-type-button']"))).click()
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
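A hedged alternative is to key the locator on the data-ptype attribute shown in the question's HTML, which is what actually distinguishes the Annual toggle from the Quarterly one:
# data-ptype="Annual" is taken from the snippet in the question
WebDriverWait(driver, 20).until(EC.element_to_be_clickable(
    (By.CSS_SELECTOR, "a[data-ptype='Annual']"))).click()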
Notice that find_elements_by_xpath is plural: it returns a list of elements, not just one. The list can contain no elements, exactly one, or more.
You can for example click the first match with:
driver.find_elements_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]")[0].click()
or iterate through the list and click all these elements, or you can use the find_element_by_xpath (which returns a single element, if it can be found):
driver.find_element_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]").click()
For me it was not working, and I tried a lot of tricks, none of which worked. Some people recommended driver.implicitly_wait(10) instead of time.sleep(10), which didn't work either. So please try adding time.sleep(10) both above and below the .click() line and check whether it works.
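For completeness, the workaround described above looks roughly like the sketch below. time.sleep is a blunt instrument, so the explicit waits shown in the other answers are normally the better fix:
import time

time.sleep(10)                                           # let the page settle before locating the element
annual_link = driver.find_element_by_link_text('Annual')
annual_link.click()
time.sleep(10)                                           # give the annual figures time to load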

Switch to the frame that contains the login prompts in Selenium

I'm trying to use Selenium to log into ESPN. A solution that used to work is detailed here. In order to log in I need to find the frame that has the username and password fields and switch to that frame. Unfortunately, the numerical index of that frame isn't always the same. I decided to just try them all, but as soon as I've switched to one frame, switching to the next fails with selenium.common.exceptions.StaleElementReferenceException: Message: The element reference of <iframe class="ob-pifr"> stale: either the element is no longer attached to the DOM or the page has been refreshed. So I'm looking for:
A way to switch frames without the StaleElementReferenceException
A way to check whether a frame is the one I want before I switch to it
Some other solution, though I'd prefer something introspective to a magic (and presumably fragile) formula like "It's always the third frame from the end".
Here's some sample code that leads to the StaleElementReferenceException:
from time import sleep
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
driver = webdriver.Firefox(executable_path = '/usr/local/bin/geckodriver')
driver.get("http://games.espn.go.com/ffl/signin")
WebDriverWait(driver, 1000).until(EC.presence_of_all_elements_located((By.XPATH,"(//iframe)")))
elem = None
frms = driver.find_elements_by_xpath("(//iframe)")
print("Found {} frames", len(frms)) # Varies from 6 to 8
for count, frm in enumerate(frms):
print("Trying frm[{}]".format(count))
driver.switch_to.frame(frm)
sleep(2)
try:
# The command below will fail the second time around with
# `either the element is no longer attached to the DOM or the page has been refreshed`
elem = driver.find_element_by_xpath("(//input)[1]")
except NoSuchElementException:
pass
else:
break
The frame id is disneyid-iframe, which opens the login popup, so first you need to switch into it:
driver.switch_to.frame(driver.find_element_by_id("disneyid-iframe"))
and then perform send_keys like:
driver.find_element_by_xpath("//input[#type='email']").send_keys("emailid")
driver.find_element_by_xpath("//input[#type='password']").send_keys("password")
The other way to switch into a frame is an explicit wait. It waits until the frame is available and, once it is, switches into it:
wait = WebDriverWait(driver, 60)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "disneyid-iframe")))
Your final code will be:
from time import sleep
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
driver = webdriver.Firefox(executable_path = '/usr/local/bin/geckodriver')
driver.get("http://games.espn.go.com/ffl/signin")
wait = WebDriverWait(driver, 60)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "disneyid-iframe")))
driver.find_element_by_xpath("//input[#type='email']").send_keys("emailid")
driver.find_element_by_xpath("//input[#type='password']").send_keys("password")
driver.find_element_by_xpath("//button[#type='submit']").click()
Note: Please check the syntax as per Python.
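If you still need to probe the frames one by one (for example, if the id ever changes), the StaleElementReferenceException from the question can usually be avoided by switching back to the top-level document and re-finding the iframes on every iteration, because switching frames invalidates the iframe elements found earlier. A rough sketch:
from selenium.webdriver.common.by import By

frame_count = len(driver.find_elements(By.TAG_NAME, "iframe"))
for index in range(frame_count):
    driver.switch_to.default_content()                        # always start from the top-level page
    frames = driver.find_elements(By.TAG_NAME, "iframe")      # re-find to avoid stale references
    driver.switch_to.frame(frames[index])
    if driver.find_elements(By.XPATH, "//input[@type='email']"):  # does this frame hold the login form?
        print("Login frame found at index", index)
        break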