How do I effectively use elements retrieved from Selenium that are stored in variables? I am using Python. In the program below:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.select import Select
driver = webdriver.Firefox()
driver.get("https://boards.4chan.org/wg/archive")
matching_threads = []
key = "Pixel"
for i in driver.find_elements_by_class_name("teaser-col"):
    if key in i.text:
        matching_threads.append(i)
        matched_thread = i
print(matching_threads)
driver.quit()
I get the following from the printout of matching_threads:
[<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="aa74a4a6-5bb2-4b48-92b6-50f5d51a9e5c", element="59b6076f-a5a2-4862-9c1f-028025e4b567")>]
How can I use that output to select said element in Selenium and interact with it? What I am trying to do is go to that element and then click on the element to the right of it. What I am failing to understand is how to retrieve the element in Selenium using the information stored in matching_threads.
If anyone can help me, I would very much appreciate it.
To click on the following td that contains an a tag with class quotelink:
i.find_element_by_xpath(".//following::td[1]/a[@class='quotelink']").click()
Now if clicking moves the page to another, you could just grab the href values, insert them into a list, and then loop through them with driver.get(). If it opens a new tab you should be fine.
.get_attribute('href')
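For example, a minimal sketch of that approach, building on the matching_threads list from the question; it assumes each matched cell really has such a neighbouring quotelink anchor:
hrefs = []
for i in matching_threads:
    # the td following each matched teaser cell holds the quotelink anchor
    link = i.find_element_by_xpath(".//following::td[1]/a[@class='quotelink']")
    hrefs.append(link.get_attribute('href'))
# visit each collected link in turn
for href in hrefs:
    driver.get(href)
    # ... interact with the thread page here ...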
This is my first time using Selenium for web scraping, and I'm fairly new to Python. I have tried to scrape a Swedish housing site to extract price, address, area, size, etc., for every listing from a specific URL that shows all houses for sale in an area called "Lidingö".
I managed to bypass the pop-up window for accepting cookies.
However, the output I get from the terminal is blank when the script runs. I get nothing, not an error, not any output.
What could possibly be wrong?
The code is:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
s = Service("/Users/brustabl1/hemnet/chromedriver")
url = "https://www.hemnet.se/bostader?location_ids%5B%5D=17846&item_types%5B%5D=villa"
driver = webdriver.Chrome(service=s)
driver.maximize_window()
driver.implicitly_wait(10)
driver.get(url)
# The cookie button clicker
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[62]/div/div/div/div/div/div[2]/div[2]/div[2]/button"))).click()
lists = driver.find_elements(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]')
for list in lists:
    adress = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[2]/a/div[2]/div/div[1]/div[1]/h2')
    area = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]/div/div[1]/div[1]/div/span[2]')
    price = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]/div/div[2]/div[1]/div[1]')
    rooms = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]/div/div[2]/div[1]/div[3]')
    size = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]/div/div[2]/div[1]/div[2]')
    print(adress.text)
There are a lot of flaws in your code...
lists = driver.find_elements(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]')
In your code, this returns a list of elements in the variable lists.
for list in lists:
    adress = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[2]/a/div[2]/div/div[1]/div[1]/h2')
    area = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]/div/div[1]/div[1]/div/span[2]')
    price = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]/div/div[2]/div[1]/div[1]')
    rooms = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]/div/div[2]/div[1]/div[3]')
    size = list.find_element(By.XPATH, '//*[@id="result"]/ul[1]/li[1]/a/div[2]/div/div[2]/div[1]/div[2]')
    print(adress.text)
You are not storing the value of each address in a list; instead, you are overwriting it on each iteration. And since each XPath points at one exact element, your loop is selecting the same elements over and over again!
Also, scraping text through Selenium is bad practice; use BeautifulSoup instead.
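As a rough sketch of the fix, locate one element per listing card and then search relative to each card with a leading dot. The relative paths below are illustrative guesses, not the real Hemnet markup:
# one element per listing card (note li without an index)
cards = driver.find_elements(By.XPATH, '//*[@id="result"]/ul[1]/li/a/div[2]')
results = []
for card in cards:
    # the leading "." scopes the search to the current card
    adress = card.find_element(By.XPATH, './div/div[1]/div[1]/h2')
    price = card.find_element(By.XPATH, './div/div[2]/div[1]/div[1]')
    results.append((adress.text, price.text))
print(results)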
I am trying to select Google's search input box, but I can't find the right ID. Using Selenium, I want to send keys or search terms to the Google input box. Could someone please suggest the correct ID or XPath selection?
from selenium import webdriver
from time import sleep
from bs4 import BeautifulSoup
driver = webdriver.Chrome('C:/chromedriver/chromedriver.exe')
driver.get("https://www.google.com")
search_bar = driver.find_element_by_id('input')
search_bar.send_keys("I want to be a Genius")
search_bar.submit()
#sleep(5)
#end
driver.close()
Here is a screenshot of the current Google page source for the search input.
Try this code:
search_bar = driver.find_element_by_name('q')
search_bar.send_keys("I want to be a Genius")
search_bar.submit()
It worked for me. Using Chrome to inspect the search box element, you can see it has an attribute called name with the value q.
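Note that on Selenium 4 the find_element_by_name helpers are removed; here is a minimal sketch of the same answer using the By API:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.google.com")
# locate the search box by its name attribute, "q"
search_bar = driver.find_element(By.NAME, 'q')
search_bar.send_keys("I want to be a Genius")
search_bar.submit()
driver.quit()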
I'm fairly new to Selenium and I'm building a scraper to extract info from a table.
I'm able to access the table body by ID with no problem, but when I try to access its children they are not found.
The inspector shows the XPath for the first cell as //*[@id="tb_list"]/tr[1]/td[1], but
find_element_by_xpath('//*[@id="tb_list"]/tr[1]/td[1]')
can't find it.
I also tried the following to no avail.
table = driver.find_element_by_id("tb_list")
table.find_element_by_xpath(".//tr[1]/td[1]")
It's able to find tb_list but fails to locate the children:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":".//tr[1]/td[1]"}
Everywhere I looked, people suggest one of these two methods; what am I doing wrong? The table is dynamically populated from a database; could this be an issue?
I'm using Python and the Chrome web driver.
I'm hesitant to give a snippet of the HTML, as the site is not publicly available and I don't own it.
[1] selects the first matching node. So the XPath:
//*[@id="tb_list"]/tr[1]/td[1]
can be optimized as:
//*[@id="tb_list"]/tr/td
Effectively the line of code would be:
driver.find_element(By.XPATH, "//*[@id='tb_list']/tr/td")
Ideally, to locate the element you need to induce WebDriverWait for visibility_of_element_located(), and you can use the following Locator Strategy:
Using XPATH:
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='tb_list']/tr/td")))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
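If the table is populated dynamically, the rows may simply not exist yet when the lookup runs, which is exactly what the wait above addresses. A usage sketch that waits and then reads every cell; the descendant axis (// instead of /) also tolerates an implicit tbody, which browsers insert into tables written without one:
wait = WebDriverWait(driver, 20)
# block until at least one cell is visible
wait.until(EC.visibility_of_element_located((By.XPATH, "//*[@id='tb_list']//td")))
# "//td" matches cells at any depth below tb_list
cells = driver.find_elements(By.XPATH, "//*[@id='tb_list']//td")
for cell in cells:
    print(cell.text)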
I tried
driver.find_elements_by_xpath("//*[contains(text()),'panel')]")
but it's only pulling through a single result when there should be about 25.
I want to find all the element IDs containing the word "panel" on a webpage in Selenium and create a list of them. How can I do this?
The XPath in your question has an error; not sure if it is a typo, but this will not fetch any results.
There is an extra parenthesis.
Instead of :
driver.find_elements_by_xpath("//*[contains(text()),'panel')]")
It should be :
driver.find_elements_by_xpath("//*[contains(text(),'panel')]")
Tip: To test your locators (XPath/CSS), instead of using them directly in code, try them out in a browser first.
Example for Chrome:
Right click on the web page you are trying to automate
Click Inspect and do a CTRL-F.
Type in your xpath and press ENTER
You should be able to scroll through all the matched elements and also verify the total match count
To collect all the elements containing a certain text e.g. panel using Selenium you have to induce WebDriverWait for the visibility_of_all_elements_located() and you can use the following Locator Strategy:
Using XPATH and contains():
elements = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//*[contains(., 'panel')]")))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
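A short usage sketch, collecting the matched texts into a Python list once the wait returns (assuming driver is already on the target page):
elements = WebDriverWait(driver, 10).until(
    EC.visibility_of_all_elements_located((By.XPATH, "//*[contains(., 'panel')]")))
# build a plain list of the matched elements' text
texts = [element.text for element in elements]
print(len(texts))
print(texts)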
I have written a Python script that aims to take data off a website, but I am unable to navigate and loop through pages to collect the links. The website is https://www.shearman.com/people? The pagination markup on the site looks like this:
ul class="results-pagination"
li class/a href onclick="PageRequest('2', event)"
When I run the code below, it says that the element is not attached to the page:
try:
    # this navigates to the next page
    driver.find_element_by_xpath('//ul[@class="results-pagination"]/li/[@onclick=">"]').click()
    time.sleep(5)
except NoSuchElementException:
    break
Any ideas what I'm doing wrong on this?
Many thanks in advance.
Chris
You can try this code:
browser.get("https://www.shearman.com/people")
wait = WebDriverWait(browser, 30)
main_tab = browser.current_window_handle
navigation_buttons = browser.find_elements_by_xpath('//ul[@class="results-pagination"]//descendant::a')
size = len(navigation_buttons)
print('this is the length of the list:', size)
i = 0
while i < size:
    ActionChains(browser).key_down(Keys.CONTROL).click(navigation_buttons[i]).key_up(Keys.CONTROL).perform()
    browser.switch_to.window(main_tab)
    i = i + 1
    if i >= size:
        break
Make sure to import these:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
Note this will open each link in a new tab. As per your requirement, you can click on the next button using this XPath: //ul[@class="results-pagination"]//descendant::a
If you want to open the links one by one in the same tab, then you will have to handle stale element references, since once you navigate away from the main page all the elements will become stale.
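A minimal sketch of that same-tab variant, re-locating the pagination links on every pass so a stale reference is never reused (assumes time is imported and browser is already on the people page):
page_links = browser.find_elements_by_xpath('//ul[@class="results-pagination"]//descendant::a')
for index in range(len(page_links)):
    # re-find the links after every navigation; the old references are stale
    fresh_links = browser.find_elements_by_xpath('//ul[@class="results-pagination"]//descendant::a')
    if index >= len(fresh_links):
        break
    fresh_links[index].click()
    time.sleep(5)  # ideally replace with an explicit WebDriverWait
    # ... collect the profile links from the current page here ...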