Extract the breadcrumbs of a website using selenium

Extract the breadcrumbs of a website using selenium - selenium

i need to extract the breadcrumbs of this website site: https://www.woolworths.com.au/Shop/Browse/drinks/cordials-juices-iced-teas/iced-teas
I tried to inspect the element and copy the xpath but it doesn't extract it
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('https://www.woolworths.com.au/Shop/Browse/drinks/cordials-juices-iced-teas/iced-teas')
driver.find_elements_by_xpath('//*[#id="center-panel"]/div/wow-tile-list-with-content/ng-transclude/wow-browse-tile-list/wow-tile-list/div/div[1]/div[1]/wow-breadcrumbs/div/ul/li[4]/span/span')
driver.find_element_by_css_selector('#center-panel > div > wow-tile-list-with-content > ng-transclude > wow-browse-tile-list > wow-tile-list > div > div.tileList > div.tileList-headerContainer > wow-breadcrumbs > div > ul > li:nth-child(4) > span > span')
How can I proceed?

To print the breadcrumbs of the website site: https://www.woolworths.com.au/Shop/Browse/drinks/cordials-juices-iced-teas/iced-teas you have to induce WebDriverWait for the desired visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and get_attribute() method:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "ul.breadcrumbs-linkList li:nth-child(4) span span"))).get_attribute("innerHTML"))
Using XPATH and text property:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//ul[#class='breadcrumbs-linkList']//following-sibling::li[4]//span//span"))).text)
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Outro
As per the documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium

The page you are trying to scrape is written in Angular, meaning that most of the DOM elements are loaded dynamically by JavaScript AJAX code and they are not present once the page is loaded. (driver.get function returns)
You should use waits until function to locate such elements.
Here is the working example using the XPATH you provided:
driver.get('https://www.woolworths.com.au/Shop/Browse/drinks/cordials-juices-iced-teas/iced-teas')
try:
element = WebDriverWait(driver, 1).until(
EC.presence_of_element_located((By.XPATH, '//*[#id="center-panel"]/div/wow-tile-list-with-content/ng-transclude/wow-browse-tile-list/wow-tile-list/div/div[1]/div[1]/wow-breadcrumbs/div/ul/li[4]/span/span'))
)
print(element.text) ' this outputs Iced Teas
except TimeoutException:
print("Timeout")

Below one works for my validation
//*[span='first text' and span='Search results for "second text"']

Related

Selenium error 'Message: no such element: Unable to locate element'

I get this error when trying to get the price of the product:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element
But the thing is that I am searching by XPath from the Inspect HTML panel. SO how come it doesn't'search it?
Here is the code:
main_page = 'https://www.takealot.com/500-classic-family-computer-tv-game-ejc/PLID90448265'
PATH = 'C:\Program Files (x86)\chromedriver.exe'
driver = webdriver.Chrome(PATH)
driver.get(main_page)
time.sleep(10)
active_price = driver.find_element(By.XPATH,'//*[#id="shopfront-app"]/div[3]/div[1]/div[2]/aside/div[1]/div[1]/div[1]/span').text
print(active_price)
I found the way to get the price the other way, but I am still interested why selenium can't find it by XPath:
price = driver.find_element(By.CLASS_NAME, 'buybox-module_price_2YUFa').text
active_price = ''
after_discount = ''
count = 0
for char in price:
if char == 'R':
count +=1
if count ==2:
after_discount += char
else:
active_price += char
print(active_price)
print(after_discount)

To extract the text R 345 ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following locator strategies:
Using CSS_SELECTOR and text attribute:
driver.get("https://www.takealot.com/500-classic-family-computer-tv-game-ejc/PLID90448265")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span[data-ref='buybox-price-main']"))).text)
Using XPATH and get_attribute("innerHTML"):
driver.get("https://www.takealot.com/500-classic-family-computer-tv-game-ejc/PLID90448265")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[#data-ref='buybox-price-main']"))).get_attribute("innerHTML"))
Console Output:
R 345
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium

You are trying to get a price from buybox-module_price_2YUFa but price is actually inside the child span
//span[#data-ref='buybox-price-main']
Use following xpath to get the price

Selenium: 'function' object has no attribute 'click'

I am running into this issue when trying to click a button using selenium. The button html reads as below:
<button class="Component-button-0-2-65 Component-button-d1-0-2-68">and 5 more</button>
My code is here:
button = EC.element_to_be_clickable((By.XPATH,'.//button[contains(#class,"Component-button-d")]'))
if button:
print("TRUE")
button.click()
My output is:
TRUE
Traceback (most recent call last):
File "", line 47, in <module>
button.click()
AttributeError: 'function' object has no attribute 'click'
I am stumped as to why 'button' element is found by selenium (print(True) statement is executed) but then the click() method returns an attribute error.
This is the page I am scraping data from: https://religiondatabase.org/browse/regions
I am able to extract all the information I need on the page, so the code leading up to the click is working.
I was expecting the item to be clickable. I'm not sure how to troubleshoot the attribute error (function object has no attribute click). Because it I paste the xpath into the webpage, it highlights the correct element.

Your code is incorrect. You can find below an example of how you can use Expected Conditions (along with WebdriverWait):
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
## [...define your driver here, other imports etc]
wait = WebDriverWait(driver, 25)
##[open a url, etc]
element = wait.until(EC.presence_of_element_located((By.XPATH, '//div[#class="input-group"]')))
element.click()
Selenium documentation can be found here.

The element search is always done through the driver object which is the WebDriver instance in it's core form:
driver.find_element(By.XPATH, "element_xpath")
But when you apply WebDriverWait, the wait configurations are applied on the driver object.
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "element_xpath")))
So even during WebDriverWait the driver object needs to be present along with the expected_conditions and the locator strategy.
This usecase
This line of code:
button = EC.element_to_be_clickable((By.XPATH,'.//button[contains(#class,"Component-button-d")]'))
button object references to a junk value but on probing returns true and prints TRUE but the click can't be performed as button is not of WebElement type.
Solution
Given the HTML:
<button class="Component-button-0-2-65 Component-button-d1-0-2-68">and 5 more</button>
To click on the element you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following locator strategies:
Using XPATH with classname:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "//button[contains(#class,'Component-button-d')]"))).click()
Using XPATH with classname and innerText:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[contains(#class,'Component-button-d') and text()='and 5 more']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

How to locate the button element using Selenium?

I need to find the xpath of the help button for my automation tests.
Here is the HTML element:
<button data-component-id="nokia-react-components-iconbutton" tabindex="0" class="ActionComponent__ActionComponentDiv-sc-st3635-0 iPWdbh IconButton__StyledButton-sc-1dtic80-0 biWRLS NSPAppBanner__HelpButton-sc-761pc0-3 dvtRqt" mode="dark" type="button" xpath="1"></button>
<svg class="HelpOutline__NSPSvgIcon-sc-7y3t1-0 geSzXs HelpWidget__StyledHelp-sc-1fdz44x-0 cCTbqa" viewBox="0 0 24 24" xpath="1">
</svg>
I tried with this XPath, but it didn't work:
//*[#data-component-id="nokia-react-components-iconbutton"]//svg[contains(#class,"HelpOutline")]

If you want to select the <button> element before the <svg>, try the following XPath:
//*[#data-component-id="nokia-react-components-iconbutton" and following-sibling::svg[1][contains(#class,"HelpOutline")]]

The <button> element itself should be clickable and both the locator strategies should work:
css_selector:
button[class*='HelpButton'][data-component-id='nokia-react-components-iconbutton']
xpath:
//button[#data-component-id='nokia-react-components-iconbutton' and contains(#class, 'HelpButton')]
However, as the element is a React element so to click on the element you need to induce WebDriverWait for the element to be clickable and you can use either of the following locator strategies:
Using Java and xpath:
new WebDriverWait(driver, Duration.ofSeconds(10)).until(ExpectedConditions.elementToBeClickable(By.xpath("//button[#data-component-id='nokia-react-components-iconbutton' and contains(#class, 'HelpButton')]"))).click();
Using Python and css_selector:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[class*='HelpButton'][data-component-id='nokia-react-components-iconbutton']"))).click()
Note: For Python clients you have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Try this XPath to locate button:
'//*[name()="svg" and contains(#class,"HelpOutline")]/preceding-sibling::button'
Note that svg tag is not a part of standard HTML-namespace, so //svg won't work. You need to use //*[name()="svg"] instead

How to click on an element within an iframe with Selenium and Python [duplicate]

This question already has answers here:
Scraping Data from a website which uses Power BI - retrieving data from Power BI on a website
(2 answers)
Closed 2 years ago.
I have the following element inside an iframe. This tag displays a ">" icon and will flip to the next graph on the URL when the user clicks on it. You can see it in the following URL where it says
< 1 of 16 >
https://msdh.ms.gov/msdhsite/_static/14,21995,420,873.html
<a>
<i class="glyphicon glyph-small pbi-glyph-chevronrightmedium middleIcon active pbi-focus-outline" focus-element="" tabindex="0" title="Next Page">
</i>
</a>
Using Selenium how can I send a click action to this element to flip to the next graph?
url='https://msdh.ms.gov/msdhsite/_static/14,21995,420,873.html'
p='my/path/to/chromedriver'
driver=webdriver.Chrome(p)
driver.get(url)
myframe=driver.find_element_by_class_name("flexibleFrame")
driver.switch_to.frame(myframe)
i = driver.find_element_by_class_name("glyphicon")

The > icon is within a <iframe> so to interact with the WebElement using Selenium you have to:
Induce WebDriverWait for the desired frame to be available and switch to it.
Induce WebDriverWait for the desired visibility_of_element_located().
scrollIntoView() the WebElement.
You can use either of the following Locator Strategies:
Using CSS_SELECTOR:
driver.get('https://msdh.ms.gov/msdhsite/_static/14,21995,420,873.html')
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe.flexibleFrame")))
elem = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "i.glyphicon.glyph-small.pbi-glyph-chevronrightmedium.middleIcon.active.pbi-focus-outline[title='Next Page']")))
driver.execute_script("arguments[0].scrollIntoView(true);", elem)
elem.click()
Using XPATH:
driver.get('https://msdh.ms.gov/msdhsite/_static/14,21995,420,873.html')
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[#class='flexibleFrame']")))
elem = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//i[#class='glyphicon glyph-small pbi-glyph-chevronrightmedium middleIcon active pbi-focus-outline' and #title='Next Page']")))
driver.execute_script("arguments[0].scrollIntoView(true);", elem)
elem.click()
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Browser Snapshot:
Reference
You can find a relevant discussions in:
Ways to deal with #document under iframe

How to click a dynamic 'like' button using Selenium through Python?

Hi~in this music websit:
Music website link
i want to click on the like button in the right side of the song bar
i use below codes:
like_number=3
like_pos=f'#app > div > div.content-wrapper > div.song-list-view.list-view.view-without-leftbar > div.song-list > div > div.table.idle.song-table.song-list-table > div > table > tbody > tr:nth-child({str(like_number)}) > td:nth-child(5) > div > div > div:nth-child(1) > div'
button = self.browser.find_element_by_css_selector(like_pos)
self.browser.implicitly_wait(10)
ActionChains(self.browser).move_to_element(button).click(button).perform()
But,there is no response,console shows that my the tag is not interactive:
element not interactable” exception
I am so confused ,cause i search for whole the stack overflow ,but there is no practical solution for me
I just want to achieve a simple function of clicking on like button
Thanks if you have any great idea for me!
The hard thing is that you have to pause you mouse for a while and then click button shows ,so that you are able to click on it ,this is so wired situation.
Below is image example

To click() on the Like button in the right side of the song bar you have to induce WebDriverWait for the element_to_be_clickable() and you can use the following Locator Strategies:
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("start-maximized")
driver = webdriver.Chrome(options=chrome_options)
driver.get("https://www.xiami.com/favorite/88955424")
ActionChains(driver).move_to_element(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table/tbody/tr//div[#class='duration-container ops-container']")))).perform()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//table/tbody/tr//div[#class='duration-container ops-container']//div[#class='operations ops-right']/div[#class='ops-item']/div[#class='iconfont']"))).click()
Browser Snapshot:
References
You can find a couple of relevant discussions in:
How to resolve ElementNotInteractableException: Element is not visible in Selenium webdriver?
org.openqa.selenium.ElementNotInteractableException: Element is not reachable by keyboard: while sending text to FirstName field in Facebook

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Extract the breadcrumbs of a website using selenium - selenium

Below one works for my validation //*[span='first text' and span='Search results for "second text"']

Related

Selenium error 'Message: no such element: Unable to locate element'

Selenium: 'function' object has no attribute 'click'

How to locate the button element using Selenium?

How to click on an element within an iframe with Selenium and Python [duplicate]

How to click a dynamic 'like' button using Selenium through Python?

Categories

Resources