How to scrape all the prices using Selenium and Python - selenium

I'm trying to get ticket prices from Viagogo with no luck. The script seems quite simple, and works with other websites, but not for Viagogo. I have no issue getting the text of the title from here.
It always returns an empty result (i.e. []). Can anyone help?
Code trials:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import pandas as pd
s = Service("[]/Downloads/chromedriver/chromedriver.exe")
driver = webdriver.Chrome(service=s)
driver.get('https://www.viagogo.com/Concert-Tickets/Country-and-Folk/Taylor-Swift-Tickets/E-151214704')
price = driver.find_elements(by=By.XPATH, value('//div[@id="clientgridtable"]/div[2]/div/div/div[3]/div/div[1]/span[@class="t-b fs16"]'))
print(price)
[]
I'm expecting the 7 prices shown on the right-hand side of the page, just above "per ticket".

There is an error in the definition of price: it should be value='...' instead of value('...'). Moreover, you should use an explicit wait so that the driver waits for the prices to be visible on the page:
price = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "...")))
Note that this command needs the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

To extract all 7 prices shown on the right-hand side of the page, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR and text attribute:
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.f-list__cell-pricing-ticketstyle > div.w100 > span")))])
Using XPATH and get_attribute("innerHTML"):
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='f-list__cell-pricing-ticketstyle']/div[@class='w100 ']/span")))])
Console Output:
['Rs.25,509', 'Rs.25,873', 'Rs.27,403', 'Rs.28,788', 'Rs.72,809', 'Rs.65,593', 'Rs.29,153', 'Rs.29,153']
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
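Once the price strings are collected, they can be converted to numbers for comparison or for use with pandas. A small pure-Python sketch using the sample console output above (the to_number helper is made up for illustration):

```python
import re

# Sample strings as returned by the scrape above.
prices = ['Rs.25,509', 'Rs.25,873', 'Rs.27,403', 'Rs.28,788',
          'Rs.72,809', 'Rs.65,593', 'Rs.29,153', 'Rs.29,153']

def to_number(price):
    # Keep only the digits: drops the "Rs." prefix and the comma separators.
    return int(re.sub(r"\D", "", price))

values = [to_number(p) for p in prices]
print(min(values), max(values))  # 25509 72809
```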

Related

Unable to retrieve element id for password field on login.microsoftonline.com page

The login process to https://login.microsoftonline.com was working fine until this weekend.
I was able to select the username field by id and send keys to it, but the problem is with the password field.
This is the locator I used for the password:
Driver.FindElementByXPath("//*[@id=\"passwordInput\"]");
I tried some other ways but they didn't work
Driver.FindElementByXPath("//*[@id=\"i0118\"]");
Driver.FindElementByXPath("//*[contains(text(), 'Password')]");
Here's the error that I got: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="passwordInput"]"}
(when I used id=passwordInput)
If you look at the HTML code of the site, you will see that the password field is
<input name="passwd" type="password" id="i0118" autocomplete="off" etc.>
so to target the element you can use the CSS selector input[type=password].
See code below for the implementation
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome(service=Service('your_chromedriver_path'))
driver.get('https://login.microsoftonline.com/')
# sets a maximum waiting time for .find_element() and similar commands
driver.implicitly_wait(10)
driver.find_element(By.CSS_SELECTOR, 'input[type=email]').send_keys('some_email@gmail.com')
driver.find_element(By.CSS_SELECTOR, 'input[type=submit]').click()
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "input[type=password]"))).send_keys('some_password')
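The advantage of targeting input[type=password] rather than the id is that the type attribute is stable even if ids change. Attribute-based matching can be illustrated offline with the standard library's ElementTree on a simplified version of the markup above (the surrounding form and the email input are made up for illustration):

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the login page's markup.
html = """
<form>
  <input name="loginfmt" type="email" id="i0116"/>
  <input name="passwd" type="password" id="i0118" autocomplete="off"/>
</form>
"""

root = ET.fromstring(html)

# Match by the stable type attribute, like the CSS selector
# input[type=password], instead of matching the id.
pwd = root.find(".//input[@type='password']")
print(pwd.get("name"), pwd.get("id"))  # passwd i0118
```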

There is an element, so why am I getting a NoSuchElementException?

problem:
Given the following code:
from selenium import webdriver
browser = webdriver.Chrome()
browser.get('https://navercomp.wisereport.co.kr/v2/company/c1010001.aspx?cmp_cd=004000')
# move to my goal
browser.find_element_by_link_text("재무분석").click()
browser.find_element_by_link_text("재무상태표").click()
# extract the data
elem = browser.find_element_by_xpath("//*[@id='faRVArcVR1a2']/table[2]/tbody/tr[2]/td[6]")
print(elem.text)
I wrote this code to extract finance data.
First, I navigate to the page that has the data I want.
I copied the XPath using the Chrome browser's copy-XPath function.
But although the text is there, I get a NoSuchElementException.
Why does this happen?
try to fix:
At first, I thought: 'is this happening because of a delay?'
Although there is almost no delay on my computer, I tried to fix it anyway.
I added some imports and changed the 'elem' part:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser = webdriver.Chrome()
browser.get('https://navercomp.wisereport.co.kr/v2/company/c1010001.aspx?cmp_cd=004000')
# move to my goal
browser.find_element_by_link_text("재무분석").click()
browser.find_element_by_link_text("재무상태표").click()
# extract the data
elem = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.XPATH, '//*[@id="faRVArcVR1a2"]/table[2]/tbody/tr[2]/td[6]')))
print(elem.text)
but as a result, I only get a TimeoutException.
I don't know why this happens. Help please! Thank you.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
browser = webdriver.Chrome()
browser.get('https://navercomp.wisereport.co.kr/v2/company/c1010001.aspx?cmp_cd=004000')
# move to my goal
browser.find_element_by_link_text("재무분석").click()
browser.find_element_by_link_text("재무상태표").click()
elementXpath = '//table[@summary="IFRS연결 연간 재무 정보를 제공합니다."][2]/tbody/tr[2]/td[6]'
# extract the data
WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.XPATH, elementXpath)))
# Wait for the table to load
time.sleep(1)
elem = browser.find_element(by=By.XPATH, value=elementXpath)
print(elem.text)
There were several problems:
The ID of the div which wraps the table ("faRVArcVR1a2") changes every time you load the page, so it is not a reliable way of finding the element. I changed that so the element is found via the summary attribute of the table.
Strictly speaking, until() does return the element, but here the wait is only used to block until the element is present; the element itself is fetched afterwards with find_element.
Even after you have waited for the table to appear, you have to wait an additional second so that all cells of the table load. Otherwise you would get an empty string.
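The last point, that an element being present is not the same as its content having loaded, can be sketched in pure Python (the fake table cell below is made up for illustration):

```python
import time

# Hypothetical page load: the table "appears" immediately,
# but its cell text is only filled in a moment later.
page_loaded_at = time.monotonic()

def get_cell_text():
    # Empty for the first 0.2s after "load", like a table
    # whose cells are filled in by JavaScript.
    if time.monotonic() - page_loaded_at < 0.2:
        return ""
    return "123,456"

# Reading right after presence would be too early:
too_early = get_cell_text()

# Polling until the text is non-empty gets the real value:
text = get_cell_text()
while not text:
    time.sleep(0.05)
    text = get_cell_text()

print(repr(too_early))  # '' -- present but not yet loaded
print(repr(text))       # '123,456'
```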

Selenium can't find search element inside form

I'm trying to use selenium to perform searches in lexisnexis and I can't get it to find the search box.
I've tried find_element_by_* with all possible attributes and I get the "NoSuchElementException: Message: no such element: Unable to locate element" error every time.
See screenshot of the inspection tab -- the highlighted part is the element I need
My code:
from selenium import webdriver
import numpy as np
import pandas as pd
searchTerms = r'something'
url = r'https://www.lexisnexis.com/uk/legal/news' # this is the page after login - not including the code for login here.
browser = webdriver.Chrome(executable_path = path_to_chromedriver)
browser.get(url)
I tried everything:
browser.find_element_by_id('search-query')
browser.find_element_by_xpath('//*[#id="search-query"]')
browser.find_element_by_xpath('/html/body/div/header/div/form/div[2]/input')
etc..
Nothing works. Any suggestions?
It could be that your site is taking too long to load; in such cases you can use waits to avoid synchronization issues.
wait = WebDriverWait(driver, 10)
inputBox = wait.until(EC.element_to_be_clickable((By.XPATH, "//*[#id='search-query']")))
Note: add the imports below to your solution:
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

How to handle unexpected ad popup in hdfc site?

I am trying to automate the HDFC website. On entering, there is an ad popup which sometimes appears and sometimes doesn't. I want to handle it using try/catch. Please help with this.
The trick is to first check whether the element is present, act accordingly if it is, and otherwise continue on as normal. You could do the following and evaluate the True/False returned by the function :)
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
wait = WebDriverWait(driver, 10)
def check_exists_by_xpath(xpath, wait):
    try:
        wait.until(EC.visibility_of_element_located((By.XPATH, xpath)))
    except TimeoutException:
        return False
    return True
Note that when the wait expires it raises a TimeoutException, not a NoSuchElementException, so that is the exception to catch.
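The same present-or-not pattern works with any finder that raises on a miss. A pure-Python sketch of the shape of check_exists_by_xpath (the fake page, its element ids, and the helper names are all made up for illustration):

```python
# Hypothetical stand-in for a page: element "ids" mapped to text.
fake_page = {"ad-popup-close": "X", "login": "Log in"}

class NoSuchElementException(Exception):
    """Stand-in for selenium.common.exceptions.NoSuchElementException."""

def find_element(element_id):
    # Raises on a missing element, like Selenium's find_element.
    try:
        return fake_page[element_id]
    except KeyError:
        raise NoSuchElementException(element_id)

def check_exists(element_id):
    # Same try/except shape as check_exists_by_xpath.
    try:
        find_element(element_id)
    except NoSuchElementException:
        return False
    return True

if check_exists("ad-popup-close"):
    print("popup present, closing it")
if not check_exists("cookie-banner"):
    print("no cookie banner, continuing as normal")
```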

selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element:

I'm trying to automatically generate lots of users on the webpage kahoot.it using Selenium, to make them appear in front of the class. However, I get this error message when trying to access the inputSession element (where you write the game ID to enter the game):
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get("http://www.kahoot.it")
gameID = driver.find_element_by_id("inputSession")
username = driver.find_element_by_id("username")
gameID.send_keys("53384")
This is the error:
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element:
{"method":"id","selector":"inputSession"}
Looks like it takes time to load the webpage, so the web element hadn't been detected yet. You can either use @shri's code above or just add these two statements right below driver = webdriver.Firefox():
driver.maximize_window() # For maximizing window
driver.implicitly_wait(20) # gives an implicit wait for 20 seconds
Could be a race condition where find_element is executing before the element is present on the page. Take a look at the wait timeout documentation. Here is an example from the docs:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
finally:
    driver.quit()
In my case, the error was caused by the element I was looking for being inside an iframe. This meant I had to change frame before looking for the element:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://www.google.co.uk/maps")
frame_0 = driver.find_element_by_class_name('widget-consent-frame')
driver.switch_to.frame(frame_0)
agree_btn_0 = driver.find_element_by_id('introAgreeButton')
agree_btn_0.click()
Reddit source
You can also use the following as an alternative to the above two solutions:
import time
time.sleep(30)
This worked for me (the try/finally didn't; it kept hitting the finally/browser.close()):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get('mywebsite.com')
username = None
while username is None:
    username = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "username"))
    )
username.send_keys('myusername@email.com')
It seems that your browser did not read the proper HTML text/tags; use a delay so that the page loads first, then get all the tags from the page.
driver = webdriver.Chrome('./chromedriver.exe')
# load the page
driver.get('https://www.instagram.com/accounts/login/')
# use delay function to get all tags
driver.implicitly_wait(20)
# access tag
driver.find_element_by_name('username').send_keys(self.username)
Also, for some, it may be due to a new tab opening when clicking the button (not in this question particularly). Then you can switch tabs with this command:
driver.switch_to.window(driver.window_handles[1]) #for switching to second tab
I had the same problem as you and this solution saved me:
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
element = wait.until(EC.element_to_be_clickable((By.ID, 'someid')))
It just means the function is executing before the button can be clicked. Example solution:
from time import sleep
# load the page first, then pause
sleep(3)  # pauses execution of the next line for 3 seconds
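A fixed sleep always pauses for the full duration, whereas an explicit wait polls and returns as soon as its condition holds. The difference can be sketched in pure Python (wait_until and element_present are made-up names for illustration, in the spirit of WebDriverWait.until()):

```python
import time

def wait_until(condition, timeout=10.0, poll=0.1):
    # Minimal explicit-wait loop: poll the condition and return its
    # truthy result as soon as it holds, instead of always sleeping
    # for the full timeout.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    raise TimeoutError("condition not met within %.1fs" % timeout)

# Hypothetical "element" that only becomes available ~0.3s after load.
loaded_at = time.monotonic() + 0.3

def element_present():
    return "element" if time.monotonic() >= loaded_at else None

start = time.monotonic()
result = wait_until(element_present, timeout=5.0)
elapsed = time.monotonic() - start
print(result)                    # element
print("waited %.2fs" % elapsed)  # well under the 5s timeout
```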