I am trying to get all the images from this webpage: "https://www.airbnb.com/rooms/43871809/photos?guests=1&adults=1"
I am using XPath to get all the images, but if I don't scroll down to the bottom, the XPath only gets 13 images when it should get 39.
I am using the following code:
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

s = Service(r'D:\Selenium driver\chromedriver2.exe')  # raw string so the backslashes are not treated as escapes
driver = webdriver.Chrome(service=s)
url = 'https://www.airbnb.com/rooms/43871809/photos?guests=1&adults=1'
driver.get(url)
time.sleep(4)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
images = driver.find_elements_by_xpath('//div[@class="_1oaklsk"]/div/div/picture/img')
I have tried other methods to create the scroll action, but I think the problem lies with the page. Can anyone provide a solution with scrolling, or any other method to get all 39 images?
P.S.: I am new at this and still learning, and I appreciate your help. Thanks.
I tried the XPath below and it returned 39 as the count of the images.
//div[@data-testid='photo-viewer-section']//a
The code:
#Imports Required:
from selenium.webdriver.common.by import By
driver.get("https://www.airbnb.com/rooms/43871809/photos?guests=1&adults=1")
images_count = driver.find_elements(By.XPATH, "//div[@data-testid='photo-viewer-section']//a")
print(len(images_count))
Output:
39
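If you need the actual <img> elements instead (for example to read their src attributes), an incremental scroll loop can trigger the lazy loading first. A minimal sketch, assuming the _1oaklsk class from the question still matches the live page:

import time
from selenium.webdriver.common.by import By

# Scroll until the page height stops growing, so all lazy-loaded images render.
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)  # give newly loaded images time to appear
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

# The class name below is copied from the question and may have changed since.
images = driver.find_elements(By.XPATH, '//div[@class="_1oaklsk"]/div/div/picture/img')
print(len(images))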
Related
I'm trying to scrape Google results using Selenium ChromeDriver. Before, I used requests + BeautifulSoup to scrape Google results, and that worked, but I got blocked by Google after around 300 results. I've been reading into this topic, and it seems that Selenium + WebDriver is less easily blocked by Google.
Now, I'm trying to scrape Google results using Selenium. I would like to scrape the title, link, and description of all items. Essentially, I want to do this: How to scrape all results from Google search results pages (Python/Selenium ChromeDriver)
NoSuchElementException: no such element: Unable to locate element:
{"method":"css selector","selector":"h3"} (Session info:
chrome=90.0.4430.212)
Therefore, I'm trying another approach. This code is able to scrape some, but not all, of the titles and descriptions; see the image below. I cannot scrape the last 4 titles, and the last 5 descriptions come back empty. Any clues? Much appreciated.
import urllib.parse
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
root = "https://www.google.com/"
url = "https://google.com/search?q="
query = 'Why do I only see the first 4 results?' # Fill in google query
query = urllib.parse.quote_plus(query)
link = url + query
print(f'Main link to search for: {link}')
options = Options()
# options.headless = True
options.add_argument("--window-size=1920,1200")
driver = webdriver.Chrome(options=options)
driver.get(link)
wait = WebDriverWait(driver, 30)
wait.until(EC.presence_of_all_elements_located((By.XPATH, './/h3')))
link_tag = './/div[@class="yuRUbf"]/a'
title_tag = './/h3'
description_tag = './/span[@class="aCOpRe"]'
titles = driver.find_elements_by_xpath(title_tag)
links = driver.find_elements_by_xpath(link_tag)
descriptions = driver.find_elements_by_xpath(description_tag)
for t in titles:
    print('title:', t.text)
for l in links:
    print('links:', l.get_attribute("href"))
for d in descriptions:
    print('descriptions:', d.text)
# Why are the last 4 titles and the last 5 descriptions empty??
[Image of the results]
That's because those 4 are not actual result links; Google always shows a "People also ask" section there. If you look at their DOM structure:
<div style="padding-right:24px" jsname="xXq91c" class="cbphWd"
     data-kt="KjCl66uM1I_i7PsBqYb-irfI74DmAeDWm-uv7IveYLKIxo-bn9L1H56X2ZSUy9L-6wE"
     data-hveid="CAgQAw" data-ved="2ahUKEwjAoJ2ivd3wAhXU-nMBHWj1D8EQuk4oAHoECAgQAw">
  How do I get Google to show all results?
</div>
it is not an anchor tag, so there is no href attribute to read, and your links list will have 4 empty values because there are 4 divs like that.
To grab those 4 you need a different locator:
XPath: //*[local-name()='svg']/../following-sibling::div[@style]

title_tags = driver.find_elements(By.XPATH, "//*[local-name()='svg']/../following-sibling::div[@style]")
for title in title_tags:
    print(title.text)
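A more robust pattern is to iterate per result container, so each title stays paired with its link instead of living in separate lists. A sketch, assuming the yuRUbf class from the question is still current:

# Iterate over each organic-result container so title and link stay paired.
# The class name comes from the question above and may have changed since.
results = driver.find_elements(By.XPATH, '//div[@class="yuRUbf"]')
for r in results:
    title = r.find_element(By.XPATH, './/h3').text
    href = r.find_element(By.XPATH, './a').get_attribute('href')
    print(title, '->', href)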
I have written a Python script that aims to take data off a website, but I am unable to navigate and loop through pages to collect the links. The website is https://www.shearman.com/people. The pagination markup on the site looks like this:
ul class="results-pagination"
  li class/a href onclick="PageRequest('2', event)"
When I run the code below, it says that the element is not attached to the page:
try:
    # this navigates to the next page
    driver.find_element_by_xpath('//ul[@class="results-pagination"]/li/a[@onclick=">"]').click()
    time.sleep(5)
except NoSuchElementException:
    break
Any ideas what I'm doing wrong on this?
Many thanks in advance.
Chris
You can try this code:
browser.get("https://www.shearman.com/people")
wait = WebDriverWait(browser, 30)
main_tab = browser.current_window_handle
navigation_buttons = browser.find_elements_by_xpath('//ul[@class="results-pagination"]//descendant::a')
size = len(navigation_buttons)
print('this is the length of the list:', size)
i = 0
while i < size:
    # Ctrl+click opens the pagination link in a new tab
    ActionChains(browser).key_down(Keys.CONTROL).click(navigation_buttons[i]).key_up(Keys.CONTROL).perform()
    browser.switch_to.window(main_tab)  # keep focus on the main tab
    i += 1
Make sure to import these:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
Note this will open each link in a new tab. If your requirement is to click the next button instead, you can use this XPath: //ul[@class="results-pagination"]//descendant::a
If you want to open the links one by one in the same tab, then you will have to handle stale element references, because once you navigate away from the main page, all of its elements become stale.
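For the same-tab approach, here is a sketch that avoids stale references by re-finding the next-page link on every iteration; the a[text()=">"] locator is an assumption based on the markup shown in the question:

import time
from selenium.common.exceptions import NoSuchElementException

while True:
    # ... collect the profile links on the current page here ...
    try:
        # re-find the next-page link fresh on every iteration
        next_link = browser.find_element_by_xpath(
            '//ul[@class="results-pagination"]/li/a[text()=">"]')  # assumed locator
        next_link.click()
        time.sleep(5)  # crude wait; an explicit wait would be more reliable
    except NoSuchElementException:
        break  # no next-page link left, so we are on the last page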
The following code I've used is not clicking the button and shows an error message.
WebElement clickNextButton = webDriver.findElement(By.cssSelector("button[ng-class='btn-success']"));
clickNextButton.click();
Error message shows "no such element: Unable to locate element. {"method":"css selector","selector":"button[ng-class='btn-success']"}
I've also tried the following code segments without success:
WebElement clickNext1 = webDriver.findElement(By.cssSelector("button[ng-class=\"pccCTRL.pow.Page1InputsValid() ? 'btn-success' : 'btn-default'\"]"));
clickNext1.click();
webDriver.findElement(By.partialLinkText("Next")).click();
webDriver.findElement(By.cssSelector("button[type='button']")).click();
Here is the screenshot showing the HTML code segment of the button I'm trying to click.
Hoping to get feedback. Thank you.
Ok, I was able to solve this issue by using By.xpath
Here is the code segment that solved it:
WebElement clickNextButton = webDriver.findElement(By.xpath("//button[contains(text(),'Next')]"));
clickNextButton.click();
I'm trying to find an element inside an iframe, and I've switched to the frame, but I still cannot find the element.
my HTML is in the link: http://pastebin.com/AShYrdxQ
Hi. First of all, I found only one iframe on the page, with id = msgframe. Also note that, in your page source, that iframe is commented out, so it plays no role. Hence, do not switch to the frame; simply use:
List<WebElement> commonElements = driver.findElements(By.className("Apps_Title"));
for (int i = 0; i < commonElements.size(); i++) {
    System.out.println(commonElements.get(i).getText());
}
and it will work. Thanks, I hope this helps.
Try:
driver.switchTo().frame("needle-frame-id-or-name");
Sometimes this helps too:
driver.switchTo().defaultContent();
driver.switchTo().frame("needle-frame-id-or-name");
So I need to scrape a page like this one, for example, and I am using Scrapy + Selenium to interact with the datepicker calendar, but I am running into ElementNotVisibleException: Message: Element is not currently visible and so may not be interacted with.
So far I have:
def parse(self, response):
    self.driver.get("https://www.airbnb.pt/rooms/9315238")
    try:
        element = WebDriverWait(self.driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//input[@name='checkin']"))
        )
    finally:
        x = self.driver.find_element_by_xpath("//input[@name='checkin']").click()
        import ipdb; ipdb.set_trace()
        self.driver.quit()
I saw some references on how to achieve this: https://stackoverflow.com/a/25748322/977622 and https://stackoverflow.com/a/19009256/977622.
I would appreciate it if someone could help me out with my issue, or even provide a better example of how I can interact with this datepicker calendar.
There are two elements with name="checkin" - the first one that you actually find is invisible. You need to make your locator more specific to match the desired input. I would also use the visibility_of_element_located condition instead:
element = WebDriverWait(self.driver, 10).until(
EC.visibility_of_element_located((By.CSS_SELECTOR, ".book-it-panel input[name=checkin]"))
)
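In context, the fixed step might look like the sketch below; the .book-it-panel selector comes from this answer, and the page markup may have changed since:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait until the visible check-in field is rendered, then click it to
# open the datepicker. The .book-it-panel selector is taken from the
# answer above and may no longer match the live page.
element = WebDriverWait(self.driver, 10).until(
    EC.visibility_of_element_located(
        (By.CSS_SELECTOR, ".book-it-panel input[name=checkin]"))
)
element.click()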