How to get the HTML of a "display: none" element with Selenium

I'm trying to scrape some content with Selenium, but I can't get the content of the element styled "display: none". I tried get_attribute('innerHTML'), but it still doesn't work as expected.
I'd appreciate any pointers.
[Here is the html][1]
[1]: https://i.stack.imgur.com/LdDL4.png
# -*- coding: utf-8 -*-
from selenium import webdriver
import time
from bs4 import BeautifulSoup
import re
from pyvirtualdisplay import Display
from lxml import etree
driver = webdriver.PhantomJS()
driver.get('http://flights.ctrip.com/')
driver.maximize_window()
time.sleep(1)
element_time = driver.find_element_by_id('DepartDate1TextBox')
element_time.clear()
element_time.send_keys(u'2017-10-22')
element_arr = driver.find_element_by_id('ArriveCity1TextBox')
element_arr.clear()
element_arr.send_keys(u'北京')
element_depart = driver.find_element_by_id('DepartCity1TextBox')
element_depart.clear()
element_depart.send_keys(u'南京')
driver.find_element_by_id('search_btn').click()
time.sleep(1)
print(driver.current_url)
driver.find_element_by_id('btnReSearch').click()
print(driver.current_url)
overlay = driver.find_element_by_id("mask_loading")
print(driver.execute_script("return arguments[0].getAttribute('style')", overlay))
driver.quit()

To read the CSS display value (here "none"), you can use getCssValue; display is a CSS property, not an HTML attribute, so getAttribute("display") would not return it:
String myDisplay = driver.findElement(By.id("mask_loading")).getCssValue("display");
System.out.println("Display property is set to: " + myDisplay);

If an element's style attribute contains display: none, the element is hidden. Selenium does not interact with hidden elements, so you have to go through Selenium's JavaScript executor instead. You can get the style value as shown below.
WebElement overlay = driver.findElement(By.id("mask_loading"));
JavascriptExecutor je = (JavascriptExecutor) driver;
// executeScript returns Object, so the result is cast to String
String style = (String) je.executeScript("return arguments[0].getAttribute('style');", overlay);
System.out.println("style value of the element is " + style);
It prints the value "z-index: 12;display: none;"
Or, if you want to get the innerHTML:
String innerHTML = (String) je.executeScript("return arguments[0].innerHTML;", overlay);
In Python:
overlay = driver.find_element_by_id("mask_loading")
style = driver.execute_script("return arguments[0].getAttribute('style')", overlay)
or
innerHTML = driver.execute_script("return arguments[0].innerHTML;", overlay)
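The style value returned by the script is a plain string, so checking it for display: none is ordinary string handling. Here is a minimal sketch, assuming the value has the shape printed above. parse_inline_style and is_display_none are hypothetical helper names, not Selenium APIs, and the naive split on ';' would mishandle values that themselves contain semicolons (for example data: URLs).

```python
def parse_inline_style(style: str) -> dict:
    """Split an inline style string into a {property: value} dict."""
    declarations = {}
    for chunk in style.split(";"):
        if ":" not in chunk:
            continue  # skip empty pieces, such as the one after a trailing ';'
        name, _, value = chunk.partition(":")
        declarations[name.strip().lower()] = value.strip()
    return declarations


def is_display_none(style: str) -> bool:
    """Return True when the inline style sets display to exactly 'none'."""
    return parse_inline_style(style).get("display", "").lower() == "none"


style = "z-index: 12;display: none;"  # the value reported in the answer
parsed = parse_inline_style(style)
hidden = is_display_none(style)
```

In a real session, style would come from the execute_script call shown in the answer rather than from a literal.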

Related

How to find href link from multiple browser tabs

I wrote this code to fetch the href links of the images on a web page. I found a way to do it, but I am unable to get the links inside the loop. When a new tab opens, I want the href links of that tab to be printed in the shell. There are multiple URLs.
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import urllib
import urllib.request
import time
import os
#driver = webdriver.Chrome(executable_path=r"C:\Users\umesh.kumar\Downloads\Codes\chromedriver.exe")
username = "ABCD.kumar"
password = "X0XX"
driver = webdriver.Chrome()
driver.get("http://propadmin.99acres.com/propadmin/index.htm")
urls = ["https://propadmin.99acres.com/do/seller/ProcessSellerForms/getDeletedPhotos?prop_id=A18056415", "https://propadmin.xxxxxxx.com/do/seller/ProcessSellerForms/getDeletedPhotos?prop_id=A56063622"]
driver.maximize_window()
driver.find_element(By.ID, "username").send_keys(username)
driver.find_element(By.ID, "password").send_keys(password)
driver.find_element_by_name("login").click()
for posts in range(len(urls)):
    print(posts)
    driver.get(urls[posts])
    if(posts!=len(urls)-1):
        driver.execute_script("window.open('');")
        chwd = driver.window_handles
        driver.switch_to.window(chwd[-1])
elems = driver.find_elements_by_tag_name('a')
for elem in elems:
    href = elem.get_attribute('href')
    if href is not None:
        print(href)
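One possible approach, sketched under assumptions: rather than reading links only from whichever tab is active, visit each handle in driver.window_handles and collect the links there. hrefs_by_tab is a hypothetical helper name. It relies only on window_handles, current_window_handle, switch_to.window, find_elements, and get_attribute, and it has not been run against the site in the question.

```python
def hrefs_by_tab(driver):
    """Return {window_handle: [href, ...]} for every open tab.

    `driver` is assumed to be a Selenium WebDriver, or any object that
    offers the same window_handles / switch_to.window / find_elements API.
    """
    original = driver.current_window_handle
    links = {}
    for handle in driver.window_handles:
        driver.switch_to.window(handle)
        # "tag name" is the string value of selenium's By.TAG_NAME
        anchors = driver.find_elements("tag name", "a")
        hrefs = [anchor.get_attribute("href") for anchor in anchors]
        links[handle] = [href for href in hrefs if href is not None]
    driver.switch_to.window(original)  # give focus back to the starting tab
    return links
```

Calling hrefs_by_tab(driver) after all the tabs have been opened and loaded would then give one list of links per tab, which can be printed in a simple loop.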

How to scrape a specific itemprop from a web page with XPath and Selenium?

I'm trying to use Python (Selenium, BeautifulSoup, and XPath) to scrape a span with an itemprop equal to "description", but every time I run the code, the "try" fails and it prints out the "except" error.
I do see the element in the code when I inspect elements on the page.
Line that isn't getting the desired response:
quick_overview = soup.find_element_by_xpath("//span[contains(@itemprop, 'description')]")
Personally, I think you should just keep working with Selenium:
quick_overview = driver.find_element_by_xpath("//span[contains(@itemprop, 'description')]")
This finds the element; add .text to the end to get its text content.
To parse this out with BeautifulSoup you would likely still need a Selenium wait condition first, so there is little point.
However, if you do decide to integrate bs4, change your function to work with the actual HTML from driver.page_source and parse that, then switch to select_one to grab your item. Also make sure the function returns the soup and that you assign the result to a new soup object.
from bs4 import BeautifulSoup
from selenium import webdriver # links w/ browser and carries out actions
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
PATH = r"C:\Program Files (x86)\chromedriver_win32\chromedriver.exe"
baseurl = "http://www.waytekwire.com"
skus_to_find_test = ['WL16-8', 'WG18-12']
driver = webdriver.Chrome(PATH)
driver.get(baseurl)
def use_driver_current_html(driver):
    soup = BeautifulSoup(driver.page_source, 'lxml')
    return soup
for sku in skus_to_find_test: # iterate over the SKUs; indexing with [0] would loop over the characters of the first SKU
    search_bar = driver.find_element_by_id('themeSearchText')
    search_bar.send_keys(sku)
    search_bar.send_keys(Keys.RETURN)
    try:
        # interpolate the SKU into the XPath; a bare sku inside the string is not the Python variable
        product_url = driver.find_elements_by_xpath(f"//div[contains(@class, 'itemDescription')]//h3//a[contains(text(), '{sku}')]")[0]
        product_url.click()
        WebDriverWait(driver,10).until(EC.presence_of_element_located((By.XPATH, "//span[contains(@itemprop, 'description')]")))
        soup = use_driver_current_html(driver)
        try:
            quick_overview = soup.select_one("span[itemprop=description]").text
            print(quick_overview)
        except:
            print('No Quick Overview Found.')
    except:
        print('Product not found.')
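When a Python value such as a SKU has to appear inside an XPath expression, it must be interpolated into the string as a quoted literal; a bare name inside the string is read by XPath as a node test rather than as a Python variable. The sketch below shows one way to build such an expression safely. xpath_literal and product_link_xpath are hypothetical helper names, and the expression reuses the class and element names from the answer.

```python
def xpath_literal(value: str) -> str:
    """Quote a Python string as an XPath 1.0 string literal."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    # Both quote types are present: stitch the pieces together with concat().
    pieces = [f"'{part}'" for part in value.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def product_link_xpath(sku: str) -> str:
    """Build the XPath for the product link whose text contains the SKU."""
    return (
        "//div[contains(@class, 'itemDescription')]//h3"
        f"//a[contains(text(), {xpath_literal(sku)})]"
    )


expression = product_link_xpath("WL16-8")
```

The resulting string can be passed to find_elements_by_xpath in place of the hand-written expression.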

Selenium: How use while loop to click link if it exists?

I am trying to write a Python program that uses Selenium to click a button to go to the next page if the button is clickable. I need this because I am scraping a varying number of pages.
I have tried to use a while loop that checks the href attribute, but the code doesn't click the button, nor does it return an error. If I simply write button.click(), without a while loop or a conditional check for the href attribute, the program clicks the button correctly.
My code also has a while loop condition of "variable is not None". Is this a valid usage of "is not"? My logic is for the program to click the button to go to the next page if there is an href available from the link to click.
Code:
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
import numpy as np
import pandas as pd
PATH = r"C:\Program Files (x86)\chromedriver.exe"
wd = webdriver.Chrome(PATH)
wd.get("https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty")
time.sleep(1)
button = wd.find_element_by_xpath("""//a[@aria-label='Next page']""")
#<a tabindex="0" aria-label="Next page" class="ng-star-inserted" style=""> Next <span class="show-for-sr">page</span></a>
href_data = button.get_attribute('href')
while (href_data is not None):
    time.sleep(0.5)
    button.click()
    href_data = button.get_attribute('href')
Would anyone here be willing to assist me with this? I understand that Selenium requires the user to download a webdriver, so I apologize for any difficulties with testing my code.
Thank you, ExactPlace441
To loop until every page has been clicked:
wd.get('https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty')
wait = WebDriverWait(wd, 10)
while True:
    try:
        wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@aria-label='Next page']"))).click()
        time.sleep(5)
    except:
        break
Imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
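The same click-until-it-fails pattern can be written with an upper bound and a narrower exception filter, so that an unrelated error or a button that never disappears cannot keep the loop going forever. This is only a sketch: click_until_gone, stop_on, and max_clicks are hypothetical names, and the click action is passed in as a callable so the loop itself does not depend on Selenium.

```python
def click_until_gone(click_next, stop_on=(Exception,), max_clicks=1000):
    """Call click_next() until it raises one of `stop_on` or max_clicks is hit.

    Returns the number of successful clicks.
    """
    clicks = 0
    while clicks < max_clicks:
        try:
            click_next()
        except stop_on:
            break  # the button is gone, or it never became clickable
        clicks += 1
    return clicks


# Hypothetical usage with the names from the answer:
# pages = click_until_gone(
#     lambda: wait.until(EC.element_to_be_clickable(
#         (By.XPATH, "//a[@aria-label='Next page']"))).click(),
#     stop_on=(TimeoutException,),  # from selenium.common.exceptions
# )
```

Passing TimeoutException as stop_on means only the wait timing out ends the loop, while any other exception still surfaces.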
I faced the same problem and then switched from Chrome to geckodriver (Selenium with Firefox). My code worked perfectly with Firefox, but the same code did not work with Chrome. Without the while loop I had no problem clicking the button in Chrome, but it stopped working once I added the loop. Switching to geckodriver solved my problem. Here is an example of a while loop that you can use. It keeps clicking the button until the button disappears or the last page is reached.
i = 1
try:
    while i < 2:
        button_element = driver.find_element_by_xpath("give your button xpath")
        button_element.click() # the loop continues until the button's XPath disappears from the page
except:
    pass # once the button's XPath disappears, the error is ignored and execution moves on to the next section of the code
Here I modified your code:
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time
import numpy as np
import pandas as pd
driver = webdriver.Firefox()
driver.maximize_window()
url = "https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty"
driver.get(url)
timeout = 20
# This block collects data from the first page
containers = WebDriverWait(driver, timeout).until(EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="column ng-star-inserted"]')))
for container in containers:
    name = container.find_element_by_css_selector('.header-details h5') # we are scraping the name from each page
    print(name.text)
i = 1
try:
    while i < 2: # look for the "next page" button on every page and keep clicking it until the last page is reached
        next_page_button = driver.find_element_by_xpath("//li[@class='pagination-next ng-star-inserted']")
        next_page_button.click()
        # this block collects data from the second page through the last page
        containers = WebDriverWait(driver, timeout).until(EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="column ng-star-inserted"]')))
        for container in containers:
            name = container.find_element_by_css_selector('.header-details h5') # we are scraping the name from each page
            print(name.text)
        time.sleep(3)
except:
    pass # if a page has no "next page" button, the code ends without an error

Error: 'list' object has no attribute 'click' - Selenium Webdriver [duplicate]

I'd like to click the 'Annual' button on a page that is set to 'Quarterly' by default. There are two links with basically the same name, except that one has data-ptype="Annual", so I tried copying the XPath to click the button (I also tried other options, but none worked).
However, I get AttributeError: 'list' object has no attribute 'click'. I read a lot of similar posts but wasn't able to fix my problem, so I assume the JavaScript event must be triggered somehow differently. I'm stuck.
from selenium import webdriver
link = 'https://www.investing.com/equities/apple-computer-inc-balance-sheet'
driver = webdriver.Firefox()
driver.get(link)
elm = driver.find_elements_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]").click()
The html is the following:
<a class="newBtn toggleButton LightGray" href="javascript:void(0);" data-type="rf-type-button" data-ptype="Annual" data-pid="6408" data-rtype="BAL">..</a>
You need to use find_element_by_xpath, not find_elements_by_xpath, which returns a list:
driver.find_element_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]").click()
Also, I think it is better to use waits, for example:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.firefox.options import Options
options = Options()
options.add_argument("--window-size=1920,1080")
driver = webdriver.Firefox(options=options) # firefox_options is the older, deprecated keyword
driver.get(link) # link as defined in the question
path = "/html/body/div[5]/section/div[8]/div[1]/a[1]"
try:
    element = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.XPATH, path)))
    element.click()
finally:
    driver.quit()
I would still suggest going with link text over XPath. The XPath /html/body/div[5]/section/div[8]/div[1]/a[1] is absolute and will break if a div is added to or removed from the HTML, whereas the link text is much less likely to change.
So, instead of this code:
elm = driver.find_elements_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]").click()
try this code:
annual_link = driver.find_element_by_link_text('Annual')
annual_link.click()
And yes, @Druta is right: use find_element for a single web element and find_elements for a list of web elements. It is also always good to have an explicit wait.
Create an explicit wait instance like this:
wait = WebDriverWait(driver, 20)
and use the wait reference like this:
wait.until(EC.element_to_be_clickable((By.LINK_TEXT, 'Annual')))
UPDATE:
from selenium import webdriver
link = 'https://www.investing.com/equities/apple-computer-inc-balance-sheet'
driver = webdriver.Firefox()
driver.maximize_window()
wait = WebDriverWait(driver,40)
driver.get(link)
driver.execute_script("window.scrollTo(0, 200)")
wait.until(EC.element_to_be_clickable((By.LINK_TEXT, 'Annual')))
annual_link = driver.find_element_by_link_text('Annual')
annual_link.click()
print(annual_link.text)
Make sure to import these:
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
As per the documentation, find_elements_by_xpath(xpath) returns a list of the elements found, or an empty list if none were found. A Python list has no click() method. find_element_by_xpath(xpath), by contrast, returns a single element, which does have a click() method. So you have to locate a single element, here by inducing a wait through WebDriverWait in conjunction with expected_conditions set to element_to_be_clickable(locator), as follows:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='newBtn toggleButton LightGray' and @data-type='rf-type-button']"))).click()
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Notice that find_elements_by_xpath is plural: it returns a list of elements, not just one. The list can contain zero, exactly one, or more elements.
You can for example click the first match with:
driver.find_elements_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]")[0].click()
or iterate through the list and click all these elements, or you can use the find_element_by_xpath (which returns a single element, if it can be found):
driver.find_element_by_xpath("/html/body/div[5]/section/div[8]/div[1]/a[1]").click()
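The difference can be reproduced without a browser. The sketch below uses FakeElement, a hypothetical stand-in for a Selenium WebElement, to show why calling click() on the list fails and how a small guard (click_first, also a hypothetical name) can handle the empty-list case that indexing with [0] would otherwise turn into an IndexError.

```python
class FakeElement:
    """Stand-in for a Selenium WebElement; records whether it was clicked."""

    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


def click_first(elements):
    """Click the first element of a find_elements-style list and return it."""
    if not elements:
        raise LookupError("no matching elements to click")
    elements[0].click()
    return elements[0]


found = [FakeElement(), FakeElement()]  # a find_elements_* call returns a list

try:
    found.click()  # the mistake from the question: click() called on the list
except AttributeError as exc:
    error_message = str(exc)

clicked = click_first(found)  # clicks only the first match
```

With a real driver, found would be the result of find_elements_by_xpath, and click_first(found) would behave like the [0].click() variant while failing with a clearer message when nothing matched.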
For me, none of this worked, even after trying a lot of tricks. Some people recommended driver.implicitly_wait(10) instead of time.sleep(10), which didn't work either. So try adding time.sleep(10) both before and after the .click() line, and check whether that works.

Selenium, Autoit and iframe

I am trying to automate a control on a page that has an iframe and an element that can be controlled with AutoIt. I need to click the Scan button inside the iframe. I used driver.switch_to.frame("frmDemo") to switch frames, but it does not seem to work. Any ideas?
Here is the code:
import win32com.client
import time
from selenium import webdriver
autoit = win32com.client.Dispatch("AutoItX3.Control")
# create a new Firefox session
driver = webdriver.Firefox()
driver.implicitly_wait(30)
driver.get("http://example.com")
time.sleep(2)
driver.switch_to.frame("frmDemo")
scanButton = driver.find_element_by_css_selector('body.input[type="button"]')
scanButton.click()
input is not a class; it is a child element of body. Try it without body:
scanButton = driver.find_element_by_css_selector('input[type="button"]')
You can also try the value attribute:
scanButton = driver.find_element_by_css_selector('input[value="Scan"]')