I wrote a script to scrape data from SubGraph APIs. It simply clicks the run button and reads the output. The problem is that I have to scroll to the end of the page to get the full output; otherwise it is cut off. This is what I tried:
def find_datasets():
    datasets_url = []
    s = Service(ChromeDriverManager().install())
    options = Options()
    options.headless = False
    options.add_argument('window-size=800,600')
    driver = webdriver.Chrome(service=s, options=options)
    driver.get("https://v4.subgraph.polygon.oceanprotocol.com/subgraphs/name/oceanprotocol/ocean-subgraph/graphql?query=%7B%0A%20%20pools(orderBy%3A%20createdTimestamp%2C%20orderDirection%3A%20desc)%20%7B%0A%20%20%20%20id%0A%20%20%20%20datatoken%20%7B%0A%20%20%20%20%20%20address%0A%20%20%20%20%7D%0A%20%20%20%20publishMarketSwapFee%0A%20%20%20%20liquidityProviderSwapFee%0A%20%20%7D%0A%7D%0A")
    sleep(15)
    driver.find_element(by=By.XPATH, value="//button[contains(@class, 'execute-button')]").click()
    sleep(8)
    element = driver.find_elements(by=By.XPATH, value="//div[contains(@class, 'CodeMirror-lines')]")
    driver.execute_script("arguments[0].scrollIntoView(true);", element);
    print(driver.find_element(by=By.XPATH, value="//div[contains(@class, 'result-window')]").text)
    driver.get_screenshot_as_file("screenshot.png")
What am I missing? Thank you for your patience.
Have you tried the ActionChains class?
Example:
menu = driver.find_element(By.CSS_SELECTOR, ".nav")
hidden_submenu = driver.find_element(By.CSS_SELECTOR, ".nav #submenu1")
actions = ActionChains(driver)
actions.move_to_element(menu)
actions.click(hidden_submenu)
actions.perform()
Alternatively, you could try calling only
driver.execute_script("arguments[0].scrollIntoView();", element)
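Before acting on the target element, you need to invoke scrollIntoView() on it first. Note that element must be a single WebElement (from find_element); find_elements returns a list, which scrollIntoView cannot take.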
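As a rough sketch of the scroll-first idea above, a small helper can accept either a single element or the list that find_elements returns and scroll the first match into view. The name scroll_into_view is hypothetical (it is not part of Selenium); only driver.execute_script is an actual Selenium call.

```python
def scroll_into_view(driver, target):
    """Scroll one element into view and return it.

    `target` may be a single WebElement or the list returned by
    find_elements; for a list, the first match is used.
    (Hypothetical helper, not part of Selenium.)
    """
    if isinstance(target, (list, tuple)):
        if not target:
            raise ValueError("no element matched the locator")
        target = target[0]
    # scrollIntoView is a DOM method, so it needs exactly one element.
    driver.execute_script("arguments[0].scrollIntoView(true);", target)
    return target
```

In the question's code this would be called as scroll_into_view(driver, element) before reading the text of the result window.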
I want to scrape this site for some of my natural language processing work. I have a subscription to the website, but I still cannot get the result: I get an error saying the element cannot be located.
The link to login page is login
This is the code I tried in Python with Selenium.
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--incognito')
options.add_argument('--headless')
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_experimental_option('useAutomationExtension', False)
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_argument("disable-infobars")
driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver", options=options)
driver.get('https://login.newscorpaustralia.com/login?state=hKFo2SBmOXc1TjRJNDlBX3hObkZPN1NsRWgzcktONTlPVnJMS6FupWxvZ2luo3RpZNkgUi1ZRmV2Z2dwcWJmZUpqdWtZdk5CUUllX0h3YngwanSjY2lk2SAwdjlpN0tvVzZNQkxTZmUwMzZZU1FUNzl6QThaYXo0WQ&client=0v9i7KoW6MBLSfe036YSQT79zA8Zaz4Y&protocol=oauth2&response_type=token%20id_token&scope=openid%20profile&audience=newscorpaustralia&site=couriermail&redirect_uri=https%3A%2F%2Fwww.couriermail.com.au%2Fremote%2Fidentity%2Fauth%2Flatest%2Flogin%2Fcallback.html%3FredirectUri%3Dhttps%253A%252F%252Fwww.couriermail.com.au%252Fsearch-results%253Fq%253Djason%252520huny&prevent_sign_up=true&nonce=7j4grLXRD39EVhGsxcagsO5c-PtAY4Md&auth0Client=eyJuYW1lIjoiYXV0aDAuanMiLCJ2ZXJzaW9uIjoiOS4xOS4wIn0%3D')
time.sleep(10)
elem = driver.find_element(by=By.CLASS_NAME,value='navigation_search')
username = driver.find_element(by=By.ID,value='1-email')
password = driver.find_element(by=By.NAME,value='password')
login = driver.find_element(by=By.NAME,value='submit')
username.send_keys("myid");
password.send_keys("password");
login.click();
time.sleep(20)
soup = BeautifulSoup(driver.page_source, 'html.parser')
search = driver.find_element(by=By.CSS_SELECTOR,value='form.navigation_search')
search.click();
search.send_keys("jason hunt");
print(driver.page_source)
Below is the error that I am getting. I want to grab the search icon and send the keys there but I am not getting the search form after login.
Below is the text based HTML of the element.
I tried printing the page source, and I could not find the HTML element there either.
Not a proper answer, but since you can't add formatting to comments and this has the same desired effect:
driver.get("https://www.couriermail.com.au/search-results")
WebDriverWait(driver, timeout=10).until(lambda d: d.find_element(By.CLASS_NAME, "search_box_input"))
searchBox = driver.find_element(By.CLASS_NAME, "search_box_input")
searchBox.send_keys("test")
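For readers unfamiliar with what WebDriverWait(...).until(...) does, the following is a minimal sketch of the polling idea behind it, written in plain Python. It is not Selenium's implementation: wait_until is a hypothetical name, and swallowing every exception is a simplification (WebDriverWait by default only ignores NoSuchElementException).

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    (Hypothetical helper illustrating the polling pattern.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
        except Exception:
            # Simplification: treat any lookup error as "not there yet".
            result = None
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)
```

With a real driver, the condition would be something like lambda: driver.find_element(By.CLASS_NAME, "search_box_input"), which mirrors the lambda passed to until above.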
I'm trying to scrape data from a web page.
The code below works on my local Windows machine, but it doesn't work on an EC2 Linux instance.
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
option = webdriver.firefox.options.Options()
option.add_argument("--headless")
driver = webdriver.Firefox(executable_path=dir_firefox_driver, options=option)
driver.get("https://www.mwave.me/en/mcountdown")
time.sleep(3)
driver.set_window_size(600, 800)
temp = driver.find_element_by_css_selector(".chart_view_more button")
I read that you should wait until the element appears, so I tried the code below:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox(executable_path=dir_firefox_driver, options=option)
driver.get("https://www.mwave.me/en/mcountdown")
time.sleep(3)
driver.set_window_size(600, 800)
try:
    element = WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, ".chart_view_more button"))
    )
except:
    print("there is no element")
    quit()
temp = driver.find_element_by_css_selector(".chart_view_more button")
That doesn't work either. I can't find the difference between my local machine and the EC2 instance.
Can somebody give me a suggestion?
A few things you can try to solve this issue:
Instead of
driver.set_window_size(600, 800)
use:
driver.maximize_window()
Also wait for the element to be visible rather than merely present, as in the code below:
driver = webdriver.Firefox(executable_path=dir_firefox_driver, options=option)
driver.get("https://www.mwave.me/en/mcountdown")
time.sleep(3)
driver.maximize_window()
try:
    element = WebDriverWait(driver, 20).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, ".chart_view_more button"))
    )
except:
    print("there is no element")
    quit()
temp = driver.find_element_by_css_selector(".chart_view_more button")
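The difference between the two conditions is that an element can be present in the DOM without being displayed. A minimal sketch of a visibility check, assuming only the standard find_elements(by, value) and is_displayed() calls, might look like this; first_visible is a hypothetical helper name, not a Selenium function.

```python
def first_visible(driver, by, value):
    """Return the first matching element that is displayed, else None.

    Presence only means the element is in the DOM; this additionally
    requires is_displayed() to be true. (Hypothetical helper.)
    """
    for element in driver.find_elements(by, value):
        if element.is_displayed():
            return element
    return None
```

It could be passed to a wait as WebDriverWait(driver, 20).until(lambda d: first_visible(d, By.CSS_SELECTOR, ".chart_view_more button")), since until keeps polling while the callable returns None.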
Please help.
Selenium does not click the element even though the element is clickable (Selenium does not throw an exception).
I tried id, CSS, and XPath locators; none of them helped.
What should I do to solve my problem?
Java code example:
WebElement sector = webDriver.findElement(By.id("sector-1"));
sector.click();
After the click, the system should open this page.
It seems that you are trying to interact with an object inside an <svg> element. If so, you cannot handle its child elements simply by using the click() method.
Try this instead:
WebElement svgObject = driver.findElement(By.xpath("//polygon[@id='sector-1:canvas']"));
Actions builder = new Actions(driver);
builder.click(svgObject).build().perform();
Use the XPath below:
WebElement sector = webDriver.findElement(By.xpath("//g[@id='sector-1']/polygon"));
sector.click();
OR
WebElement sector = webDriver.findElement(By.xpath("//polygon[@id='sector-1:canvas']"));
sector.click();
I solved my problem.
WebDriverWait wait = new WebDriverWait(webDriver, 10);
wait.until(ExpectedConditions.elementToBeClickable(By.xpath("//*[@id=\"sector-2:canvas\"]")));
WebElement svgObject = webDriver.findElement(By.xpath("//*[@id=\"sector-2:canvas\"]"));
Actions builder = new Actions(webDriver);
builder.click(svgObject).build().perform();
I am trying to automate a page that contains an iframe and an element that can be controlled with AutoIt. I need to click the Scan button inside the iframe. I used driver.switch_to.frame("frmDemo") to switch frames, but it does not seem to work. Any ideas?
Here is the code:
import win32com.client
import time
from selenium import webdriver
autoit = win32com.client.Dispatch("AutoItX3.Control")
# create a new Firefox session
driver = webdriver.Firefox()
driver.implicitly_wait(30)
driver.get("http://example.com")
time.sleep(2)
driver.switch_to.frame("frmDemo")
scanButton = driver.find_element_by_css_selector('body.input[type="button"]')
scanButton.click()
input is not a class; it is a child element of body, and body.input matches a body element that has the class input. Try it without body:
scanButton = driver.find_element_by_css_selector('input[type="button"]')
You can also try matching on the value attribute:
scanButton = driver.find_element_by_css_selector('input[value="Scan"]')
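Putting the frame switch and the corrected selector together, a sketch of the whole step could look like the following. click_in_frame is a hypothetical helper; it relies only on switch_to.frame, switch_to.default_content and find_element(by, value), and passes the locator strategy as the plain string "css selector", which is the value of By.CSS_SELECTOR.

```python
def click_in_frame(driver, frame, css):
    """Switch into `frame`, click the first element matching `css`,
    then switch back to the top-level document.
    (Hypothetical helper, not part of Selenium.)
    """
    driver.switch_to.frame(frame)
    try:
        # "css selector" is the string value of By.CSS_SELECTOR.
        driver.find_element("css selector", css).click()
    finally:
        # Leave the iframe so later lookups are not scoped to it.
        driver.switch_to.default_content()
```

For the page in the question, the call would be click_in_frame(driver, "frmDemo", 'input[value="Scan"]'), assuming the button really carries value="Scan".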
I'm using Selenium to navigate to the web page I want, and then parsing the page with Beautiful Soup.
Somebody has shown how to get the inner HTML of an element in Selenium WebDriver. Is there a way to get the HTML of the whole page? Thanks.
The sample code in Python
(Based on the post above, the language seems to not matter too much):
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from bs4 import BeautifulSoup
url = 'http://www.google.com'
driver = webdriver.Firefox()
driver.get(url)
the_html = driver---somehow----.get_attribute('innerHTML')
bs = BeautifulSoup(the_html, 'html.parser')
To get the HTML for the whole page:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("http://stackoverflow.com")
html = driver.page_source
To get the outer HTML (tag included):
# HTML from `<html>`
html = driver.execute_script("return document.documentElement.outerHTML;")
# HTML from `<body>`
html = driver.execute_script("return document.body.outerHTML;")
# HTML from element with some JavaScript
element = driver.find_element_by_css_selector("#hireme")
html = driver.execute_script("return arguments[0].outerHTML;", element)
# HTML from element with `get_attribute`
element = driver.find_element_by_css_selector("#hireme")
html = element.get_attribute('outerHTML')
To get the inner HTML (tag excluded):
# HTML from `<html>`
html = driver.execute_script("return document.documentElement.innerHTML;")
# HTML from `<body>`
html = driver.execute_script("return document.body.innerHTML;")
# HTML from element with some JavaScript
element = driver.find_element_by_css_selector("#hireme")
html = driver.execute_script("return arguments[0].innerHTML;", element)
# HTML from element with `get_attribute`
element = driver.find_element_by_css_selector("#hireme")
html = element.get_attribute('innerHTML')
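The variants above can be folded into one small function. This is only a sketch: get_html is a hypothetical name, and it uses nothing beyond execute_script and get_attribute as shown in the snippets above.

```python
def get_html(driver, element=None, outer=True):
    """Return the HTML of `element`, or of the whole document if None.

    outer=True includes the element's own tag (outerHTML);
    outer=False returns only its contents (innerHTML).
    (Hypothetical helper, not part of Selenium.)
    """
    prop = "outerHTML" if outer else "innerHTML"
    if element is None:
        # No element given: read the property from <html> itself.
        return driver.execute_script(f"return document.documentElement.{prop};")
    return element.get_attribute(prop)
```

The returned string can then be handed to the parser from the question, for example BeautifulSoup(get_html(driver), 'html.parser').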
driver.page_source is the Python property. With the JavaScript bindings (selenium-webdriver), the following worked for me:
let html = await driver.getPageSource();
Reference: https://seleniumhq.github.io/selenium/docs/api/javascript/module/selenium-webdriver/ie_exports_Driver.html#getPageSource
Using page object in Java:
@FindBy(xpath = "xpath")
private WebElement element;
public String getInnerHtml() {
    // waitUntilElementToBeClickable is the page object's own helper (not shown here)
    String innerHtml = waitUntilElementToBeClickable(element, 10).getAttribute("innerHTML");
    System.out.println(innerHtml);
    return innerHtml;
}
A C# snippet for those of us who might want to copy / paste a bit of working code some day
var element = yourWebDriver.FindElement(By.TagName("html"));
string outerHTML = element.GetAttribute(nameof(outerHTML));
Thanks to those who answered before me. Anyone in the future who benefits from this snippet of C# that gets the HTML for any page element in a Selenium test, please consider up voting this answer or leaving a comment.