How to poll Selenium Hub for the number of its registered Nodes?

I searched for an answer in the Selenium Grid documentation but couldn't find anything.
Is it somehow possible to poll the Selenium Hub and get back the number of nodes that are registered to it?

If you check the Grid Console (http://selenium.hub.ip.address:4444/grid/console) you will find valuable information about the Grid's nodes, browsers, IPs, etc.
This is my grid. I have two nodes, one Linux and one Windows.
If you go through the links (Configuration, View Config, ...) you will find information about each node and browser.

I finally put together this:
def grid_nodes_num(grid_console_url="http://my_super_company.com:8080/grid/console#"):
    import requests
    from bs4 import BeautifulSoup

    r = requests.get(grid_console_url)
    html_doc = r.text
    soup = BeautifulSoup(html_doc, "html.parser")
    # print(soup.prettify())  # for debugging
    grid_nodes = soup.find_all("p", class_="proxyid")
    if not grid_nodes:
        print("-No Nodes detected. Grid is down!-")
        return 0
    nodes_num = len(grid_nodes)
    print("-Detected", nodes_num, "node(s)-")
    return nodes_num
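Depending on your Grid version, the hub may also expose a JSON status endpoint; on Selenium Grid 3 this is typically /grid/api/hub. A minimal sketch of querying it (note: slotCounts counts browser slots rather than nodes, and the exact response fields are an assumption that may vary by version):
import requests

def grid_slot_counts(hub_url="http://selenium.hub.ip.address:4444"):
    # Query the hub's JSON API instead of scraping the console HTML.
    # The "slotCounts" field is assumed from Grid 3 responses.
    r = requests.get(hub_url + "/grid/api/hub")
    r.raise_for_status()
    return r.json().get("slotCounts", {})  # e.g. {"free": 4, "total": 10}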

Related

How to use time.sleep to make Selenium output consistent

This might be the stupidest question I've asked yet, but this is driving me nuts...
Basically I want to get all links from profiles, but for some reason Selenium gives different amounts of links most of the time (sometimes all, sometimes only a tenth).
I experimented with time.sleep and I know it's affecting the output somehow, but I don't understand where the problem is.
(But that's just my hypothesis; maybe it's wrong.)
I have no other explanation for why I get inconsistent output. Since I do get all profile links from time to time, the program is clearly able to find all relevant profiles.
Here's what the output should be (for different GUI inputs):
input: anlagenbau, output: 3070
input: Fahrzeugbau, output: 4065
input: laserschneiden, output: 1311
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.common.exceptions import NoSuchElementException
from datetime import date
from datetime import datetime
import easygui
import re
import time

# input window for the search term ("Suchbegriff")
suchbegriff = easygui.enterbox("Suchbegriff eingeben | Hinweis: suchbegriff sollte kein '/' enthalten")

# get date and time
now = datetime.now()
current_time = now.strftime("%H-%M-%S")
today = date.today()
date = today.strftime("%Y-%m-%d")

def get_profile_url(label_element):
    # get the URL from a result element
    onclick = label_element.get_attribute("onclick")
    # some regex magic
    return re.search(r"(?<=open\(\')(.*?)(?=\')", onclick).group()

def load_more_results():
    # load more results if needed // use only on the search page!
    button_wrapper = wd.find_element_by_class_name("loadNextBtn")
    button_wrapper.find_element_by_tag_name("span").click()

#### Script starts here ####

# set some Selenium options
options = webdriver.ChromeOptions()
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")

# webdriver
wd = webdriver.Chrome(options=options)

# load URL
wd.get("https://www.techpilot.de/zulieferer-suchen?" + str(suchbegriff))

# let's first wait for the iframe
iframe = WebDriverWait(wd, 5).until(
    EC.frame_to_be_available_and_switch_to_it("efficientSearchIframe")
)

# the result parent
result_pane = WebDriverWait(wd, 5).until(
    EC.presence_of_element_located((By.ID, "resultPane"))
)

# get all profile links as a list
time.sleep(5)
href_list = []
wait = WebDriverWait(wd, 15)
while True:
    try:
        # time.sleep(1)
        wd.execute_script("loadFollowing();")
        # time.sleep(1)
        try:
            wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".fancyCompLabel")))
        except TimeoutException:
            break
        # time.sleep(1)  # somehow influences how the results are found
        result_elements = wd.find_elements_by_class_name("fancyCompLabel")
        # time.sleep(1)
        for element in result_elements:
            url = get_profile_url(element)
            href_list.append(url)
        # time.sleep(2)
        while True:
            try:
                element = wd.find_element_by_class_name('fancyNewProfile')
                wd.execute_script("""var element = arguments[0];element.parentNode.removeChild(element);""", element)
            except NoSuchElementException:
                break
    except NoSuchElementException:
        break

wd.close()  # doesn't work yet
print("####links secured: " + str(len(href_list)))
Since you say that the sleep is affecting the number of results, it sounds like they're loading asynchronously and populating as they're loaded, instead of all at once.
The first question is whether you can ask the web site developers to change this, to only show them when they're all loaded at once.
Assuming you don't work for the same company as them, consider:
Is there something else on the page that shows up when they're all loaded? It could be a button or a status message, for instance. Can you wait for that item to appear, and then get the list?
How frequently do new items appear? You could poll for the number of results relatively infrequently, such as only every 2 or 3 seconds, and then consider the results all present when you get the same number of results twice in a row.
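A minimal sketch of that poll-until-stable idea, reusing the .fancyCompLabel selector from the question (the timings are placeholders):
def wait_for_stable_count(wd, css_selector=".fancyCompLabel", poll_every=2, timeout=60):
    # Poll the element count and treat the results as fully loaded once
    # two consecutive polls return the same non-zero number.
    last_count = -1
    deadline = time.time() + timeout
    while time.time() < deadline:
        count = len(wd.find_elements_by_css_selector(css_selector))
        if count > 0 and count == last_count:
            return count
        last_count = count
        time.sleep(poll_every)
    raise TimeoutError("result count never stabilized")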
The issue is that the method presence_of_all_elements_located doesn't wait for all elements matching the passed locator. It waits for the presence of at least one element matching the locator, and then returns a list of the elements found on the page at that moment.
In Java we have
wait.until(ExpectedConditions.numberOfElementsToBeMoreThan(element, expectedElementsAmount));
and
wait.until(ExpectedConditions.numberOfElementsToBe(element, expectedElementsAmount));
With these methods you can wait for a predefined number of elements to appear.
Selenium with Python doesn't support these methods out of the box.
The only option with Selenium in Python is to build a custom method that does the same thing.
So if you are expecting some number of elements/links to appear or be present on the page, you can use such a method.
This will make your test stable and avoid hardcoded sleeps.
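For example, a custom condition along these lines (a sketch mirroring Java's numberOfElementsToBeMoreThan; this class is not part of Selenium's Python API):
class number_of_elements_to_be_more_than:
    # Truthy once more than `count` elements match the locator,
    # so WebDriverWait keeps polling until that happens.
    def __init__(self, locator, count):
        self.locator = locator
        self.count = count

    def __call__(self, driver):
        elements = driver.find_elements(*self.locator)
        return elements if len(elements) > self.count else False

It can then be used like any built-in condition, e.g. WebDriverWait(wd, 15).until(number_of_elements_to_be_more_than((By.CSS_SELECTOR, ".fancyCompLabel"), 10)).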
UPD
I have found a solution.
It looks to be the Python equivalent of the methods mentioned above.
This seems to be the Python equivalent of wait.until(ExpectedConditions.numberOfElementsToBeMoreThan(element, expectedElementsAmount)); in Java:
myLength = 9
WebDriverWait(browser, 20).until(lambda browser: len(browser.find_elements_by_xpath("//img[@data-blabla]")) > int(myLength))
And this:
myLength = 10
WebDriverWait(browser, 20).until(lambda browser: len(browser.find_elements_by_xpath("//img[@data-blabla]")) == int(myLength))
is the equivalent of wait.until(ExpectedConditions.numberOfElementsToBe(element, expectedElementsAmount)); in Java.

Selenium fails to load elements, despite EC, waits, and scrolling attempts

With Selenium (3.141), BeautifulSoup (4.7.9), and Python (3.7.9), I'm trying to scrape which streaming, rental, and buying options are available for a given movie/show. I've spent hours trying to solve this, so any help would be appreciated. Apologies for the poor formatting, in terms of mixing in comments and prior attempts.
Example Link: https://www.justwatch.com/us/tv-show/24
The desired outcome is a BeautifulSoup element that I can then parse (e.g., which streaming services have it, how many seasons are available, etc.),
which has 3 elements (as of now): Hulu, IMDb TV, and DirecTV.
I tried numerous variations, but only get one of the 3 streaming services for the example link, and even then it's not a consistent result. Often, I get an empty object.
Some of the things that I've tried include waiting for an expected condition (presence or visibility) and explicitly using sleep() from the time library. I'm using a Mac (but running Linux via a USB), so there is no "PAGE DOWN" on the physical keyboard. For the Keys module, I've tried control+arrow down, page down, and space (space bar), but on this particular web page they don't work. However, if I'm browsing in a normal fashion, control+arrow down and the space bar do scroll the desired section into view. As far as I know, there is no fn+arrow down option in Keys, but that's another way I can scroll normally.
I've run both headless and regular options to try to debug, as well as trying both Firefox and Chrome drivers.
Here's my code:
import time
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
firefox_options = Options()
firefox_options.add_argument('--enable-javascript') # double-checking to make sure that javascript is enabled
firefox_options.add_argument('--headless')
firefox_driver_path = 'geckodriver'
driver = webdriver.Firefox(executable_path=firefox_driver_path, options=firefox_options)
url_link = 'https://www.justwatch.com/us/tv-show/24'
driver.get(url_link) # initial page
cookies = driver.get_cookies()
Examples of things I've tried around this part of the code:
various time.sleep(3) and driver.implicitly_wait(3) calls
webdriver.ActionChains(driver).key_down(Keys.CONTROL).key_down(Keys.ARROW_DOWN).perform()
webdriver.ActionChains(driver).key_down(Keys.SPACE).perform()
This code yields a timeout error when used:
stream_results = WebDriverWait(driver, 15)
stream_results.until(EC.presence_of_element_located(
    (By.CLASS_NAME, "price-comparison__grid__row price-comparison__grid__row--stream")))
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser') # 'lxml' didn't work either
Here's the code for getting the HTML related to the streaming services. I've also tried to grab the HTML at various levels, IDs, and classes of the tree, but the code just isn't there:
stream_row = soup.find('div', attrs={'class':'price-comparison__grid__row price-comparison__grid__row--stream'})
stream_row_holder = soup.find('div', attrs={'class':'price-comparison__grid__row__holder'})
stream_items = stream_row_holder\
.find_all('div', attrs={'class':'price-comparison__grid__row__element__icon'})
driver.quit()
I'm not sure if you are saying your code works in some cases or not at all, but I use Chrome and the four find_all() lines at the end all produce results. If this isn't what you mean, let me know. The one thing you may be missing is a time.sleep() that is long enough. That could be the only difference...
Note you need chromedriver to run this code, but perhaps you have Chrome and can download chromedriver.exe.
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup

chrome_options = Options()
chrome_options.add_argument("--headless")
url = 'https://www.justwatch.com/us/tv-show/24'
driver = webdriver.Chrome('chromedriver.exe', options=chrome_options)
driver.get(url)
time.sleep(5)
html = driver.page_source
soup = BeautifulSoup(html, "html.parser")
soup.find_all(class_="price-comparison__grid__row__price")
soup.find_all(class_="price-comparison__grid__row__holder")
soup.find_all(class_="price-comparison__grid__row__element__icon")
soup.find_all(class_="price-comparison__grid__row--stream")
This is the output from the last line:
[<div class="price-comparison__grid__row price-comparison__grid__row--stream"><div class="price-comparison__grid__row__title price-comparison__promoted__title"> Stream </div><div class="price-comparison__grid__row__holder"><!-- --><div class="price-comparison__grid__row__element"><div class="presentation-type price-comparison__grid__row__element__icon"><img alt="Hulu" class="jw-provider-icon price-comparison__grid__row__icon" src="https://images.justwatch.com/icon/116305230/s100" title="Hulu"/><div class="price-comparison__grid__row__price"> 9 Seasons <span class="price-comparison__badge price-comparison__badge--hd price-comparison__badge--hd"> HD </span></div></div></div><!-- --></div></div>,
<div class="price-comparison__grid__row price-comparison__grid__row--stream"><div class="price-comparison__grid__row__title"> Stream </div><div class="price-comparison__grid__row__holder"><!-- --><div class="price-comparison__grid__row__element"><div class="presentation-type price-comparison__grid__row__element__icon"><img alt="Hulu" class="jw-provider-icon price-comparison__grid__row__icon" src="https://images.justwatch.com/icon/116305230/s100" title="Hulu"/><div class="price-comparison__grid__row__price"> 9 Seasons <span class="price-comparison__badge price-comparison__badge--hd price-comparison__badge--hd"> HD </span></div></div></div><div class="price-comparison__grid__row__element"><div class="presentation-type price-comparison__grid__row__element__icon"><img alt="IMDb TV" class="jw-provider-icon price-comparison__grid__row__icon" src="https://images.justwatch.com/icon/134049674/s100" title="IMDb TV"/><div class="price-comparison__grid__row__price"> 8 Seasons <!-- --></div></div></div><div class="price-comparison__grid__row__element"><div class="presentation-type price-comparison__grid__row__element__icon"><img alt="DIRECTV" class="jw-provider-icon price-comparison__grid__row__icon" src="https://images.justwatch.com/icon/158260222/s100" title="DIRECTV"/><div class="price-comparison__grid__row__price"> 1 Season <span class="price-comparison__badge price-comparison__badge--hd price-comparison__badge--hd"> HD </span></div></div></div><!-- --></div></div>]
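Once the soup contains that row, the provider names and season counts can be pulled from the attributes visible in the output above; a quick sketch, assuming the same class names:
# Extract provider names and season counts from the streaming rows.
for row in soup.find_all(class_="price-comparison__grid__row--stream"):
    for icon in row.find_all("img", class_="jw-provider-icon"):
        provider = icon.get("title") or icon.get("alt")
        price = icon.find_next("div", class_="price-comparison__grid__row__price")
        seasons = price.get_text(" ", strip=True) if price else ""
        print(provider, "-", seasons)  # e.g. "Hulu - 9 Seasons HD"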

Does using selenium change the token of the session?

I am writing a tool to check a MAC address online with Selenium. I managed to find the input and the submit button, but when I ask for the results it prints the session ID and the token.
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

## set up options
options = Options()
options.headless = True
browser = webdriver.Firefox(options=options, executable_path=r"geckodriver_path")
browser.get("site-URL")
## mac address sent to site
elem = browser.find_element_by_id('result')
elemnt = browser.find_element_by_css_selector('#results-log')
print(elem)
print(elemnt)
The output is some session info:
<selenium.webdriver.remote.webelement.WebElement (session="289e304328d8a7900f7003d4ed6530be", element="f807a2e7-8895-4e8d-b7af-ce3d27fbf897")>
I need to get the result that is shown on the site.
You saw it right.
The variable elem is a WebElement identified through browser.find_element_by_id('result')
The variable elemnt is a WebElement identified through browser.find_element_by_css_selector('#results-log')
Printing the element will be in the following format:
<selenium.webdriver.remote.webelement.WebElement (session="289e304328d8a7900f7003d4ed6530be", element="f807a2e7-8895-4e8d-b7af-ce3d27fbf897")>
You can find a relevant discussion in Are element IDs numbers in Webdrivers?
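To read the actual result instead of the element's repr, pull the text (or an attribute) from the WebElement, for example:
print(elem.text)    # visible text of the element
print(elemnt.text)
# or, if the value is stored in an attribute:
# print(elemnt.get_attribute("value"))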

Selenium creating bulk email addresses

I want to use Selenium to create several email addresses at once. I suppose they can be random, but I already have a list of the email account names I want to create.
I know how to create one email using webdriver, but how would I go about signing up several, one after the other automatically, without having to change the code each time?
Simple code for creating 1 email:
from selenium import webdriver
import time

url = 'https://hotmail.com/'
driver = webdriver.Chrome('C:/Users/Desktop/chromedriver')
driver.get(url)
driver.find_element_by_xpath("//a[contains(@class, 'linkButtonSigninHeader')]").click()
time.sleep(2)
driver.find_element_by_id('MemberName').send_keys('usernameexample')
time.sleep(1)
driver.find_element_by_id('iSignupAction').click()
time.sleep(2)
driver.find_element_by_id('PasswordInput').send_keys('Passwordexample1')
time.sleep(1)
driver.find_element_by_id('iSignupAction').click()
time.sleep(2)
driver.find_element_by_id('FirstName').send_keys('john')
time.sleep(1)
driver.find_element_by_id('LastName').send_keys('wayne')
time.sleep(1)
driver.find_element_by_id('iSignupAction').click()
As others have pointed out, you could iterate over a data collection, such as a list:
usernames = ['username_one', 'username_two']

for username in usernames:
    url = 'https://hotmail.com/'
    driver = webdriver.Chrome('C:/Users/Desktop/chromedriver')
    driver.get(url)
    driver.find_element_by_xpath("//a[contains(@class, 'linkButtonSigninHeader')]").click()
    driver.find_element_by_id('MemberName').send_keys(username)  # loop variable used here
    driver.find_element_by_id('iSignupAction').click()
    driver.find_element_by_id('PasswordInput').send_keys('Passwordexample1')
    driver.find_element_by_id('iSignupAction').click()
    driver.find_element_by_id('FirstName').send_keys('john')
    driver.find_element_by_id('LastName').send_keys('wayne')
    driver.find_element_by_id('iSignupAction').click()
    # some step to log out so that the next username can register
If you aren't familiar with arrays or iteration, then I'd suggest looking at the docs to get your head around it: https://ruby-doc.org/core-2.6.1/Array.html#method-i-each
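If your account names live in a text file, loading them into the list is a one-liner (the filename here is hypothetical):
# Read one username per line, skipping blank lines.
usernames = [line.strip() for line in open("usernames.txt") if line.strip()]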

Threading and Selenium

I'm trying to make multiple tabs in Selenium and open a page in each tab simultaneously. Here is the code:
CHROME_DRIVER_PATH = "C:/chromedriver.exe"

from selenium import webdriver
import threading

driver = webdriver.Chrome(CHROME_DRIVER_PATH)
links = ["https://www.google.com/",
         "https://stackoverflow.com/",
         "https://www.reddit.com/",
         "https://edition.cnn.com/"]

def open_page(url, tab_index):
    driver.switch_to_window(handles[tab_index])
    driver.get(url)
    return

# open a blank tab for every link in the list
for link in range(len(links) - 1):  # 1 less because the first tab is already open
    driver.execute_script("window.open();")

handles = driver.window_handles  # get handles

all_threads = []
for i in range(0, len(links)):
    current_thread = threading.Thread(target=open_page, args=(links[i], i))
    all_threads.append(current_thread)
    current_thread.start()

for thr in all_threads:
    thr.join()
Execution goes without errors, and from what I understand this should logically work correctly. But the effect of the program is not as I imagined: it only opens one page at a time, and sometimes it doesn't even switch the tab... Is there a problem in my code that I'm not aware of, or does threading simply not work with Selenium?
There is no need to switch to a new window to load a URL; you can try the below to open each URL in a new tab, one by one:
links = ["https://www.google.com/",
         "https://stackoverflow.com/",
         "https://www.reddit.com/",
         "https://edition.cnn.com/"]

# Open all URLs in new tabs
for link in links:
    driver.execute_script("window.open('{}');".format(link))
# Closing main (empty) tab
driver.close()
Now you can handle (if you want) all the windows from driver.window_handles as usual.
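If you then need to visit each tab, you can switch through the handles one by one, for example:
# Switch to each open tab and read its title/URL.
for handle in driver.window_handles:
    driver.switch_to.window(handle)
    print(driver.title, driver.current_url)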