I am trying to fetch some web data using Selenium to run Firefox in headless mode in a Python3 script. I am running the script on Debian.
The code works fine but is very slow: it takes about half a minute to get the title of a web page. I need to use this in a production environment, and it won't work if it's that slow. I am new to Selenium, so I may be making a basic mistake.
import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
import time
from decimal import Decimal
start_time = time.time()
fireFoxOptions = Options()
fireFoxOptions.headless = True  # set_headless() is deprecated in Selenium 3.141.0
browser = webdriver.Firefox(options=fireFoxOptions)  # the firefox_options= keyword is deprecated
fetchUrl = "https://www.amazon.com/CakCity-Military-Waterproof-Luminous-Stopwatch/dp/B018HTGSN8/ref=sxin_1_ac_d_rm?ac_md=0-0-d2F0Y2g%3D-ac_d_rm&keywords=watch&pd_rd_i=B018HTGSN8&pd_rd_r=ad451c0c-abfe-4436-bee4-d11f9e1dbb1e&pd_rd_w=J9Zl5&pd_rd_wg=AqMuw&pf_rd_p=ed481207-4bea-4e19-bbad-73ed40fdc292&pf_rd_r=A7B9299YFYD1G0JZ81YH&psc=1&qid=1573502062"
browser.get( fetchUrl )
print ( browser.title )
executionTime = round(time.time() - start_time, 2)
print( "- execution time [" + str( executionTime ) + "]" )
browser.close()
browser.quit()
Execution time varies from 20s to 30s
Debian 9.4 (Stretch)
Firefox 70.0.1
Geckodriver 0.26.0
Selenium 3.141.0
Python 3.5.3
I have tried limiting the libraries I import, but that made no difference.
I have tested several different websites, such as just google.com, and the response time is the same. I can load these sites within seconds manually through any browser, so I would think headless mode would be faster. The script does not throw any errors.
Gecko log
1573530232430 mozrunner::runner INFO Running command: "/usr/bin/firefox" "-marionette" "-headless" "-foreground" "-no-remote" "-profile" "/tmp/rust_mozprofileKfYEBK"
*** You are running in headless mode.
1573530233337 addons.webextension.screenshots@mozilla.org WARN Loading extension 'screenshots@mozilla.org': Reading manifest: Invalid extension permission: mozillaAddons
1573530233337 addons.webextension.screenshots@mozilla.org WARN Loading extension 'screenshots@mozilla.org': Reading manifest: Invalid extension permission: telemetry
1573530233338 addons.webextension.screenshots@mozilla.org WARN Loading extension 'screenshots@mozilla.org': Reading manifest: Invalid extension permission: resource://pdf.js/
1573530233338 addons.webextension.screenshots@mozilla.org WARN Loading extension 'screenshots@mozilla.org': Reading manifest: Invalid extension permission: about:reader*
1573530237793 Marionette INFO Listening on port 36801
1573530237852 Marionette WARN TLS certificate errors will be ignored for this session
[Parent 1707, Gecko_IOThread] WARNING: pipe error: Broken pipe: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 728
1573530254051 Marionette INFO Stopped listening on port 36801
I researched all of the warnings listed above and found mixed ideas on solutions.
The "pipe error" does not always show up in the logs; it seems random. It is also odd that it mentions chromium, although I am using Firefox.
I am still unclear on the permission errors even after reading up on them; some say to just disregard them as unimportant.
Any suggestions?
Thanks
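One way to narrow this down is to read the timing straight out of the Gecko log, since each entry starts with a Unix timestamp in milliseconds. The following is only a minimal sketch: the `SAMPLE` text copies four timestamps from the log above (with the first entry shortened), and the function names are made up for illustration.

```python
import re

# Entries in a geckodriver log start with a 13-digit epoch timestamp (ms).
TIMESTAMP = re.compile(r"^(\d{13})\b")

# Four entries taken from the log in the question (first line shortened).
SAMPLE = """\
1573530232430 mozrunner::runner INFO Running command
1573530237793 Marionette INFO Listening on port 36801
1573530237852 Marionette WARN TLS certificate errors will be ignored for this session
1573530254051 Marionette INFO Stopped listening on port 36801
"""


def timestamps(log_text):
    """Return the leading millisecond timestamps found in log_text, in order."""
    found = []
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line.strip())
        if match:
            found.append(int(match.group(1)))
    return found


def gaps_ms(stamps):
    """Return the elapsed milliseconds between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(gaps_ms(timestamps(SAMPLE)))
```

For the log above this gives gaps of 5363, 59 and 16199 ms: roughly 5 seconds for Firefox to start and about 16 seconds for the session itself, which together account for the 20 s or more being reported.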
I'm running a simple CI pipeline on GitLab for a Selenium script headlessly + using webdriver_manager to handle chrome driver binary.
This part is passed:
Get LATEST chromedriver version for None google-chrome
There is no [linux64] chromedriver for browser None in cache
Trying to download new driver from https://chromedriver.storage.googleapis.com/100.0.4896.60/chromedriver_linux64.zip
Driver has been saved in cache [/root/.wdm/drivers/chromedriver/linux64/100.0.4896.60]
But after that I'm getting this error:
WebDriverException: Message: Service /root/.wdm/drivers/chromedriver/linux64/100.0.4896.60/chromedriver unexpectedly exited. Status code was: 127
What is the problem? It seems that webdriver_manager has a problem when running in CI.
Here is a simple script to reproduce it:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")
service = Service(executable_path=ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=chrome_options)
driver.get("http://google.com")
driver.find_element('name', 'q').send_keys("Wikipedia")
This is one of the pipelines:
https://gitlab.com/mmonfared/test/-/jobs/2350697126
This is a sample project:
https://gitlab.com/mmonfared/test
I've also opened an issue in the webdriver_manager GitHub repo, with no answers yet:
https://github.com/SergeyPirogov/webdriver_manager/issues/363
This error message...
WebDriverException: Message: Service /root/.wdm/drivers/chromedriver/linux64/100.0.4896.60/chromedriver unexpectedly exited. Status code was: 127
...implies, given the /root/.wdm/ cache path, that you are executing your tests as the root user.
Deep Dive
As per the ChromeDriver help page "Chrome doesn't start or crashes immediately":
A common cause for Chrome to crash during startup is running Chrome as
root user (administrator) on Linux. While it is possible to work
around this issue by passing --no-sandbox flag when creating your
WebDriver session, such a configuration is unsupported and highly
discouraged. Please configure your environment to run Chrome as a
regular user instead.
Solution
Execute your tests as a non-root user.
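Separately from the root-user point, exit status 127 is what a Linux shell or loader typically reports when a command or one of its shared libraries cannot be found, and the webdriver_manager output in the question says "browser None", which suggests that no Chrome was detected in the CI image. A minimal pre-flight sketch, assuming the usual Linux binary names (the candidate list and the function name are assumptions and are not part of webdriver_manager):

```python
import shutil

# Common Chrome/Chromium executable names on Linux (an assumption; adjust for your image).
CANDIDATES = ("google-chrome", "google-chrome-stable", "chromium", "chromium-browser")


def find_browser(candidates=CANDIDATES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    browser = find_browser()
    if browser is None:
        print("No Chrome/Chromium binary on PATH; install one before starting chromedriver.")
    else:
        print("Found browser at", browser)
```

Running this as the first step of the CI job would show whether a browser is installed at all before chromedriver is started.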
I've got a Lambda Function for headless chrome + python selenium deployed with Serverless framework that runs fine locally but crashes on lambda.
Some basic details:
(Driver info: chromedriver=2.41.578700 (2f1ed5f9343c13f73144538f15c00b370eda6706),platform=Linux 4.14.231-180.360.amzn2.x86_64 x86_64)
Chromium Version: 89xx
selenium==3.141.0
Here is how I'm invoking it with Selenium:
options = Options()
options.binary_location = '/opt/headless-chromium'
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--single-process')
options.add_argument("--remote-debugging-port=9222")
options.add_argument('--disable-dev-shm-usage')
#'/opt/chromedriver' not found
driver = webdriver.Chrome('/opt/chromedriver', chrome_options=options)
driver.get('https://www.neaminational.org.au/')
body = f"Headless Chrome Initialized, Page title: {driver.title}"
driver.close()
driver.quit()
response = {
"statusCode": 200,
"body": body
}
I'm getting the cryptic Message: unknown error: Chrome failed to start: exited abnormally
(chrome not reachable)
(The process started from chrome location /opt/headless-chromium is no longer running, so ChromeDriver is assuming that Chrome has crashed.).
Now I've tested this on my Ubuntu 18 machine (same Chromium binary, same ChromeDriver, same installed Selenium version) and it's working fine, so my issue must be with compatibility with the Lambda Amazon Linux environment.
Can anyone give me some idea of how I could troubleshoot this? It seems silly to stumble around trying different versions when they all seem compatible with each other locally.
Any insight appreciated greatly!
I found this to be really helpful:
https://www.youtube.com/watch?v=jWqbYiHudt8
https://github.com/soumilshah1995/Selenium-on-AWS-Lambda-Python3.7
The versions are the following:
RUNTIME=python3.7
SELENIUM_VER=3.141.0
CHROME_BINARY_VER=v1.0.0-55 # based on Chromium 69.0.3497.81
CHROMEDRIVER_VER=2.43 # supports Chrome v69-71
Credits go to Soumil Nitin Shah.
Best,
Ramón
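The version list above matters because each ChromeDriver release only supports a narrow range of Chrome versions, and the question pairs a ChromeDriver 2.x build with a Chromium 89 binary. Here is a small sketch of that compatibility check, using only the range quoted above for ChromeDriver 2.43; the helper names and the "89.0.0.0" version string are hypothetical and stand in for the "89xx" given in the question.

```python
def major_version(version):
    """Return the leading integer of a dotted version string such as '69.0.3497.81'."""
    return int(version.split(".")[0])


def driver_supports(chrome_version, supported_range):
    """True if the browser's major version lies within the driver's supported range."""
    lowest, highest = supported_range
    return lowest <= major_version(chrome_version) <= highest


# From the answer above: ChromeDriver 2.43 supports Chrome 69 to 71.
CHROMEDRIVER_2_43 = (69, 71)

if __name__ == "__main__":
    # Chromium build used in the answer above.
    print(driver_supports("69.0.3497.81", CHROMEDRIVER_2_43))
    # Hypothetical full version for the Chromium 89 binary from the question.
    print(driver_supports("89.0.0.0", CHROMEDRIVER_2_43))
```

The first call prints True and the second prints False, which is the kind of mismatch worth ruling out before digging into Lambda-specific causes.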
I'm stuck trying to get my Selenium script running on my Raspberry Pi 4 running Raspbian.
The script runs fine on my Mac.
The problem is with setting up the webdriver. I tried installing several webdrivers, including chromedriver, geckodriver, operadriver and phantomjsdriver.
Whenever I try to run the script (which I of course changed to use the corresponding driver), I'm greeted with the following error:
OSError: [Errno 8] Exec format error: 'operadriver'
Opening the driver directly from the shell also results in an error:
pi@raspberrypi:/home/shares/users $ chromedriver
bash: /usr/local/bin/chromedriver: cannot execute binary file: Exec format error
My research found some people who got it to work, but all the posts seemed quite old. Some were suggesting the error points to the CPU architecture, which is armv7l/armhf in my case.
So is it at all possible to get Selenium running on a pi these days? Has anyone got this to work?
This error message with operadriver...
OSError: [Errno 8] Exec format error: 'operadriver'
and this error message with chromedriver...
bash: /usr/local/bin/chromedriver: cannot execute binary file: Exec format error
...implies that the OperaDriver and ChromeDriver binaries you invoked were not built for the CPU architecture of your system.
On your macOS system you used the following WebDriver variants:
GeckoDriver: geckodriver-v0.26.0-macos.tar.gz
ChromeDriver: chromedriver_mac64.tar.gz
where the WebDriver variants matched the underlying os architecture.
Since you are now on the armv7 architecture, you have to download and use executables built for that architecture, from WebDriver driver for the Chromium Browser.
Note that from geckodriver v0.24.0 onwards:
Removed
Turned off builds for arm7hf, which will no longer be released but can still be built from the source.
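To confirm an architecture mismatch before swapping drivers, one option is to read the header of the driver binary itself. This is a rough sketch that assumes the driver is an ELF file, as Linux executables are; the function names are illustrative, and the machine codes cover only a few common CPUs from the ELF specification.

```python
import struct

# e_machine values from the ELF specification for a few common CPUs.
ELF_MACHINES = {0x03: "x86", 0x28: "arm", 0x3E: "x86_64", 0xB7: "aarch64"}


def binary_arch(header):
    """Classify a binary from its first 20 bytes.

    Returns a CPU name for known ELF files, "elf-unknown" for other ELF
    machines, and "not-elf" for anything else (for example a macOS Mach-O file).
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not-elf"
    # Byte 5 (EI_DATA) is 1 for little-endian, 2 for big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, "elf-unknown")


def read_arch(path):
    """Read the header of the file at path and classify it."""
    with open(path, "rb") as handle:
        return binary_arch(handle.read(20))
```

On the Pi, a call such as read_arch('/usr/local/bin/chromedriver') returning anything other than "arm" would be consistent with the Exec format error above.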
References
You can find a couple of relevant discussions in:
OSError: [Errno 8] Exec format error with GeckoDriver and Selenium on MacOS
WebDriverException: Message: Service /usr/lib/chromium-browser/chromedriver unexpectedly exited on Raspberry-Pi with ChromeDriver and Selenium
I'm trying to make an automated test for my webpage and I'm using Jasmine in tandem with selenium.
When testing on chrome (using chromedriver) I get, unpredictably, the error below. It happens frequently enough that when I run a test suite it hardly ever finishes.
I've found evidence of this bug but can't find a solid answer: https://bugs.chromium.org/p/chromedriver/issues/detail?id=732 (granted, this was for Chromium and I'm using Chrome)
WebDriverError: no such session
(Driver info: chromedriver=2.21.371459 (36d3d07f660ff2bc1bf28a75d1cdabed0983e7c4),platform=Mac OS X 10.11.5 x86_64)
at WebDriverError (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/error.js:27:10)
at Object.checkLegacyResponse (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/error.js:639:15)
at parseHttpResponse (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/http/index.js:538:13)
at /Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/http/index.js:472:11
at ManagedPromise.invokeCallback_ (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:1379:14)
at TaskQueue.execute_ (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:2913:14)
at TaskQueue.executeNext_ (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:2896:21)
at /Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:2820:25
at /Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:639:7
at process._tickCallback (node.js:369:9)
From: Task: WebElement.isDisplayed()
at Driver.schedule (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/webdriver.js:377:17)
at WebElement.schedule_ (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/webdriver.js:1744:25)
at WebElement.isDisplayed (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/webdriver.js:2110:17)
at driver.findElements.then.error (/Users/XXXXXXX/Documents/sweetmeeting/Test/front_end_testing/spec/dashboard_tester.js:251:34)
at ManagedPromise.invokeCallback_ (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:1379:14)
at TaskQueue.execute_ (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:2913:14)
at TaskQueue.executeNext_ (/Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:2896:21)
at /Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:2775:27
at /Users/XXXXXXX/Documents/sweetmeeting/node_modules/selenium-webdriver/lib/promise.js:639:7
at process._tickCallback (node.js:369:9)
We also struggled with this issue for a long time and recently resolved it, so I thought I would post here in case it helps someone else.
It turns out that for us it was memory related. We run our tests inside a Docker container, and the Docker default /dev/shm size is 64 MB. Increasing this resolved the "no such session" issue for us.
We use Docker Compose, so we just added shm_size: 256M to the docker-compose.yml file.
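To check whether a container is affected, the size of /dev/shm can be measured from inside it. This is a minimal sketch that takes the 256 MB from this answer as the threshold; that threshold and the function names are assumptions, not an official Chrome requirement.

```python
import shutil


def shm_report(total_bytes, minimum_mb=256):
    """Return (size_in_mb, is_large_enough) for a /dev/shm of total_bytes."""
    size_mb = total_bytes // (1024 * 1024)
    return size_mb, size_mb >= minimum_mb


def check_dev_shm(path="/dev/shm", minimum_mb=256):
    """Measure the filesystem mounted at path and compare it with minimum_mb."""
    usage = shutil.disk_usage(path)
    return shm_report(usage.total, minimum_mb)


if __name__ == "__main__":
    try:
        size_mb, ok = check_dev_shm()
        print("/dev/shm is", size_mb, "MB;", "ok" if ok else "probably too small for Chrome")
    except OSError:
        print("/dev/shm could not be inspected on this system")
```

With the Docker default of 64 MB, shm_report returns (64, False), so running this inside the test container gives a quick yes/no before changing the compose file.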
I recently encountered this exception too. It first appeared to be non-deterministic, but after a thorough investigation I realized that it happens deterministically if you call ChromeDriver.Close() and then try to call FindElement.
In my case, ChromeDriver.Close() was called in an exception handler of a previous test, which happened due to a timing issue. This only affected the next test so it added to the feeling that this issue is flaky. But as I said, my investigation revealed that it is deterministic.
Having said that, this is my experience with that error. Could be that your case is different...
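One way to keep a close() or quit() in one test from leaking into the next is to give every test its own driver and end the session in exactly one place. The sketch below uses a context manager for that. `FakeDriver` is a stand-in invented for this example so that it runs without a browser; with Selenium you would pass a real factory such as webdriver.Chrome instead.

```python
from contextlib import contextmanager


@contextmanager
def fresh_driver(factory):
    """Create a driver for one test and always quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # quit() ends the whole session; the next test must build a new driver.
        driver.quit()


class FakeDriver:
    """Stand-in for a real WebDriver so the sketch runs without a browser."""

    def __init__(self):
        self.alive = True

    def find_element(self, by, value):
        if not self.alive:
            raise RuntimeError("no such session")
        return (by, value)

    def quit(self):
        self.alive = False


if __name__ == "__main__":
    with fresh_driver(FakeDriver) as driver:
        print(driver.find_element("name", "q"))
```

Because the session is ended only in the finally block, an exception handler inside a test no longer needs to call close() itself, and a failure in one test cannot leave the next one holding a dead session.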
This error message...
WebDriverError: no such session
(Driver info: chromedriver=a.b.c (36d3d07f660ff2bc1bf28a75d1cdabed0983e7c4),platform=Mac OS X 10.11.5 x86_64)
...implies that the ChromeDriver was unable to communicate with the existing Browsing Context i.e. Chrome Browser session.
We have discussed and analyzed this issue within the discussion Issue 732: No such session error - inconsistent problem which appears when running tests for a prolonged period. This error is usually observed after an extended period of executing the Test Suite as follows:
[0127/105308:ERROR:nacl_helper_linux.cc(289)] NaCl helper process running without a sandbox!
Most likely you need to configure your SUID sandbox correctly
[489.849][INFO]: RESPONSE FindElements unknown error: session deleted because of page crash
from tab crashed
(Session info: chrome=p.q.r.s)
[489.849][DEBUG]: Log type 'driver' lost 0 entries on destruction
[489.849][DEBUG]: Log type 'browser' lost 9 entries on destruction
This error is defined in nacl_helper_linux.cc as follows:
// If the Zygote has started handling requests, we should be sandboxed via
// the setuid sandbox.
if (!IsSandboxed()) {
LOG(ERROR) << "NaCl helper process running without a sandbox!\n"
<< "Most likely you need to configure your SUID sandbox "
<< "correctly";
More precisely, the FindElement(s) call failed because the page crashed, possibly as a result of the sandbox issue, and the session was then deleted.
Solution
This error can have many different causes, and the possible ways to address it are as follows:
Initiate the Chrome session configuring ChromeDriver with the argument --disable-impl-side-painting
Additionally, you can add the argument --enable-gpu-rasterization, which allows heuristics to determine when a layer tile should be drawn with the Skia GPU backend. Only valid with GPU accelerated compositing + impl-side painting.
As an option, you can also add the argument --force-gpu-rasterization which always uses the Skia GPU backend for drawing layer tiles. Only valid with GPU accelerated compositing + impl-side painting. Overrides the kEnableGpuRasterization flag.
This error is also observed when the server does not recognize the unique session identifier. This happens if the session has been deleted or if the session ID is invalid in either of the following ways:
Explicit session deletion: A WebDriver session is explicitly deleted when you invoke the quit() method, as follows:
from selenium import webdriver
from selenium.common.exceptions import InvalidSessionIdException
driver = webdriver.Chrome(executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
print("Current session is {}".format(driver.session_id))
driver.quit()
try:
    driver.get("https://www.google.com/")
except Exception as e:
    # Python 3 exceptions have no .message attribute, so print the exception itself
    print(e)
# Console output:
# Current session is a9272550-c4e5-450f-883d-553d337eed48
# No active session with ID a9272550-c4e5-450f-883d-553d337eed48
Implicit session deletion: A WebDriver session is implicitly deleted when you close the last window or tab by invoking the close() method, as follows:
from selenium import webdriver
driver = webdriver.Chrome(executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
print("Current session is {}".format(driver.session_id))
# closes current window/tab
driver.close()
try:
    driver.get("https://www.google.com/")
except Exception as e:
    # Python 3 exceptions have no .message attribute, so print the exception itself
    print(e)
# Console output:
# Current session is a9272550-c4e5-450f-883d-553d337eed48
# No active session with ID a9272550-c4e5-450f-883d-553d337eed48
You may also have to add the argument --no-sandbox
Chrome seems to crash often in Docker containers on certain pages because /dev/shm is too small. In that case, you may have to increase the /dev/shm size.
An example:
sudo mount -t tmpfs -o rw,nosuid,nodev,noexec,relatime,size=512M tmpfs /dev/shm
It also works if you use the -v /dev/shm:/dev/shm option to share the host's /dev/shm.
Another way to make it work is to add the argument --disable-dev-shm-usage to chrome_options. This forces Chrome to use the /tmp directory instead. It may slow down execution, though, since disk is used instead of memory.
chrome_options.add_argument('--disable-dev-shm-usage')
Reference
You can find a couple of detailed discussions in:
selenium.common.exceptions.WebDriverException: Message: invalid session id using Selenium with ChromeDriver and Chrome through Python
unknown error: session deleted because of page crash from unknown error: cannot determine loading status from tab crashed with ChromeDriver Selenium
org.openqa.selenium.SessionNotCreatedException: session not created exception from tab crashed error when executing from Jenkins CI server
Sometimes we just expect any problem to be very complex and are looking for its cause far too deep, when a problem could be so obvious.
I was seeing this issue when I explicitly called browser.close() as an exception handler in my logout() method. It terminated the session and all of the following protractor tests were throwing this error.
Once I removed browser.close() and just threw an error instead, the problem was solved.
Running my Protractor tests remotely (on Jenkins) sometimes leads to a timeout error. The failure is not deterministic.
Starting selenium standalone server...
[launcher] Running 1 instances of WebDriver
[launcher] Process exited with error code 1
/opt/jenkins.dir/workspace/my-jenkins-job/integration-test/ui/node_modules/protractor/node_modules/selenium-webdriver/lib/webdriver/promise.js:1761
throw error;
^
Error: Timed out waiting for the WebDriver server at http://10.97.193.53:4455/wd/hub
at Error (<anonymous>)
at onResponse (/opt/jenkins.dir/workspace/my-jenkins-job/integration-test/ui/node_modules/protractor/node_modules/selenium-webdriver/http/util.js:87:11)
at /opt/jenkins.dir/workspace/my-jenkins-job/integration-test/ui/node_modules/protractor/node_modules/selenium-webdriver/http/util.js:42:21
at /opt/jenkins.dir/workspace/my-jenkins-job/integration-test/ui/node_modules/protractor/node_modules/selenium-webdriver/lib/webdriver/http/http.js:96:5
at ClientRequest.<anonymous> (/opt/jenkins.dir/workspace/my-jenkins-job/integration-test/ui/node_modules/protractor/node_modules/selenium-webdriver/http/index.js:145:7)
at ClientRequest.emit (events.js:95:17)
at Socket.socketErrorListener (http.js:1548:9)
at Socket.emit (events.js:95:17)
at net.js:441:14
at process._tickCallback (node.js:448:13)
However, when I run the tests locally on my Mac there is no problem and the tests run perfectly.
I have tried starting the Selenium servers manually on the remote machines, and I have found that sometimes it works immediately and sometimes I have to wait up to one minute.
My question is: Is there any way to tell protractor to wait longer for the webdriver to connect?
Environment details
Machine: Red Hat 4.4.7-11
Protractor version: 1.8.0
Selenium Server Standalone: 2.45.0
You can specify it by using the driver.wait function.
var webdriver = require('selenium-webdriver');
var protractor = require('protractor');
var driver = new webdriver.Builder().usingServer("seleniumAddress").build();
var browser = protractor.wrapDriver(driver);
browser.driver.wait(driver.getWindowHandle(), 5000, 'Server should start within 5 seconds');
References :
http://angular.github.io/protractor/#/api?view=webdriver.WebDriver.prototype.wait
http://angular.github.io/protractor/#/api?view=webdriver.WebDriver.prototype.getWindowHandle
Yes, and it should solve your issue. Use the seleniumServerStartTimeout option in your protractor.conf.js file to bump up the timeout from the default 30 seconds to something longer like 90 seconds:
exports.config = {
  seleniumServerStartTimeout: 90000
};
I experienced the same issue on a CentOS 7 VM. For whatever reason the selenium server seems to take wildly different amounts of time to start up each time, and can sometimes exceed the default timeout.
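If the server is started as a separate Jenkins step, another option is to poll it until it answers before launching the tests at all. This is a rough sketch in Python rather than Protractor: `wait_until` and `server_is_up` are names invented here, and probing the /wd/hub/status path on the server address from the question is an assumption about your setup.

```python
import time
import urllib.error
import urllib.request


def wait_until(probe, timeout_s, interval_s=1.0, clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or timeout_s elapses; return the final result."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def server_is_up(status_url):
    """True if an HTTP GET on status_url answers with status 200."""
    try:
        with urllib.request.urlopen(status_url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

A pre-test step could then call wait_until(lambda: server_is_up("http://10.97.193.53:4455/wd/hub/status"), timeout_s=90) and fail the build early if it returns False.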