ChromeDriver (used through WebDriver) has a built-in marker that lets a web page's JavaScript detect that you are scraping its HTML/JS automatically.
I tried to disable that unique string by replacing it inside chromedriver.exe.
When I do that, the scraper breaks. What I changed in ChromeDriver is the value of var key=
'$cdc_asdjflasutopfhvcZLmcfl_', which I swapped for an equally long string of letters and digits. I don't know enough about hex editing to say what all of this is, but this is what an article on JavaScript detection says to do so that ChromeDriver doesn't get stopped when you scrape.
driver = webdriver.Chrome(options=options, executable_path=PATH);
{even though I include the path defined and it worked fine a second ago!}
super(WebDriver, self).__init__(DesiredCapabilities.CHROME['browserName'], "goog",
RemoteWebDriver.__init__(
self.start_session(capabilities, browser_profile)
response = self.execute(Command.NEW_SESSION, parameters)
self.error_handler.check_response(response)
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created
from unknown error: Runtime.evaluate threw exception: SyntaxError: Invalid or unexpected token
(Session info: chrome=106.0.5249.61)
Please help! I am scraping a site my company pays for
That's a tough one. I would suggest reinstalling Python and Selenium. Did you pick up any nasty viruses?
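One thing worth ruling out, assuming the replacement string was typed by hand: the SyntaxError in the traceback is raised by JavaScript that ChromeDriver evaluates in the page, so a replacement that changes the length of the key or contains characters that are unsafe inside that script would plausibly break it. The sketch below only validates a candidate string. The function name and the allowed character set are my own conservative assumptions, not anything ChromeDriver documents.

```python
import re

ORIGINAL_KEY = "$cdc_asdjflasutopfhvcZLmcfl_"  # the key quoted in the question

# Assumption: letters, digits, "_" and "$" are safe inside a JavaScript
# identifier or string literal; everything else is rejected to be cautious.
_SAFE_CHARS = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def is_safe_replacement(candidate, original=ORIGINAL_KEY):
    """Return True if candidate has the same byte length as original
    and contains only conservative identifier characters."""
    same_length = len(candidate.encode("utf-8")) == len(original.encode("utf-8"))
    return same_length and _SAFE_CHARS.fullmatch(candidate) is not None
```

If the check fails for the string that was written into the binary, restoring the original chromedriver.exe and redoing the replacement with a string that passes is a cheap first experiment.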
I'm writing some code using Selenium, and at one point I make 7 requests, all to different websites. For the first one, this works fine. However, for others, I get a session ID error. I think that my browser is configured correctly, as I do get results from the first website. I have tried to put a WebDriverWait in between the requests, but to no avail. I think the websites might be blocking my requests. Does anyone have any idea how to solve this problem?
I'm sorry if this is something stupid or if I'm doing anything wrong, I'm quite new ^^
Thanks in advance!
Traceback (most recent call last):
File "/home/cena/PycharmProjects/Frikandelbroodje/main.py", line 56, in <module>
dirk_price = get_price(dirk_url, dirk_classname)
File "/home/cena/PycharmProjects/Frikandelbroodje/main.py", line 44, in get_price
browser.get(url)
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
self.execute(Command.GET, {'url': url})
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: invalid session id
(Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729#{#29}),platform=Linux 4.15.0-50-generic x86_64)
invalid session id
The invalid session ID error is a WebDriver error that occurs when the server does not recognize the unique session identifier. This happens if the session has been deleted or if the session ID is invalid.
A WebDriver session can be deleted in either of the following ways:
Explicit session deletion: A WebDriver session is deleted explicitly when you invoke the quit() method, as follows:
Code Block:
from selenium import webdriver
from selenium.common.exceptions import InvalidSessionIdException
driver = webdriver.Chrome(executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
print("Current session is {}".format(driver.session_id))
# quit() ends the session and shuts the driver down
driver.quit()
try:
    driver.get("https://www.google.com/")
except Exception as e:
    # Python 3 exceptions have no .message attribute, so print the exception itself
    print(e)
Console Output:
Current session is a9272550-c4e5-450f-883d-553d337eed48
No active session with ID a9272550-c4e5-450f-883d-553d337eed48
Implicit session deletion: A WebDriver session is deleted implicitly when you close the last window or tab by invoking the close() method, as follows:
Code Block:
driver = webdriver.Chrome(executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
print("Current session is {}".format(driver.session_id))
# closes current window/tab
driver.close()
try:
    driver.get("https://www.google.com/")
except Exception as e:
    # Python 3 exceptions have no .message attribute, so print the exception itself
    print(e)
Console Output:
Current session is a9272550-c4e5-450f-883d-553d337eed48
No active session with ID a9272550-c4e5-450f-883d-553d337eed48
Conclusion
Since the first request works fine but the others fail with a session ID error, the WebDriver-controlled browser is most likely being detected and the subsequent requests blocked.
There are several reasons why a WebDriver-controlled browser gets detected and blocked. You can find a couple of detailed discussions in:
How does recaptcha 3 know I'm using selenium/chromedriver?
Selenium and non-headless browser keeps asking for Captcha
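Whatever caused the session to disappear, a deleted session cannot be revived, so the only recovery is a new driver. The helper below is a minimal sketch of that idea, not part of Selenium: the name get_with_fresh_session and its parameters are hypothetical, and the exception types are passed in so the function itself does not depend on a particular Selenium version.

```python
def get_with_fresh_session(driver, url, make_driver, session_errors):
    """Load url; if the session is gone, build a new driver and retry once.

    Returns the driver that ended up loading the page, so the caller can
    keep using it for the next request.
    """
    try:
        driver.get(url)
        return driver
    except session_errors:
        # The old session cannot be revived; start a new one instead.
        new_driver = make_driver()
        new_driver.get(url)
        return new_driver


# Usage with Selenium (assumed setup, not executed here):
#   from selenium import webdriver
#   from selenium.common.exceptions import InvalidSessionIdException
#   browser = get_with_fresh_session(
#       browser, url, webdriver.Chrome, (InvalidSessionIdException,))
```

Older Selenium releases raise the generic WebDriverException with the message "invalid session id", as in the traceback in the question, so the tuple may need to include that class as well.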
I got this error message because I was running Selenium in Docker and the container didn't have enough shared memory (/dev/shm), so the browser would crash after just a few pages.
To fix this, I used the same docker command but added -v /dev/shm:/dev/shm after docker run.
If you had this
docker run -d -p 5901:5900 -p 127.0.0.1:4445:4444 selenium/standalone-chrome
then change to this
docker run -v /dev/shm:/dev/shm -d -p 5901:5900 -p 127.0.0.1:4445:4444 selenium/standalone-chrome
I found this info here, and here.
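To confirm that shared memory is the problem before changing the docker command, the size of /dev/shm can be checked from inside the container. This is a rough sketch: the 1 GiB threshold and both function names are assumptions of mine, while --disable-dev-shm-usage is an existing Chrome switch that makes Chrome use a temporary directory instead of /dev/shm.

```python
import shutil

MIN_SHM_BYTES = 1 * 1024**3  # assumed threshold (1 GiB); tune for your pages


def shm_total_bytes(path="/dev/shm"):
    """Total size of the shared-memory mount, in bytes (Linux only)."""
    return shutil.disk_usage(path).total


def extra_chrome_args(shm_bytes, min_bytes=MIN_SHM_BYTES):
    """Suggest Chrome arguments when shared memory looks too small."""
    if shm_bytes < min_bytes:
        # Tell Chrome not to rely on /dev/shm for shared memory files.
        return ["--disable-dev-shm-usage"]
    return []


# Usage (assumed Selenium setup, not executed here):
#   options = webdriver.ChromeOptions()
#   for arg in extra_chrome_args(shm_total_bytes()):
#       options.add_argument(arg)
```

Docker gives containers 64 MB of /dev/shm by default, which is why mounting the host's /dev/shm as shown above makes the crashes go away.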
A browser page crash may lead to InvalidSessionIdException. Selenium reports it as: session deleted because of page crash. Check whether your browser page still exists when you get these errors.
Here is an example traceback for this case:
[2021-06-28 15:05:43,787: ERROR/ForkPoolWorker-2] Message: invalid session id
Traceback (most recent call last):
...
File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
self.execute(Command.GET, {'url': url})
File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: session deleted because of page crash
from tab crashed
(Session info: chrome=83.0.4103.61)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 580, in find_elements_by_class_name
return self.find_elements(by=By.CLASS_NAME, value=name)
File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 1007, in find_elements
'value': value})['value'] or []
File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidSessionIdException: Message: invalid session id
If you want some technical details, take a look at the Chromium sources, where you can find the string session deleted because of page crash.
I had this problem, and the reason was that I wrote the URL in the wrong format. This is correct:
self.driver.get('https://twitter.com')
But I had written it this way:
self.driver.get('twitter.com')
Maybe you have the same issue. If not, check all your links and make sure they all follow the correct format.
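If the URLs come from a list or a config file, they can be normalized before being passed to driver.get(). A minimal sketch, assuming that a missing scheme should default to https (ensure_scheme is a hypothetical helper, not a Selenium function):

```python
import re

# A URL "has a scheme" here if it starts with something like "http://".
_HAS_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def ensure_scheme(url, default_scheme="https"):
    """Prefix url with default_scheme:// unless it already has a scheme."""
    url = url.strip()
    if _HAS_SCHEME.match(url):
        return url
    return f"{default_scheme}://{url}"


print(ensure_scheme("twitter.com"))          # https://twitter.com
print(ensure_scheme("https://twitter.com"))  # https://twitter.com
```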
In my case, the issue was that I called driver.close() and then tried to access the current_url property of the driver, which was already closed.
This is the faulty code that leads to the error message:
url = 'http://localhost:5000/traning'
webpage = driver.get(url)
time.sleep(2)
driver.close()
return driver.current_url
This returns the error:
selenium.common.exceptions.InvalidSessionIdException: Message: invalid session id
The solution is to save the current URL and any other data in variables before closing the driver, and then return that data:
driver = setDriver()
url = 'http://localhost:5000/traning'
webpage = driver.get(url)
current_url = driver.current_url
time.sleep(2)
driver.close()
return current_url
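The same ordering can be enforced with try/finally, so the value is always read before the driver goes away, even if loading the page raises. This is only a sketch: final_url_then_quit is a hypothetical name, and it uses quit() rather than close() on the assumption that the driver is not needed afterwards.

```python
def final_url_then_quit(driver, url):
    """Load url, return the URL the browser ended up on, then quit."""
    try:
        driver.get(url)
        # The return value is computed here, while the session is still alive.
        return driver.current_url
    finally:
        # Runs after the value above was read, and also on errors.
        driver.quit()
```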
Try this:
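driver.get('url')
# some code
current_page = driver.current_url
driver.close()
driver.quit()
time.sleep(10)
# quit() ended the old session, so a new driver is needed before the next get()
driver = webdriver.Chrome()
driver.get(current_page)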
After I followed driver.close() with driver.quit(), it solved the same issue that you have.
Occasionally, when G1ANT attempts to open a program (Google Chrome), it gives an "element not visible" error. It does not happen often; in fact, it is very rare.
When it does happen, it is at the start of the script, on the line below.
The URL is a standard HTTP URL:
selenium.open chrome url ♥Url
It seems as though it is not recognizing Chrome at that moment. The error message is:
element not visible (Session info: chrome=78.0.3904.97)
(Driver info: chromedriver=2.38.552522 (437e6fbedfa8762dec75e2c5b3ddb86763dc9dcb),
platform=Windows NT 10.0.14393 x86_64)
Do you know what causes this and is there something I can do to stop it happening?
Here's how you can work around this issue:
♥elementNotVisibleCount = 0
label elementNotVisible
♥elementNotVisibleCount = ♥elementNotVisibleCount + 1
if ⊂♥elementNotVisibleCount>=5⊃
selenium.open chrome url ♥url
end if
selenium.open chrome url ♥url errorjump elementNotVisible if ⊂♥elementNotVisibleCount<4⊃
If an exception occurs, the robot jumps to the elementNotVisible label and tries again, up to 4 times. It then makes one last attempt, and if that fails too, it finally throws the exception.
Hope it helps.
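For readers driving Selenium from Python rather than G1ANT, the same bounded-retry idea can be sketched as follows. The retry function, its parameters, and the default of 5 attempts are illustrative assumptions, not part of G1ANT or Selenium.

```python
import time


def retry(action, attempts=5, delay=0.0, retry_on=(Exception,)):
    """Call action() until it succeeds, at most `attempts` times.

    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay)  # brief pause before the next try


# Usage (assumed Selenium setup, not executed here):
#   retry(lambda: driver.get(url), attempts=5, delay=1.0)
```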
version:
firefox : Mozilla Firefox 61.0
geckodriver : geckodriver v0.20.1
I only tried the code below:
from selenium import webdriver
browser = webdriver.Firefox()
But I get the error below:
Traceback (most recent call last):
File "my.py", line 3, in <module>
browser = webdriver.Firefox()
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
keep_alive=True)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
self.start_session(capabilities, browser_profile)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 245, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 314, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status: 1
And geckodriver.log:
1528101123327 geckodriver INFO geckodriver 0.20.1
1528101123336 geckodriver INFO Listening on 127.0.0.1:43481
1528101124336 mozrunner::runner INFO Running command: "/usr/bin/firefox" "-marionette" "-profile" "/tmp/rust_mozprofile.y93GPXwtXuKC"
Running Firefox as root in a regular user's session is not supported. ($XAUTHORITY is /home/username/.Xauthority which is owned by username.)
It is only a problem under the root account. Please help.
This error message...
Running Firefox as root in a regular user's session is not supported. ($XAUTHORITY is /home/keti/.Xauthority which is owned by keti.)
...implies that you were either trying to invoke Firefox as the root user or running Firefox as root in a non-root user's session.
According to "User's Firefox process runs as root (if root is running Firefox)", neither case is supported, and both should have been relatively difficult to achieve. Technically it was still possible (the --new-instance and --no-remote flags are available to control remoting), but X11's permissive security model meant that a user basically had to treat the user account as if it had passwordless sudo.
There were a couple of associated issues:
If a user runs Firefox as root but using their own home directory, many things become broken for that user, sometimes permanently.
When Firefox is running as root, other users on the same display can gain root privileges.
With the GA (General Availability) of Firefox v60.0, the Mozilla team decided to disallow running Firefox with sudo, as per:
Use clone() instead of fork() for sandboxed Linux processes and remove SandboxEarlyInit etc.
Running sudo firefox, which previously seemed to work but was unsupported, now fails to load content (tab crash on any page) on most Linux distributions; it fails to start and prints this message:
Running Firefox as root in a regular user's session is not supported. ($XAUTHORITY is /home/username/.Xauthority which is owned by username.)
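A script can check for this situation up front and fail with a readable message instead of a geckodriver crash. The sketch below is deliberately conservative: it refuses whenever the effective user is root, even though the message above only concerns root inside a regular user's session. Both function names are hypothetical, and os.geteuid() is available on Unix only.

```python
import os


def is_root(euid=None):
    """True when the effective user id is 0 (root). Unix-only."""
    if euid is None:
        euid = os.geteuid()
    return euid == 0


def check_can_launch_firefox(euid=None):
    """Raise early with a clear message when running as root."""
    if is_root(euid):
        raise RuntimeError(
            "Refusing to start Firefox as root; run the script as a regular user."
        )


# Usage (assumed Selenium setup, not executed here):
#   check_can_launch_firefox()
#   browser = webdriver.Firefox()
```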
OS: OSX 10.12.2
Selenium Version: 2.52.0 Scrapy
Browser: Chrome
Browser Version: 55.0.2883.95 (64-bit)
Hi,
I'm trying to use Selenium in my project, but I'm getting a "no such session" error when I use it with the latest ChromeDriver. You can find the error below.
Traceback (most recent call last):
File "/Users/user/Library/Python/2.7/lib/python/site-packages/twisted/internet/defer.py", line 651, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/user/Downloads/Test-2/ecommerce_bot/ecommerce_bot/spiders/hepsiburada.py", line 67, in parseProductComments
self.browser.get(response.url)
File "/Users/user/Library/Python/2.7/lib/python/site-packages/selenium/webdriver/remote/webdriver.py", line 248, in get
self.execute(Command.GET, {'url': url})
File "/Users/user/Library/Python/2.7/lib/python/site-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
self.error_handler.check_response(response)
File "/Users/user/Library/Python/2.7/lib/python/site-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
raise exception_class(message, screen, stacktrace)
WebDriverException: Message: no such session
(Driver info: chromedriver=2.27.440174 (e97a722caafc2d3a8b807ee115bfb307f7d2cfd9),platform=Mac OS X 10.12.2 x86_64)
This is my code:
self.browser.get("url")
xpath = self.browser.find_element_by_xpath("/html/head/script[17]")
And the browser setup:
def __init__(self):
    super(HepsiburadaSpider, self).__init__()
    chromedriver = "/Users/user/Downloads/chromedriver"
    os.environ["webdriver.chrome.driver"] = chromedriver
    self.browser = webdriver.Chrome(chromedriver)
This was happening to me. I found that downgrading my local version of Chrome to 53.0.2785.116 made testing with Protractor work again. This is independent of the version specified in standalone.
It is not a great solution (it changes your local browser and will wipe your browser history), but until the bug detailed below is addressed, it is the one that allows local testing with Protractor and Chrome.
http://www.slimjet.com/chrome/google-chrome-old-version.php
Apparently there is a known bug in webdriver-manager that does not allow it to be updated to 2.24:
https://github.com/angular/webdriver-manager/issues/93
Protractor itself has a config file, so make sure the version you are using has chromedriver set to 2.23 at the very highest. I am using Protractor 3.1.0, which gives me:
"webdriverVersions": {
"selenium": "2.53.1",
"chromedriver": "2.23",
"iedriver": "2.51.0"
}
Because of this bug, your local Chrome version will be too far ahead of what webdriver-manager can support or expects (in this case, when creating a session).
If you downgrade the browser to an older version of Chrome, you will also need to prevent Google updates, since Chrome will attempt to move to the most current version every time the browser is reopened.
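To see which Chrome and ChromeDriver versions were actually involved in a failed session, the Session info and Driver info lines that ChromeDriver appends to its error messages can be parsed. A minimal sketch in Python (parse_versions is a hypothetical helper, and the regular expressions only assume the format seen in the tracebacks above):

```python
import re

_SESSION_RE = re.compile(r"Session info: chrome=([\d.]+)")
_DRIVER_RE = re.compile(r"Driver info: chromedriver=([\d.]+)")


def parse_versions(message):
    """Return (chrome_version, chromedriver_version); missing parts are None."""
    chrome = _SESSION_RE.search(message)
    driver = _DRIVER_RE.search(message)
    return (
        chrome.group(1) if chrome else None,
        driver.group(1) if driver else None,
    )


# Usage (assumed, with a caught Selenium exception `e`):
#   chrome_version, driver_version = parse_versions(str(e))
```

Logging both values next to the error makes it easier to tell a genuine version mismatch from other causes of "no such session".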