I am trying to use this code that works on my local machine on python anywhere and i want to understand if it is even possible:
from selenium import webdriver
from bs4 import BeautifulSoup
import time
# Initialize webdriver
driver = webdriver.Chrome(executable_path="/Users/matteo/Downloads/chromedriver")
# Navigate to website
driver.get("https://apnews.com/article/prince-harry-book-meghan-royals-4141be64bcd1521d1d5cf0f9b65e20b5")
time.sleep(5)
# Parse page source
soup = BeautifulSoup(driver.page_source, "html.parser")
# Find desired elements using Beautiful Soup
elements = soup.find_all("p")
# Print element text
for element in elements:
print(element.text)
# Close webdriver
driver.quit()
Do i need to have installed chrome to make that work or is chromium enough? Because when i run that code on my local machine a chrome page opens up. How does that work on python anywhere? Would it crush?
I am wondering if the code i am using only works if someone is on a GUI with Chrome installed or if it can work on python anywhere too.
The short answer is no. ChromeDriver must have chrome installed. You can run your tests headless for time save, but chrome still must be installed.
Related
I want to collect the information of a webpage using chromedriver. How do I install it and use it?
You have to install selenium first if you don't have it already. Then to use selenium:
from selenium.webdriver import Chrome
url="URL of the webpage you want to read"
setting up the driver
webdriver = "path of the chromedriver.exe file saved in your pc"
driver.get(url)
using css selector
y = driver.find_element_by_css_selector('css selector of the data you want to read from the webpage').text
print(y)
You don't install the chromedriver - you download the .exe (from here) and use the path to it in webdriver.Chrome(). This getting started page has a comprehensive guide:
from selenium import webdriver
driver = webdriver.Chrome('/path/to/chromedriver') # refers to the path where you saved the exe
driver.get('http://www.google.com/');
time.sleep(5) # Let the user actually see something!
search_box = driver.find_element_by_name('q')
search_box.send_keys('ChromeDriver')
search_box.submit()
time.sleep(5) # Let the user actually see something!
driver.quit()
Note: download the .exe that matches with your version of chrome!
(In Help > About Google Chrome)
As mentioned by #Patha_Mondal, you need to download the driver and select the elements you want to read. However, as your original question asks "How to use selenium in pandas to read a webpage?", I would say instead consider using Scrapy along with Selenium to create a ".csv" file from the Webpage Data.
Read the ".csv" data into pandas using pandas.read_csv() .
The data from the Webpage might not be clean or properly formatted. Using Scrapy to create a dataset out of it would be beneficial for reading it into pandas. Avoid using pandas directly in the same script as Selenium and Scrapy.
Hope it Helped.
First, machine and package specs:
I am running:
ChromeDriver version 75.0.3770.140
Selenium: version '3.141.0'
WSL (linux subsystem) of windows 10
I am trying to run a chromebrowser through selenium. I found: these commands, to use selenium through google chrome.
I have a test directory, with only the chromedriver binary file, and the script, in it. The location of the directory is: /home/kela/test_dir/
I ran the code:
import selenium
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
options = Options()
options.binary_location='/home/kela/test_dir/chromedriver'
driver = webdriver.Chrome(chrome_options = options,executable_path='/home/kela/test_dir/chromedriver')
The output from this code is:
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: No matching capabilities found
Can anyone explain why I need capabilities when the same script works for others without capabilities? I did try adding:
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
but I got the same error. So I'm not sure what capabilities I need to add (considering it works for others without it?)
Edit 1: Addressing DebanjanB's comments below:
Chromedriver is in the expected location. I am using windows 10. From here, the expected location is C:\Program Files (x86)\Google\Chrome\Application\chrome.exe; and this is where it is on my machine (I copied and pasted this location from the chrome Properties table).
ChromeDriver is having executable permission for non-root users.
I definitely have Google Chrome v75.0 installed (I can see that the Product version 75.0.3770.100)
I am running the script as a non-root user, as my bash command line ends with a $ and not # (i.e kela:~/test_dir$ and not kela:~/test_dir#)
Edit 2: Based on DebanjanB's answer below, I am very close to having it working, but just not quite.
The code:
import selenium
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.chrome.options import Options
options = Options()
options.binary_location='/c/Program Files (x86)/Google/Chrome/Application/chrome.exe'
driver = webdriver.Chrome(options=options)
driver.get('http://google.com/')
Produces a dialog box that reads:
Google Chrome cannot read and write to it's data directory: /tmp/.com/google.Chrom.gyw63s
So then I double checked my Chrome permissions and I should be able to write to Chrome:
Also, I can see that /tmp/ has a bunch of .com dirs in it:
.com.google.Chrome.4jnWme/ .com.google.Chrome.FdNyKP/ .com.google.Chrome.VAcWMQ/ .com.google.Chrome.ZbkRx0/ .com.google.Chrome.iRrceF/
.com.google.Chrome.A2QHHB/ .com.google.Chrome.G7Y51c/ .com.google.Chrome.WD8BtK/ .com.google.Chrome.cItmhA/ .com.google.Chrome.pm28hN/
However, since that seemed to be more of a warning than an error, I clicked 'ok' to close the dialog box, and a new tab does open in the browser; but the URL is just 'data:,'. The same thing happens if I remove the line 'driver.get('http://google.com')' from the script, so I know the warning/issue is with the line:
driver = webdriver.Chrome(chrome_options = options,executable_path='/home/kela/test_dir/chromedriver')
For example, from here, I tried adding:
options.add_argument('--profile-directory=Default')
But the same warning pops up.
Edit 3:
As edit 3 was starting to veer into a different question than specifically being addressed here, I started a new question here.
This error message...
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: No matching capabilities found
...implies that the ChromeDriver was unable to initiate/spawn a new WebBrowser i.e. Chrome Browser session.
binary_location
binary_location set/get(s) the location of the Chrome (executable) binary and is defined as:
def binary_location(self, value):
"""
Allows you to set where the chromium binary lives
:Args:
- value: path to the Chromium binary
"""
self._binary_location = value
So as per your code trials, options.binary_location='/home/kela/test_dir/chromedriver' is incorrect.
Solution
If Chrome is installed at the default location, you can safely remove this property. Incase Chrome is installed at a customized location you need to use the options.binary_location property to point to the Chrome installation.
You can find a detailed discussion in Selenium: WebDriverException:Chrome failed to start: crashed as google-chrome is no longer running so ChromeDriver is assuming that Chrome has crashed
Effectively, you code block will be:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.binary_location=r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
driver = webdriver.Chrome(options=options, executable_path='/home/kela/test_dir/chromedriver.exe')
driver.get('http://google.com/')
Additionally, ensure the following:
ChromeDriver is having executable permission for non-root users.
As you are using ChromeDriver v75.0 ensure that you have the recommended version of the Google Chrome v75.0 as:
---------ChromeDriver 75.0.3770.8 (2019-04-29)---------
Supports Chrome version 75
Execute the Selenium Test as non-root user.
So I am creating a simple bot to like people's posts, just for proof of concept and learning. I am using pycharm, along with selenium and geckodriver as my driver. As of now, I am simply trying to get the firefox web browser to open.
I have already added geckodriver to my PATH, and my computer recognizes Geckodriver. I have also tried using chromedriver and have it open chrome. The same thing happens, it simply does not open. Therefore, I feel as though there is a problem in my code or my external libraries, but I just can't seem to find it. Am I supposed to have some external library other than Python 3.7?
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
class InstagramBot:
def __init__(self, username, password):
self.username = username
self.password = password
self.driver = webdriver.Firefox()
def closeBrowser(self):
self.driver.close()
def login(self):
driver = self.driver
driver.get("https://www.instagram.com/")
time.sleep(2)
# "//a[#href'accounts/login']"
# "//input [#name='username']"
# "//input [#name='password']"
ig = InstagramBot("scumbag_scarbs, Sunny9999")
ig.login()
I expected it to open firefox and navigate to instagram.com. I do not recieve an error, but it just shows the program as running and not doing anything.
Please help, I found a bunch of people with problems similar to mine but nothing exactly the same.
I have been searching on the internet using Selenium (Java) interacting with Google Chrome Extension but have not been able to find an answer.
First Question
Is there a way to launch the chrome extension since Selenium only interact with WebView but not on the chrome extensions button in the browser ?
I try this method
"chrome-extension://id/index.html" but the extension did not launch as expected. I like find if there is another way to launch a chrome extension through selenium
Second Question
I am trying to click on the elements in a chrome extension with Selenium webdriver. How do I do it ? I tried the driver.CurrentWindowHandle , but it does not detect the chrome extension.
Thanks
Below is the solution with pyautogui (similar to autoit in java - so you can extend the same solution for java also).
Pre-Condition:
save the extension image in the project folder (I saved it under "autogui_ref_snaps" folder in my example with "capture_full_screenshot.png" name
Python:
Imports needed
from selenium import webdriver
from selenium.webdriver import ChromeOptions
from Common_Methods.GenericMethods import *
import pyautogui #<== need this to click on extension
Script:
options = ChromeOptions()
options.add_argument("--load-extension=" + r"C:\Users\supputuri\AppData\Local\Google\Chrome\User Data\Default\Extensions\fdpohaocaechififmbbbbbknoalclacl\5.1_0") #<== loading unpacked extension
driver = webdriver.Chrome(
executable_path=os.path.join(chrome_options=options)
url = "https://google.com/"
driver.get(url)
# get the extension box
extn = pyautogui.locateOnScreen(os.path.join(GenericMethods.get_full_path_to_folder('autogui_ref_snaps') + "/capture_full_screenshot.png"))
# click on extension
pyautogui.click(x=extn[0],y=extn[1],clicks=1,interval=0.0,button="left")
If you are loading an extension and it's not available in incognito mode then follow my answer in here to enable it.
Try to click on extension with this JsExecutor method:
driver.execute_script("window.postMessage('clicked_browser_action', '*')")
I know you can run Selenium headless alongside WebDriver, but is there a way to do so with the service? I'm trying the following and it just opens the browser normally, seemingly ignoring Xvfb. This is on a Mac if that happens to matter.
from pyvirtualdisplay import Display
import selenium.webdriver as webdriver;
import selenium.webdriver.chrome.service as service
# ...
self.display = Display(visible=0, size=(1024, 768))
self.display.start()
self.service = service.Service('/path/to/chromedriver');
self.service.start();
# Various Chrome option stuff clipped
browser = webdriver.Remote(self.service.service_url, desired_capabilities=options.to_capabilities());
Fyi -- not a real solution, but for the time being, I'm using Chrome's window-size and window-position flags to keep selenium out of the way e.g.,
options = webdriver.ChromeOptions();
options.add_argument('--window-size=100,100')
options.add_argument('--window-position=100,1200')
browser = webdriver.Remote(service_url, desired_capabilities=options.to_capabilities())
Maybe, this is what you want: generalredneck/headless-selenium