I'm working on a bot that scrapes data from sites, but the Cloudflare hCaptcha is getting in the way.
Here is my current code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
# Variables
# Raw string so the backslashes in the Windows path are not treated as escapes
PATH = r"C:\Program Files (x86)\chromedriver.exe"
WEBPAGE = "https://fame.omaze.com/6717638115418?title=Win%20a%20Tesla%20Model%20X®%20Plaid&handle=tesla-model-x-plaid-2022&variant_id=39865096831066&variant_price=$2&combo_campaign_id=6723156017242&combo_title=Win%20%24100%2C000%20to%20Fund%20Your%20Future&combo_variant_id=39883746836570"
driver = webdriver.Chrome(PATH)
driver.get(WEBPAGE)
I made a parser which works fine in PyCharm; I have no trouble with it and the debugger shows nothing either. But when I build an .exe file (I am using PyInstaller to do that), I get some junk output in CMD. The script works fine, but this output looks awful. Maybe someone can tell me where the problem is?
I used this earlier:
from webdriver_manager.chrome import ChromeDriverManager
but now my imports look like this:
import csv
import time
from time import sleep as pause
from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium import webdriver
import urllib.request
So I'm trying to scrape some forecasting data from a website periodically, and ideally I would like for it to happen in the background. I had a look at some documentation and came up with the following code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument("--headless")
driver = webdriver.Chrome(options = options)
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome()
driver.get('https://www.windguru.cz/53')
WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#forecasts-page")))
# Scraping block of code goes here
driver.quit()
I think the following line is overriding the --headless argument, but I'm not sure.
WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#forecasts-page")))
The reason I added it in the first place is that the website I'm scraping isn't just static HTML (have a look for yourself, the link is in the code). I think there's some JS that triggers the forecast data to load, so I need to wait until it is there before the script starts scraping the DOM.
Any idea how I can achieve this and run the browser in headless mode?
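The line that opens the visible Chrome window is this one:
driver = webdriver.Chrome()
It creates a second, non-headless driver that replaces the headless one. You can remove it and the code should still work the same. Note also that the two options.add_argument calls placed after the first webdriver.Chrome(options = options) come too late to affect that driver.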
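A minimal sketch of the corrected setup, assuming Selenium 4 with Chrome and a matching chromedriver available. The names HEADLESS_ARGS and open_forecast_page are hypothetical, not part of Selenium. All arguments are added before the single driver is created, and the explicit wait is kept, since it does not interfere with headless mode:

```python
# Arguments must be set before the driver is created.
HEADLESS_ARGS = ["--headless", "--disable-extensions"]


def open_forecast_page(url="https://www.windguru.cz/53", timeout=5):
    """Open the page headlessly, wait for the forecast container, return the HTML."""
    # Imported here only so this sketch loads on a machine without Selenium;
    # ordinary top-level imports work just as well.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    for argument in HEADLESS_ARGS:
        options.add_argument(argument)

    # Create exactly one driver and pass the options to it.
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Selector taken from the question.
        WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, "#forecasts-page"))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling open_forecast_page() should then return the page source without a browser window appearing; the scraping code can work on that string or be moved inside the try block.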
I'm trying to run a Selenium script with a specific Chrome profile. But when I run the script, it doesn't start with that profile; it starts with a new one. Here's my code:
# import selenium common driver
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
import time
# for specified chrome profile
from selenium.webdriver.chrome.options import Options
# wait page until targeted element loaded
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
# undetectable module
import undetected_chromedriver.v2 as uc # use pip install undetected-chromedriver
if __name__ == '__main__':
    options = uc.ChromeOptions()
    # another way to set profile is the below (which takes precedence if both variants are used)
    options.add_argument(r'--user-data-dir=C:\Users\Fadli\AppData\Local\Google\Chrome\User Data\Profile 4')
    # just some options passing in to skip annoying popups
    options.add_argument(r'--no-first-run --no-service-autorun --password-store=basic')
    driver = uc.Chrome(options=options) # version_main allows to specify your chrome version instead of following chrome global version
    driver.get('https://nowsecure.nl')
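Try the lines below and see if they help. Point --user-data-dir at the User Data folder itself and select the profile with --profile-directory:
# another way to set profile is the below (which takes precedence if both variants are used)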
options.add_argument(r'--user-data-dir=C:\Users\Fadli\AppData\Local\Google\Chrome\User Data')
options.add_argument(r'--profile-directory=Profile 4')
you can check it here for more reference
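As a small illustration of that split, here is a sketch that derives both flags from a full profile path. The helper name profile_arguments is hypothetical, and it assumes the profile folder sits directly inside the User Data folder, as in the question:

```python
from pathlib import PureWindowsPath


def profile_arguments(profile_path):
    """Split a full Chrome profile path into the two command-line flags."""
    path = PureWindowsPath(profile_path)
    return [
        f"--user-data-dir={path.parent}",    # the User Data folder
        f"--profile-directory={path.name}",  # the profile folder name
    ]


args = profile_arguments(
    r"C:\Users\Fadli\AppData\Local\Google\Chrome\User Data\Profile 4"
)
for arg in args:
    print(arg)
```

Each returned string can be passed to options.add_argument in place of the single --user-data-dir argument used in the question.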
When the ChromeDriver version does not match my current Chrome version, I upgrade ChromeDriver with the following code:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(ChromeDriverManager().install())
Then I use Selenium to scrape the website data, but I still get some errors. Can anyone help me with this issue? I'd appreciate it.
import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.headless = True
driver = webdriver.Chrome(options=options)
driver.get("https://www.binance.com/cn/futures/funding-history/0")
time.sleep(5)
The errors occur when the above code is run; they are attached.
You can try chromedriver-autoinstaller, which downloads a chromedriver automatically when one is missing. Install it through pip:
pip install chromedriver-autoinstaller
Usage
Just import chromedriver_autoinstaller in the module where you want to use chromedriver.
Example
from selenium import webdriver
import chromedriver_autoinstaller
chromedriver_autoinstaller.install() # Check if the current version of chromedriver exists
# and if it doesn't exist, download it automatically,
# then add chromedriver to path
driver = webdriver.Chrome()
driver.get("http://www.python.org")
assert "Python" in driver.title
Read more about auto upgrade here
Is there a command to run selenium tests without using a framework? e.g. pytest foo_test.py
What would be required on my local machine in order to run the following test? I am confused, as it appears the only requirement would be chromedriver, but I don't know which command to use to execute the actual test.
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
capa = DesiredCapabilities.CHROME
capa["pageLoadStrategy"] = "none"
driver = webdriver.Chrome(desired_capabilities=capa)
wait = WebDriverWait(driver, 20)
driver.get('http://stackoverflow.com/')
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#h-top-questions')))
driver.execute_script("window.stop();")
Here is the answer to your question:
You asked whether there is a command to run Selenium tests without using a framework; the answer is yes.
In simple words, there are frameworks such as pytest and unittest in Python that structure your test execution and help interpret the test results. Each framework has its own strengths. When the code base becomes bulky, frameworks help us keep it organized. But using a framework is not mandatory.
As for your code, I don't see any significant error in it, but when working with Selenium 3.x.x you need to download chromedriver from here and save it on your machine. When you initialize the WebDriver instance, you need to pass the absolute path of chromedriver, as below.
Here is your own code with some simple tweaks which works well at my end:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
capa = DesiredCapabilities.CHROME
capa["pageLoadStrategy"] = "none"
driver = webdriver.Chrome(desired_capabilities=capa,executable_path="C:\\your_directory\\chromedriver.exe")
wait = WebDriverWait(driver, 20)
driver.get('http://stackoverflow.com/')
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#h-top-questions')))
driver.execute_script("window.stop();")
Let me know if this answers your question.
There are actually two requirements here: Selenium itself, and chromedriver as you mentioned. The file is just a Python file, so you can run it with python foo_test.py. There is also the option to use a framework like unittest, which can be useful for seeing test results.
Selenium itself is not a "testing framework"; it is a library of commands that allow a user to interact with a web browser. Selenium can be used for web scraping or automating tasks as well as for testing.