I am using Selenium WebDriver with Firefox. I am wondering if there is a setting I can change so that it only requests resources from certain domains. (Specifically, I want it to request only content that is on the same domain as the webpage itself.)
My current setup, written in Python, is:
from selenium import webdriver
firefox_profile = webdriver.FirefoxProfile()
## Here, I change various default setting in Firefox, and install a couple of monitoring extensions
driver = webdriver.Firefox(firefox_profile)
driver.get(web_address)
What I want is this: if I specify the web address www.domain.com, then only content served by domain.com should be loaded, and not, e.g., all the tracking content hosted by other domains that would typically be requested. I am hoping this can be achieved by a change to the profile settings in Firefox, or via an extension.
Note: there is a similar question (without an answer), Restricting Selenium/Webdriver/HtmlUnit to a certain domain, but it is four years old, and I think Selenium has evolved a lot since then.
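With thanks to Vicky (whose approach of using proxy settings I followed, although directly from Selenium), the code below changes the proxy settings in Firefox so that it will not connect to any domain other than those on the whitelist. It works by pointing every proxy at an unreachable address (0.0.0.0, port 1) and listing the allowed domains in network.proxy.no_proxies_on, so only those hosts are contacted directly.
I suspect several of the setting changes are unnecessary and can be omitted for most purposes. The code is in Python.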
from selenium import webdriver
firefox_profile = webdriver.FirefoxProfile()
## replace desired_domain.com below with whitelisted domain. Separate domains by comma.
firefox_profile.set_preference("network.proxy.no_proxies_on","localhost,127.0.0.1,desired_domain.com")
firefox_profile.set_preference("network.proxy.backup.ftp","0.0.0.0")
firefox_profile.set_preference("network.proxy.backup.ftp_port",1)
firefox_profile.set_preference("network.proxy.backup.socks","0.0.0.0")
firefox_profile.set_preference("network.proxy.backup.socks_port",1)
firefox_profile.set_preference("network.proxy.backup.ssl","0.0.0.0")
firefox_profile.set_preference("network.proxy.backup.ssl_port",1)
firefox_profile.set_preference("network.proxy.ftp","0.0.0.0")
firefox_profile.set_preference("network.proxy.ftp_port",1)
firefox_profile.set_preference("network.proxy.http","0.0.0.0")
firefox_profile.set_preference("network.proxy.http_port",1)
firefox_profile.set_preference("network.proxy.socks","0.0.0.0")
firefox_profile.set_preference("network.proxy.socks_port",1)
firefox_profile.set_preference("network.proxy.ssl","0.0.0.0")
firefox_profile.set_preference("network.proxy.ssl_port",1)
firefox_profile.set_preference("network.proxy.type",1)
firefox_profile.set_preference("network.proxy.share_proxy_settings",True)
driver = webdriver.Firefox(firefox_profile)
driver.get(web_address_desired)
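The preferences above all follow one pattern, so they can be generated instead of listed by hand. A minimal sketch, assuming the same dead-end proxy address (0.0.0.0, port 1) as above; the helper name `blackhole_proxy_prefs` is made up for illustration, and it simply reproduces the same set of preferences without checking which of them are actually needed:

```python
def blackhole_proxy_prefs(allowed_domains, dead_host="0.0.0.0", dead_port=1):
    """Build Firefox prefs that send every request to a dead proxy,
    except for hosts listed in allowed_domains (hypothetical helper)."""
    bypass = ["localhost", "127.0.0.1", *allowed_domains]
    prefs = {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.share_proxy_settings": True,
        # Hosts in this comma-separated list are contacted directly.
        "network.proxy.no_proxies_on": ",".join(bypass),
    }
    # Main proxy entries, one host/port pair per scheme.
    for scheme in ("http", "ssl", "ftp", "socks"):
        prefs[f"network.proxy.{scheme}"] = dead_host
        prefs[f"network.proxy.{scheme}_port"] = dead_port
    # Backup entries, mirroring the ones set in the code above.
    for scheme in ("ssl", "ftp", "socks"):
        prefs[f"network.proxy.backup.{scheme}"] = dead_host
        prefs[f"network.proxy.backup.{scheme}_port"] = dead_port
    return prefs


prefs = blackhole_proxy_prefs(["desired_domain.com"])
# Applying them works the same way as above:
# for name, value in prefs.items():
#     firefox_profile.set_preference(name, value)
```

Several whitelisted domains can be passed in the list; they end up comma-separated in network.proxy.no_proxies_on.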
I think this is still impossible in Selenium itself, but you can achieve it by using a proxy such as BrowserMob. WebDriver integrates well with BrowserMob Proxy.
Sample pseudocode in Java:
// LittleProxy-powered 2.1.0 release
BrowserMobProxy server = new BrowserMobProxyServer();
server.start(0);
// Blacklist websites: matching requests are answered with HTTP 410
server.blacklistRequests("https?://.*\\.blocksite\\.com/.*", 410);
// Get the Selenium proxy object
Proxy proxy = ClientUtil.createSeleniumProxy(server);
// Configure it as a desired capability
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(CapabilityType.PROXY, proxy);
// Initialize the driver with the capabilities
WebDriver driver = new FirefoxDriver(capabilities);
Hope this helps. Let me know if you need any further help.
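The blacklist pattern above is an ordinary regular expression, so the opposite case from the question (allow one domain, block the rest) comes down to a pattern that matches only that domain and its subdomains. A rough Python sketch for trying such a pattern out before handing it to a proxy; `same_domain_pattern` is a made-up helper, not part of BrowserMob or Selenium, and how the pattern is then registered with the proxy (BrowserMob's documentation describes a whitelist feature) is left to check there:

```python
import re


def same_domain_pattern(domain):
    """Return a regex matching http(s) URLs on `domain` or any subdomain of it."""
    # (?:[\w-]+\.)* matches zero or more subdomain labels such as "www." or "cdn.img."
    # (?::\d+)? allows an explicit port; (?:[/?#].*)? allows a path, query or fragment.
    return r"https?://(?:[\w-]+\.)*" + re.escape(domain) + r"(?::\d+)?(?:[/?#].*)?"


pattern = re.compile(same_domain_pattern("domain.com"))

print(bool(pattern.fullmatch("https://www.domain.com/index.html")))     # True
print(bool(pattern.fullmatch("http://tracker.example.net/pixel.gif")))  # False
print(bool(pattern.fullmatch("https://notdomain.com/")))                # False
```

The last case is the one to watch: a naive `.*domain\.com` pattern would also let notdomain.com through.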
I want to connect to a site with WebDriver, but the Cloudflare challenge (not hCaptcha) detects Selenium as a bot and doesn't let me through.
I have used these flags, and many similar ones, in my code, but I have not been able to get past it yet.
ChromeOptions options=new ChromeOptions();
options.setExperimentalOption("excludeSwitches", Collections.singletonList("enable-automation"));
options.setExperimentalOption("useAutomationExtension", false);
options.addArguments("--disable-blink-features");
options.addArguments("--disable-blink-features=AutomationControlled");
System.setProperty("webdriver.chrome.driver", "drivers/chromedriver.exe");
driver = new ChromeDriver(options);
My Chrome version is 104.0.5112.81 and my ChromeDriver version is 104.0.5112.79.
How can I bypass Cloudflare?
To bypass Cloudflare you need a high (green) score at https://antcpt.com/score_detector/. That test is for reCAPTCHA, but I think it is relevant for Cloudflare too.
Here are some things other than flags that you may want to try:
1. Do not use a VPN or Tor. A paid VPN can be fine, but with Tor the last node is always public (I am not sure about the details, but you can't bypass Cloudflare if you use Tor).
2. I don't see in your code whether you are changing the user agent. I used selenium_stealth in Python to change the user agent, renderer and so on:
from selenium_stealth import stealth

# driver is an already created Chrome WebDriver instance
stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )
Here is another page to test your driver on: https://intoli.com/blog/making-chrome-headless-undetectable/chrome-headless-test.html (there was one with more features, but I don't remember the link).
3. You would probably need to use an existing profile so that you don't look like a bot. Your current one, with a lot of cookies and other data, would be good (I am not sure this actually works, but in practice it seemed to help me). Here is a link on how to load one: How to load default profile in Chrome using Python Selenium Webdriver?
4. Remove the $cdc_ marker from chromedriver.exe.
5. Probably check this too: Can a website detect when you are using Selenium with chromedriver?
Also note that bypassing Cloudflare too often will worsen your score if the website detects bot behaviour.
I am pretty new to using Selenium and its webdrivers. I need to enable DoH (DNS over HTTPS), together with an option for selecting which DoH server to connect to, in ChromeDriver in Selenium.
I have been researching online and have gone through the recommended switches available here: https://peter.sh/experiments/chromium-command-line-switches/
I have also seen a similar post, How to disable dns over https in selenium, about disabling DoH (I don't even have DoH enabled by default in ChromeDriver in the first place), but I haven't yet figured out how to enable it in headless mode.
I also looked at the switches available for the Firefox driver, but I don't see any readily available switch for this there either.
Any help would be appreciated.
Thanks!
fbw
To enable DoH you need to do the following:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
local_state = {
"dns_over_https.mode": "automatic",
"dns_over_https.templates": "",
}
options = Options()
options.add_experimental_option('localState', local_state)
driver = webdriver.Chrome(options=options)
This turns DoH on; the setting appears under "Use secure DNS" on the chrome://settings/security page.
You can also set "dns_over_https.mode": "secure", which selects the secure option of the DoH configuration.
Unfortunately, I failed to figure out how to use "dns_over_https.templates": "". The documentation says this about it:
String containing a space-separated list of DNS over HTTPS templates
to use in secure mode or automatic mode. If no templates are specified
in automatic mode, we will attempt discovery of DoH servers associated
with the configured insecure resolvers.
I'm not familiar with DoH, so this description tells me nothing. I don't know what a DoH template is. I hope you know what they are talking about.
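For what it is worth, a DoH template appears to be the URI template of the resolver endpoint, for example `https://dns.google/dns-query{?dns}` for Google Public DNS. A minimal sketch under that assumption, reusing the localState keys from above; the helper name `doh_local_state` is made up, and whether Chrome actually picks the template up when it is passed this way is untested:

```python
def doh_local_state(mode="secure", templates=()):
    """Build a localState dict for Chrome's DNS-over-HTTPS settings."""
    if mode not in ("off", "automatic", "secure"):
        raise ValueError(f"unknown DoH mode: {mode!r}")
    return {
        "dns_over_https.mode": mode,
        # The documentation quoted above asks for a space-separated list.
        "dns_over_https.templates": " ".join(templates),
    }


local_state = doh_local_state("secure", ["https://dns.google/dns-query{?dns}"])
# Passed to ChromeDriver the same way as above:
# options.add_experimental_option("localState", local_state)
```

With mode "secure" and a single template, lookups should go only to that server; with "automatic" and no templates, the quoted documentation says Chrome tries to discover a DoH server for the configured resolver.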
Some JS plugins can indeed do this. How?
# host (str) and port (int) are filled in by the f-string
js_code = f"""
var prefs = Components.classes["@mozilla.org/preferences-service;1"].getService(Components.interfaces.nsIPrefBranch);
prefs.setIntPref("network.proxy.type", 1);
prefs.setCharPref("network.proxy.http", "{host}");
prefs.setIntPref("network.proxy.http_port", {port});
prefs.setCharPref("network.proxy.ssl", "{host}");
prefs.setIntPref("network.proxy.ssl_port", {port});
prefs.setCharPref("network.proxy.ftp", "{host}");
prefs.setIntPref("network.proxy.ftp_port", {port});
"""
It has no effect.
I have a more hardcore approach, provided you don't use headless mode and don't care about speed. You can load a fully fledged proxy plug-in, such as SwitchyOmega, before instantiating WebDriver. When SwitchyOmega is first started, it opens a second window, which is its configuration page, where you can set the proxy. Have Selenium get the URL of that configuration page and close the tab; the plug-in itself has to be launched manually, because Selenium doesn't control it. Later, when you want to change the proxy, use Selenium to execute JS that opens a new tab, go to the URL of the configuration page, and change the proxy IP and port number by driving the page elements with Selenium. Hahaha, very hardcore!
I've been trying to use Robot Framework to write some cross browser tests.
One of the requirements is that I need to use a proxy to access the website I am testing. Right now, I am trying to launch the Safari browser and get it to go through the proxy to reach the website, but I seem to have an issue.
Here is the Robot Framework keyword:
# ${MY_PROXY} is a variable located elsewhere in the file
Open Safari
    ${desired_capabilities} =    Evaluate    selenium.webdriver.DesiredCapabilities.SAFARI
    ...    sys, selenium.webdriver
    ${safari_proxy} =    Create Dictionary    proxyType    MANUAL    httpProxy    ${MY_PROXY}
    ...    sslProxy    ${MY_PROXY}
    Set To Dictionary    ${desired_capabilities}    proxy    ${safari_proxy}
    Create Webdriver    Safari    desired_capabilities=${desired_capabilities}
So far, I've been receiving this error:
SessionNotCreatedException: Message: Capability 'proxy' could not be honored.
Currently using robotframework-seleniumlibrary 4.5.0 with selenium 3.141.0
Does the Safari WebDriver allow proxies? I can't seem to find much on this topic.
Please refer to the attached screenshot. I need to save the data from the 'Save all as HAR' option automatically.
I doubt that selenium provides a way to interact with the inspector. You might have to do something like inject Javascript into the page that re-submits the request and then saves the response/request data on the page in a hidden element. You could then simply get the text from that hidden element via selenium.
Hope that helps
No, Selenium is not the tool you want to use to do this. JMeter may be useful, or this site suggests using a proxy server to do this for you.
Selenium does not give you the ability to track network traffic. It does however allow you to configure a proxy that the browser will use when it is started up.
So, to get around your problem, you will need to use something like BrowserMobProxy:
BrowserMobProxy browserMobProxy = new BrowserMobProxyServer();
browserMobProxy.start();
Proxy proxy = ClientUtil.createSeleniumProxy(browserMobProxy, InetAddress.getLoopbackAddress());
DesiredCapabilities capabilities = DesiredCapabilities.firefox();
capabilities.setCapability(CapabilityType.PROXY, proxy);
WebDriver driver = new FirefoxDriver(capabilities);
You can then perform some actions with Selenium. As far as I know, recording has to be started with browserMobProxy.newHar() before those actions. When you want to collect the HAR file, call:
browserMobProxy.getHar();
For more information about the BrowserMobProxy have a look at the BrowserMobProxy Homepage
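A captured HAR follows the standard HAR layout (a top-level log object holding a list of entries, each with a request and a response), so once it has been written out as JSON it can be inspected with a few lines of Python. A small sketch; the function names and the sample data are invented for illustration and are not tied to BrowserMob:

```python
import json


def requested_urls(har):
    """Return the URL of every request recorded in a HAR dictionary."""
    return [entry["request"]["url"] for entry in har["log"]["entries"]]


def save_har(har, path):
    """Write a HAR dictionary to disk as JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(har, fh, indent=2)


# Tiny hand-written HAR fragment standing in for real proxy output.
sample_har = {
    "log": {
        "version": "1.2",
        "entries": [
            {"request": {"method": "GET", "url": "https://example.com/"}},
            {"request": {"method": "GET", "url": "https://example.com/app.js"}},
        ],
    }
}

print(requested_urls(sample_har))
```

A file written by save_har can be loaded back with json.load and passed to requested_urls in the same way.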