How to write a custom Downloader Middleware for Selenium and Scrapy? - selenium

I am having an issue communicating between Selenium and Scrapy objects.
I am using Selenium to log in to a site; once I get that response I want to use Scrapy's functionality to parse and process it. Can someone please help me write a middleware so that every request goes through the Selenium webdriver and the response is passed back to Scrapy?
Thank you!

It's pretty straightforward: create a middleware that holds a webdriver and use process_request to intercept the request, discard it, and pass its url to your Selenium webdriver:
from scrapy.http import HtmlResponse
from selenium import webdriver

class DownloaderMiddleware(object):
    def __init__(self):
        self.driver = webdriver.Chrome()  # your chosen driver

    def process_request(self, request, spider):
        # only process tagged requests; remove this check to handle every request
        if not request.meta.get('selenium'):
            return
        self.driver.get(request.url)
        body = self.driver.page_source
        return HtmlResponse(url=self.driver.current_url, body=body, encoding='utf-8')
The downside of this is that you have to get rid of the concurrency in your spider, since a Selenium webdriver can only handle one url at a time. For that, see the settings documentation page.
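A minimal sketch of the relevant settings, assuming the middleware lives at a hypothetical module path `myproject.middlewares.DownloaderMiddleware` (adjust the path and priority to your project):

```python
# settings.py (sketch): a single webdriver serves one request at a time,
# so drop Scrapy's concurrency to 1
CONCURRENT_REQUESTS = 1

# hypothetical module path -- point this at wherever your middleware lives
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.DownloaderMiddleware': 543,
}
```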

Related

How to continue request session with cookies in Selenium?

Is it possible to continue a request session in Selenium with all its cookies? I have seen many people doing it the other way around, but I have no clue how to do it properly the way I want.
def open_selenium_session(self):
    # get or set cookies
    driver = get_chromedriver(self.proxy, use_proxy=True, user_agent=True)
    driver.get("https://www.instagram.com")
    cookies = driver.get_cookies()
    for cookie in cookies:
        self.session.cookies.set(cookie['name'], cookie['value'])
In case you have previously stored the cookies from an active session using pickle:
import pickle
pickle.dump(driver.get_cookies(), open("cookies.pkl", "wb"))
You can always set them back as follows:
# loading the stored cookies
cookies = pickle.load(open("cookies.pkl", "rb"))
for cookie in cookies:
    # adding the cookies to the session through the webdriver instance
    driver.add_cookie(cookie)
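The save/restore round trip can be sketched without a live browser: driver.get_cookies() returns a plain list of dicts, which pickles cleanly. The cookie names and values below are made up for illustration:

```python
import pickle

# stand-in for what driver.get_cookies() would return (illustrative values)
cookies = [
    {'name': 'sessionid', 'value': 'abc123', 'domain': '.instagram.com'},
    {'name': 'csrftoken', 'value': 'xyz789', 'domain': '.instagram.com'},
]

# persist between runs
with open("cookies.pkl", "wb") as f:
    pickle.dump(cookies, f)

# restore later; each dict could then be passed to driver.add_cookie(cookie)
with open("cookies.pkl", "rb") as f:
    restored = pickle.load(f)

assert restored == cookies
```

Note that add_cookie only accepts cookies for the domain the browser is currently on, which is why the linked InvalidCookieDomainException discussions below recommend navigating to the site before restoring.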
Reference
You can find a couple of detailed discussions in:
org.openqa.selenium.InvalidCookieDomainException: Document is cookie-averse using Selenium and WebDriver
selenium.common.exceptions.InvalidCookieDomainException: Message: invalid cookie domain while executing tests in Django with Selenium

Use JMeter Webdriver (Plugin Selenium) to make HTTP Header Manager and see results in Dynatrace

I'd like to know how I can use the HTTP Header Manager in JMeter if I use the Selenium WebDriver Sampler.
I know there is the standard tool (HTTP Header Manager) in JMeter, but that tool is useful when I use an HTTP Request sampler in my test. In this case I use only the WebDriver Sampler with Java 1.8. The goal is to see in Dynatrace the tags that I send from JMeter. Is it possible to do that, and if so, how? Thanks for your help!
The WebDriver Sampler doesn't respect the HTTP Header Manager.
WebDriver itself doesn't support working with HTTP headers, and the feature is unlikely to ever be implemented.
So the options are:
Use an extension like ModHeader, but in this case you will have to switch from the WebDriver Sampler to the JSR223 Sampler. Example code:
def options = new org.openqa.selenium.chrome.ChromeOptions()
options.addExtensions(new File('/path/to/modheaders.crx'))
def capabilities = new org.openqa.selenium.remote.DesiredCapabilities()
capabilities.setCapability(org.openqa.selenium.chrome.ChromeOptions.CAPABILITY, options)
def driver = new org.openqa.selenium.chrome.ChromeDriver(capabilities)
driver.get('http://example.com')
Use a proxy like BrowserMob as the proxy for the WebDriver and configure it to add headers to each intercepted request. Example initialization code (you can put it into the aforementioned JSR223 Sampler, somewhere in a setUp Thread Group):
def proxy = new net.lightbody.bmp.BrowserMobProxyServer()
def proxyPort = 8080
proxy.setTrustAllServers(true)
proxy.addRequestFilter((request, contents, info) -> {
    request.headers().add('your header name', 'your header value')
    return null
})
proxy.start(proxyPort)

How to handle authentication popup with Selenium WebDriver

I am trying to handle an authentication popup in my Selenium test by passing the username and password in the URL.
I have tried the following solutions:
I have tried to send the username and password in the URL
I have tried handling it with an alert; it doesn't work
I have tried the solutions provided in How to handle authentication popup with Selenium WebDriver using Java - almost all of them other than AutoIT; none of them worked for me
I have a Maven project, and I am trying to send the URL with username and password from a project.properties file, which looks like this:
URL = https://username:password@URL
The code to open the url:
WebDriver driver = new ChromeDriver();
driver.navigate().to(URL);
I get the below error in the browser console:
"there has been a problem with your fetch operation: Failed to execute 'fetch' on 'Window': Request cannot be constructed from a URL that includes credentials"
I was able to handle this using an AutoIT script.
The script looks something like this:
WinWaitActive("Sign in")
Sleep(5000)
Send("username")
Send("{TAB}")
Send("password")
Send("{ENTER}")
I run this script from my code (note the escaped backslashes in the Java string literal):
WebDriver driver = new ChromeDriver();
Runtime.getRuntime().exec("(path)\\AutoIt\\script.exe");
driver.get(prop.getProperty(URL));
driver.navigate().refresh();

Authentication with selenium (Python)

I have the links to the admin area of my website: is it possible to launch those URIs (links) with Selenium (in a given browser) without needing to authenticate first? If not, how could I deal with authentication using Selenium?
Not sure what you mean, but you can just use selectors and enter the credentials into the authentication fields, i.e.:
from selenium import webdriver

driver = webdriver.Firefox()
driver.get(url)
driver.find_element_by_id("IDOFLOGIN").send_keys("YOUR LOGIN")
driver.find_element_by_id("PASSOFLOGIN").send_keys("YOUR PASSWORD")
driver.find_element_by_id("login button").click()
# Continue
You don't have to find elements by ID; you can also use class, XPath, and so on.

Set headers - capybara mechanize or selenium

In my cucumbers, I need to add a key/value pair to the HTTP headers when I request a page with Capybara using the Mechanize driver, or perhaps the Selenium driver.
I'm using capybara 1.1.1, mechanize 2.0.1, and selenium 2.5.0.
But how?
Here are my step definitions:
When /^set some headers$/ do
  # set some headers here
  visit('/url')
end

Then /^some result$/ do
  # check page responds to the header
end
Many thanks,
Rim
If you're using Mechanize you should be able to set headers in the request like this:
When /^set some headers$/ do
  # set some headers here
  page.driver.agent.request_headers = {"X-Header" => "value"}
  visit('/url')
end