Distributed Selenium grid and HTTP proxy

I have seen many questions about using Selenium behind a proxy, where the Selenium nodes connect to the internet via a proxy. The solution indicated everywhere is to specify the proxy settings in the code when creating the WebDriver instance.
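For reference, that per-test approach looks roughly like this (a sketch only; the proxy host is a placeholder, and the capability shape follows the standard Selenium proxy capability):

```python
# Sketch of the "set the proxy in the test" approach that does NOT fit
# this setup. The proxy host below is a placeholder.

def proxied_capabilities(proxy_host, browser="firefox"):
    """Build a capabilities dict asking the browser to use a fixed proxy."""
    return {
        "browserName": browser,
        "proxy": {
            "proxyType": "manual",      # standard Selenium proxy capability
            "httpProxy": proxy_host,
            "sslProxy": proxy_host,
        },
    }

caps = proxied_capabilities("proxy.example.com:3128")
# These capabilities would then be passed to the RemoteWebDriver pointed at
# the hub -- but the test has no way of knowing which node (and therefore
# which proxy) it will actually get.
```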
Unfortunately, in my case this is not going to work, as I am using a distributed Selenium grid where different nodes require different proxy settings. When a test is run, the test runner communicates only with the grid hub and has no control over which node it will run on, so setting the proxy from inside the test is not possible. Each node is a Linux machine running both Firefox and Chrome in a virtual framebuffer. The grid presently has about 25 nodes distributed across multiple data centers, but this number may grow to as many as 1000 in the future.
There are business reasons for such a setup - and I am not in a position (both technically and politically) to change them.
Is there any way to set proxy on a node level and have it apply to everything that's happening on that node only?

Apparently, all I need to do is define the http_proxy and https_proxy environment variables, which Chrome will then honour.
For Firefox, proxy parameters can be added to /etc/firefox-$version/pref/firefox.js, where $version can be determined by running firefox -v | awk '{print substr($3,1,3)}'.
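Concretely, the node-level settings might look like the following (the proxy host name and port are placeholders):

```
# Environment for the node process (e.g. /etc/environment) -- honoured by Chrome
http_proxy=http://proxy.example.com:3128
https_proxy=http://proxy.example.com:3128
```

and for Firefox, manual-proxy prefs in the file mentioned above:

```
// /etc/firefox-$version/pref/firefox.js -- 1 = manual proxy configuration
pref("network.proxy.type", 1);
pref("network.proxy.http", "proxy.example.com");
pref("network.proxy.http_port", 3128);
pref("network.proxy.ssl", "proxy.example.com");
pref("network.proxy.ssl_port", 3128);
```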

Related

Docker swarm - selenium VNC port - how to make it distinct?

I'm coming from a VM background, where each VM has a different IP, so there's no issue connecting to a specific node in a group on its VNC port.
With containers, looking at the "Version 3 with Swarm support" section of https://github.com/SeleniumHQ/docker-selenium/blob/master/README.md,
I can see that I can publish a port for a service corresponding to a specific container image, but I believe that would be a single value shared by all replicas.
So, if I use, say, 20 containers, and each container whose image is suffixed "debug" exposes VNC on port 5900, how can I access the specific container I want, which I assume is identified in the output of the Jenkins job that sends a Selenium test script to one of the nodes on the grid?
That is: if there's an issue with the test script and I see a container identifier, how can I access that specific container over VNC to see what's going on there? Since there's a single host IP for multiple containers, each needs a different externally published port in place of 5900 to be distinguishable, but I don't see how this can be done in docker-compose/swarm. Is this doable?
As an alternative, would that be any easier with Kubernetes rather than docker swarm? (I have not done much research on it yet)
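One compose-level workaround sketch (illustrative image names and ports, not validated against swarm mode): instead of scaling one service with replicas, declare each debug node as its own single-replica service, so each can publish a distinct host port for its internal VNC port 5900.

```yaml
# docker-compose.yml fragment (hypothetical): one service per debug node,
# each mapping the container's VNC port 5900 to a distinct published port
version: "3"
services:
  chrome-debug-1:
    image: selenium/node-chrome-debug
    ports:
      - "5901:5900"   # VNC for node 1
  chrome-debug-2:
    image: selenium/node-chrome-debug
    ports:
      - "5902:5900"   # VNC for node 2
```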

ERR_PROXY_CONNECTION_FAILED when using squid proxy for connection

I have a squid proxy container on my local Docker for Mac (the datadog/squid image). Essentially, I use this proxy so that the app containers on my local Docker and the browser pod (Selenium) on another host share the same network for testing (so that the remote browser can reach the app host). But with my current setup, when I run my tests, the browser starts up on the remote host and then fails the test after a short while. The message in the browser is ERR_PROXY_CONNECTION_FAILED right before it closes, so I assume there is an issue with my squid proxy config. I use the default config, and the Docker Hub page says
Please note that the stock configuration available with the container is set for local access, you may need to tweak it if your network scenario is different.
I'm not really sure how my network scenario is different. What should I be looking into for more information? Thanks!
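For orientation: the "local access" part of a stock squid config is its ACL section, and requests arriving from a network squid does not recognise as local are denied, which produces exactly this browser error. A hypothetical squid.conf fragment (the CIDR is a placeholder for the remote browser host's network):

```
# squid.conf sketch: allow the remote browser host's network, which the
# stock local-access ACLs may not cover (10.1.2.0/24 is a placeholder)
acl browser_net src 10.1.2.0/24
http_access allow browser_net
http_access deny all          # keep the default final deny
```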

How to support multiple environments/DNS in Selenium Grid?

I have a suite of automated tests that run on a Selenium Grid. I have a need now to run these tests in multiple environments (QA, Stage, Production). The environments will be set up using a different DNS server for each one. So a test targeting the QA environment should use the QA DNS, Stage tests should use the Stage DNS, etc.
Ideally, I would like my test suite (which runs in Jenkins and accepts a parameter for which environment to target) to be able to tell the grid to allocate a node, set its DNS servers to (whatever), run the test, then put the DNS servers back the way it found them.
I don't see anything in Selenium's documentation about changing DNS settings on the individual nodes. I also tried looking for browser capabilities that could handle this, but no luck there either. What's the cleanest way to make this happen?
EDIT: The requirement to switch DNS servers is a new one, so there's currently no method in place (manual or automatic) for doing it. Before using this DNS-based method of differentiating environments, we were using environment-specific hostfiles, and switching between them with a custom service that listened on each node for a hostfile-switch request. We might have to create a similar service for switching DNS settings, but I was hoping there was something more "official" than that.
We worked around this issue by setting up a proxy server for each environment and configuring each proxy server to use the environment-specific DNS settings. Selenium permits setting a proxy on the individual nodes, so this was a way to programmatically modify those settings.
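That workaround can be sketched as follows (environment names and proxy hosts are placeholders; the capability shape is the standard Selenium proxy capability):

```python
# Hypothetical mapping from the Jenkins environment parameter to the
# proxy server that carries that environment's DNS settings
ENV_PROXIES = {
    "qa": "qa-proxy.example.com:3128",
    "stage": "stage-proxy.example.com:3128",
    "prod": "prod-proxy.example.com:3128",
}

def capabilities_for(env, browser="chrome"):
    """Capabilities asking the allocated node to route through env's proxy."""
    proxy = ENV_PROXIES[env]
    return {
        "browserName": browser,
        "proxy": {"proxyType": "manual", "httpProxy": proxy, "sslProxy": proxy},
    }
```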

spy-js not capturing from chrome behind a corporate firewall with a proxy

I am not able to capture any information when starting spy-js on Intellij IDEA 15. I think I'm being defeated by proxies. My setup is as follows:
system:
OS X version 10.10.5
network:
I work behind a corporate firewall with proxy servers. These proxies are set at the system level and also at the bash shell level. HTTP_PROXY, http_proxy, http.proxy etc are all set.
web server: My web server runs on a remote system within the corporate network over port 80
gulp:
I run gulp serve-debug to start my local development web proxy. I use browsersync. So I have my website at localhost:3000, and this maps to some-corp-location/ui
spy-js config:
At the moment, it looks like this (though I've thrashed about and tried many things)
I followed the advice I found to check chrome://net-internals/#proxy, and it never shows the spy-js proxy settings; it only ever shows the corporate proxy settings. This is why I'm pretty sure I'm getting burned by proxies. I looked at the Chrome proxy settings to see if I could dissociate them from the system settings, but it wasn't clear whether this would work.

Does Selenium Grid hand over the node after connection?

As shown in this diagram:
All connections from the Selenium tests (client) should go directly to the Selenium hub, which then forwards the request to an appropriate node and returns the response.
But what I am observing is that, after an appropriate node is found, the client tries to communicate directly with the node.
In my case, though, the nodes are in a private network and are accessible only by the Selenium hub, NOT ACCESSIBLE by the Selenium tests (client), so the subsequent calls fail.
Any idea on how to force all the subsequent calls through the Selenium HUB only?
EDIT
The problem might be something different. My hub is running on 192.168.0.100 (with a second IP, 10.0.0.2).
So when I connect to 192.168.0.100 from my .NET RemoteWebDriver client, after connecting to the appropriate node it switches to the hub's other IP (10.0.0.2), which is not accessible from my system.
The answer is NO, it doesn't. The grid remains active throughout the connection.
The IP 10.0.0.2 belonged to the same Selenium hub machine. The .NET and Java implementations of the Selenium RemoteWebDriver client were switching to the Location header parameter after the initial handshake. This may be due to the .NET and Java HTTP client implementations.
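This is generic HTTP-client behaviour rather than anything grid-specific: a client that follows redirects ends up talking to whatever address the Location header names. A minimal stdlib-only illustration, with a local toy server standing in for the hub:

```python
# Toy demonstration: after a 302, the client continues at the URL named in
# the Location header -- the same mechanism that sent the RemoteWebDriver
# clients to the hub's other IP after the initial handshake.
import http.server
import threading
import urllib.request

class RedirectingHub(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/wd/hub/session":
            # Redirect the initial request elsewhere, as the hub did
            self.send_response(302)
            self.send_header(
                "Location",
                "http://127.0.0.1:%d/node/session" % self.server.server_port,
            )
            self.end_headers()
        else:
            body = b"handled at: " + self.path.encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep request logging quiet

server = http.server.HTTPServer(("127.0.0.1", 0), RedirectingHub)
threading.Thread(target=server.serve_forever, daemon=True).start()

with urllib.request.urlopen(
    "http://127.0.0.1:%d/wd/hub/session" % server.server_port
) as resp:
    final_url = resp.geturl()   # where the client actually ended up
    body = resp.read().decode()

server.shutdown()
```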