Selenium Grid execution slows down over time

I am running multiple data validation tests on Selenium Grid (Chrome browsers only) on a CentOS stack. Initially the tests complete quickly, but over time the execution slows down considerably.
I am trying to validate data from a CSV file against data in a web application. The CSV file has around 100K records. For each record, the sequence of events is:
Launch a remote driver (Chrome) instance
Open the web application and log in
Search the application for the keywords from the CSV file and validate the results (output in the CSV vs. output in the web application)
Close the remote driver instance
I have configured 7 nodes on CentOS, and each node has 10 browser instances.
I am also using ThreadPoolExecutor to submit each task, so at any given time I have 70 threads running, where each thread is a WebDriver instance.
I am not sure whether this is a code-level issue or an infrastructure issue. Can someone point me in the right direction on how to find the root cause of this slowness and fix it?
I monitored system resources on one of the nodes and saw that the Java process takes around 55% CPU and 10% memory, while each browser takes 10% CPU and 4% memory.

Selenium Grid tends to slow down the longer it runs, because the hub and nodes run on the JVM and their memory use grows over time. Many other factors also affect performance, such as the number of browsers per node, the node configuration, the grid configuration, and the performance of your web server. For better grid performance, restart the hub and nodes once in a while.
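To narrow down whether the slowdown comes from launching browsers, from the application, or from closing sessions, one option is to time each stage of the per-record cycle described in the question. The following is a minimal sketch, not the asker's code: StageTimer, validate_record, the driver's search method, and FakeDriver are all hypothetical names, and FakeDriver stands in for a real remote WebDriver so the example runs without a grid.

```python
import time
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager
from threading import Lock


class StageTimer:
    """Thread-safe collector of per-stage durations, in seconds."""

    def __init__(self):
        self._lock = Lock()
        self.samples = defaultdict(list)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            with self._lock:
                self.samples[name].append(elapsed)

    def mean(self, name):
        values = self.samples[name]
        return sum(values) / len(values) if values else 0.0


def validate_record(record, make_driver, timer):
    """Run one record through launch, search and close, timing each stage."""
    with timer.stage("launch"):
        driver = make_driver()
    try:
        with timer.stage("search"):
            # Hypothetical method: login, search and comparison would go here.
            return driver.search(record)
    finally:
        # Always release the browser slot, even if validation raised.
        with timer.stage("close"):
            driver.quit()


class FakeDriver:
    """Stand-in for a remote WebDriver (assumption, for illustration only)."""
    quit_count = 0
    _lock = Lock()

    def search(self, record):
        return record.upper()

    def quit(self):
        with FakeDriver._lock:
            FakeDriver.quit_count += 1


timer = StageTimer()
records = ["a", "b", "c", "d", "e"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(
        pool.map(lambda r: validate_record(r, FakeDriver, timer), records)
    )

for name in ("launch", "search", "close"):
    print(f"{name}: n={len(timer.samples[name])} mean={timer.mean(name):.6f}s")
```

If the mean for one stage grows over the run while the others stay flat, that points at where to look next. The try/finally also guards against leaked browser sessions, which would otherwise accumulate on the nodes.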

Related

Start a bat file remotely which never returns anything (jmeter-server.bat)

We are doing distributed testing of our web app using JMeter. For that, the jmeter-server.bat file needs to be running in the background, as it acts as a sort of listener. The problem arises when one of the 4 slave machines restarts due to the load: the test is effectively stuck right there, because the master machine expects some output from the 4th machine. Currently the automation is done via Ansible playbooks that are called from Jenkins. There are roughly 15 tests that are downstream of one another, so if even one test is stuck, the time is wasted until someone checks on the machines.
Things I've tried so far:
I've tried using the Windows Task Scheduler to run jmeter-server.bat without any user logged in, but it starts the bat file in the background, which in turn spawns all the child processes in the background as well, i.e. it starts Selenium Chrome in headless mode.
I've tried adding jmeter-server.bat to startup and configuring the system to AutoLogon without a password to trigger a session that calls the startup file. Unfortunately, the idea was scrapped by IT for being insecure.
I've tried calling it from the Ansible playbook with win_command, but that also gets stuck because the batch file never returns anything.
I've also created a service for the bat file, but again the child processes started in the background.
"The problem arises when one of the slave machine out of 4 restarts due to the load"
Instead of trying to work around the issue, I would recommend finding the root cause and fixing it:
Make sure to follow JMeter Best Practices
Configure Java to take heap dump on failure
Inspect Windows PerfMon and operating system/application logs
Check for .hprof files in the "bin" folder of your JMeter installation and see what they say
In general, using Selenium to generate the load is not recommended. I would rather suggest using JMeter's HTTP Request samplers for that. Provided you properly configure JMeter to behave like a real browser, from the perspective of the system under test there won't be any difference whether the load comes from HTTP Request samplers or from a real browser.
The documentation on the WebDriver Sampler states the same:
Note: It is NOT the intention of this project to replace the HTTP Samplers included in JMeter. Rather it is meant to compliment them by measuring the end user load time.
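As a small aid for the ".hprof" step above, the following sketch lists heap dump files in a given folder, newest first. It assumes the JVM was started with a heap-dump option such as -XX:+HeapDumpOnOutOfMemoryError and that dumps land in the folder you pass in. The function name find_heap_dumps and the temporary directory used for the demo are illustrative only.

```python
import os
import tempfile
from pathlib import Path


def find_heap_dumps(bin_dir):
    """Return the .hprof files directly inside bin_dir, newest first."""
    dumps = [p for p in Path(bin_dir).glob("*.hprof") if p.is_file()]
    return sorted(dumps, key=lambda p: p.stat().st_mtime, reverse=True)


# Demo against a throwaway directory instead of a real JMeter "bin" folder.
with tempfile.TemporaryDirectory() as tmp:
    older = Path(tmp, "java_pid100.hprof")
    newer = Path(tmp, "java_pid200.hprof")
    older.write_bytes(b"")
    newer.write_bytes(b"")
    Path(tmp, "jmeter.log").write_text("not a heap dump")
    # Set explicit modification times so the ordering is deterministic.
    os.utime(older, (1_000, 1_000))
    os.utime(newer, (2_000, 2_000))
    found = [p.name for p in find_heap_dumps(tmp)]

print(found)
```

A dump whose timestamp matches the time a slave machine restarted would be the first one to open in a heap analyzer.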

Is Selenium Grid a solution to my problem with execution times?

Hi all, I am using the TestNG framework for Selenium WebDriver scripts. I run them from Jenkins on two slaves, one Windows and the other Linux. I have close to 100 test cases, and they take 2 hrs 40 mins on each machine. I want to speed up the execution. Will Selenium Grid be helpful in this case?
No. Selenium Grid would not be the solution. Selenium Grid can multiply the same action; it does not take different actions in parallel.
You should look for opportunities for test parallelization.
Selenium Grid is designed to allow you to run tests in parallel.
The page says:
Selenium Grid allows us to run tests in parallel on multiple machines, and to manage different browser versions and browser configurations centrally (instead of in each individual test).
However, it's not as easy as installing it and connecting it up.
You'll need to ensure that the rest of your framework and your tests are capable of parallel execution. The most important part is to watch out for your test data, e.g. if multiple tests rely on the same data source and try to update it at the same time, you'll get flaky results.
You can also run tests in parallel on a local machine without Selenium Grid. If I were you, I'd start with this first.
Typically most machines have the resources to run more than one browser - get it running locally before you go too far down the grid rabbit hole.
Here's a link for testng
Also worth a review is Zalenium - a Docker image that contains a grid plus auto-scaling nodes, allowing easier browser control on a single machine.
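To illustrate the test-data point, here is a minimal sketch of running checks in parallel on a single machine while giving each one a private copy of the data, so no two checks update the same source. It is plain Python rather than TestNG, and run_in_parallel, check_add_user, check_remove_user, and the fixture contents are all made up for the example.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(checks, shared_fixture, workers=4):
    """Run each check on its own deep copy of the fixture and collect results."""

    def run_one(check):
        data = copy.deepcopy(shared_fixture)  # private copy: no cross-test writes
        try:
            check(data)
            return check.__name__, "passed"
        except AssertionError as exc:
            return check.__name__, f"failed: {exc}"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_one, checks))


def check_add_user(data):
    data["users"].append("carol")
    assert len(data["users"]) == 3


def check_remove_user(data):
    data["users"].remove("alice")
    assert data["users"] == ["bob"]


fixture = {"users": ["alice", "bob"]}
outcome = run_in_parallel([check_add_user, check_remove_user], fixture)
print(outcome)
```

Both checks mutate the user list, yet neither sees the other's change. If they shared one list, the result would depend on which thread ran first, which is the flakiness described above.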

Many processes of Google Chrome (32 bit)

When 2 tests are running in Chrome, I have observed that many Google Chrome (32 bit) processes are running in Task Manager. Is this the correct behavior of ChromeDriver?
When multiple automated tests are executed through Google Chrome, you will have observed that there are potentially dozens of Google Chrome processes running, which can be seen in the Processes tab of Windows Task Manager.
As per the article SOLVED: Why Google Chrome Has So Many Processes, for a better user experience Google Chrome starts many Windows background processes for each tab opened by your automated tests. Google tries to keep the browser stable by separating each web page into as many processes as it deems fit, so that if one process on a page fails, that particular process can be terminated or refreshed without needing to kill or refresh the entire page.
However, from 2018 onwards Google Chrome was actually redesigned to create a new process for each of the following entities:
Tab
HTML/ASP text on the page
Plugins that are loaded
Apps that are loaded
Frames within the page
The Chromium Blog post Multi-process Architecture mentions:
Google Chrome takes advantage of these properties and puts web apps and plug-ins in separate processes from the browser itself. This means that a rendering engine crash in one web app won't affect the browser or other web apps. It means the OS can run web apps in parallel to increase their responsiveness, and it means the browser itself won't lock up if a particular web app or plug-in stops responding. It also means we can run the rendering engine processes in a restrictive sandbox that helps limit the damage if an exploit does occur.
In conclusion, the many processes you are seeing are pretty much in line with the current implementation of google-chrome.
Outro
You can find a relevant discussion in How to quit all the Firefox processes which gets initiated through GeckoDriver and Selenium using Python

Tests timeouts (Selenium+Jenkins+Grid)

We've started getting random timeouts, but cannot find the reason for them. The tests run on remote machines on Amazon using Selenium Grid. Here is what happens:
the browser is opened,
then a page starts loading, but cannot load fully within 120 seconds,
then a timeout exception is thrown.
If I run the same tests locally, everything is OK.
The error is an ordinary timeout exception, thrown when a page does not load completely within the period set in driver.manage().timeouts().pageLoadTimeout(). The problem is that a page of the site cannot be loaded completely within that time. However, as soon as that period expires and Selenium consequently releases control of the browser, the page loads at once. The issue cannot be reproduced manually on the same remote machines. We've tried different versions of Selenium standalone, ChromeDriver, and the Selenium driver. The browser is Google Chrome 63. We would be happy to hear any suggestions about the cause.
When Selenium loads a webpage/URL, by default it uses a pageLoadStrategy of normal. To make Selenium not wait for the full page load, we can configure the pageLoadStrategy. pageLoadStrategy supports 3 different values, as follows:
normal (full page load)
eager (interactive)
none
Code Sample :
Java
capabilities.setCapability("pageLoadStrategy", "none");
Python
caps["pageLoadStrategy"] = "none"
Here you can find the detailed discussions through Java and Python clients.
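As a slightly fuller sketch of the Python line above, the helper below builds a capabilities dictionary and rejects anything other than the three values listed. The function name chrome_capabilities and the choice of "eager" in the demo are illustrative assumptions; how the dictionary is handed to the remote driver depends on your Selenium client version.

```python
VALID_STRATEGIES = ("normal", "eager", "none")


def chrome_capabilities(page_load_strategy="normal"):
    """Build a capabilities dict for Chrome with the given pageLoadStrategy."""
    if page_load_strategy not in VALID_STRATEGIES:
        raise ValueError(
            f"pageLoadStrategy must be one of {VALID_STRATEGIES}, "
            f"got {page_load_strategy!r}"
        )
    return {"browserName": "chrome", "pageLoadStrategy": page_load_strategy}


caps = chrome_capabilities("eager")
print(caps)
```

With "eager" or "none" the page load call returns before the full page load completes, so the test usually needs explicit waits for the specific elements it interacts with.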

How to get the number of idle browsers for a node in selenium grid2

My current setup is 5 nodes with 10 Firefox browsers each, all connected to a hub.
I am running into a problem where I exhaust the 10 Firefox browsers on each node, so any new Selenium runs get queued at the hub and only run when a Firefox browser on some node becomes available.
What I want to do is somehow query the Selenium Grid 2 hub for the number of free/idle/available browsers before actually running my tests on that particular grid setup. Based on the result, I would redirect the tests to another grid setup (on another machine) or maybe not even run the tests.
Of course I can add more nodes or even increase the number of browsers that can be handled by each node. But I am looking for an answer which will help me query the Grid and then allow me to decide on what action I can take rather than muscling my way by brute force (bigger server to handle more browser sessions).
I also sense that this may be a feature not implemented by Selenium Grid 2, so was wondering how others have got around this problem.
Each Selenium node in a grid provides its session information. You can get the session information of a node like this (assuming your Selenium node listens on port 5555):
$ curl http://<selenium-node>:5555/wd/hub/sessions
You will get a JSON object response like this:
{"value":[],"sessionId":null,"status":0,"hCode":1542413295,"class":"org.openqa.selenium.remote.Response"}
The number of active sessions on a node is the length of the "value" array in its response. Subtract that from the number of browsers the node is configured for to find out how many are left.
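The calculation can be sketched as follows. The function names and the max_sessions figure are assumptions for illustration, the shape of the entries in the non-empty sample is made up, and fetch_sessions is defined but not called here because it needs a live node.

```python
import json
from urllib.request import urlopen


def active_sessions(response_text):
    """Count the entries in the "value" array of a /sessions response."""
    payload = json.loads(response_text)
    return len(payload.get("value") or [])


def idle_slots(response_text, max_sessions):
    """Browsers still free on a node configured for max_sessions browsers."""
    return max(max_sessions - active_sessions(response_text), 0)


def fetch_sessions(node_url, timeout=5):
    """Fetch the raw response from a node, e.g. http://<selenium-node>:5555."""
    with urlopen(f"{node_url}/wd/hub/sessions", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# The empty response shown above: no active sessions on the node.
empty = (
    '{"value":[],"sessionId":null,"status":0,"hCode":1542413295,'
    '"class":"org.openqa.selenium.remote.Response"}'
)
# Hypothetical response with three sessions; the entry fields are assumed.
busy = '{"value":[{"id":"a"},{"id":"b"},{"id":"c"}],"status":0}'

print(idle_slots(empty, max_sessions=10))
print(idle_slots(busy, max_sessions=10))
```

Summing idle_slots over all nodes gives the free capacity of the whole grid, which can then drive the decision to redirect the run to another grid or skip it.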