Why isn't the --browser-test flag included in Puppeteer's default list of Chromium flags? - chromium

I'm just curious if there's any known unwanted effect of this flag on automation, or if it can make my tests less valid.
I'm currently running tests with this flag and it doesn't seem to hurt anything. Is it just overlooked?
https://peter.sh/experiments/chromium-command-line-switches/#browser-test
https://github.com/GoogleChrome/puppeteer/blob/master/lib/Launcher.js#L38

The --browser-test flag activates an internal test that the Chromium developers use to check canvas repaints.
Some older code in the repository gives this hint:
Tells Content Shell that it's running as a content_browsertest.
And this issue in the Chromium repository contains more information:
We need a test that checks canvas capture happens for N times when there are N repaints. This test is not appropriate for webkit layout tests as it is slow and there are mock streams involved.
Looks like they added a special flag for this test.
Therefore, you should not activate this flag: it exists for the Chromium team's internal browser tests, not for testing websites.

Related

Behat in Multiple Browsers in Parallel

We currently use Behat 3 to automate BDD tests for our website.
The current setup uses Jenkins to run Selenium which attaches to Firefox and uses XVFB to render (this allows us to save screenshots when anything goes wrong).
This is great for testing that the site (including JavaScript) works and that a user can perform each documented task successfully.
I am looking to expand our testing facilities, and one thing I would like to add is the ability to check multiple browsers. This is very important as we get occasional quirks that can break functionality.
Since the tests currently take slightly over an hour to run (and we have 4 suites for that site on Jenkins), I'd preferably like to run all the browsers at the same time. If I can't find a way to do it concurrently, then I likely will just set up multiple Behat profiles and run each one in series.
One thing I've been looking at as a possible solution is Ghostlab. This would allow us to test across multiple browsers and multiple devices, including mobile, at the same time. The problem is that I can't find a way of joining this to Behat in a meaningful way.
I could run one browser connected to Ghostlab, which would cause the same actions to be taken across all connected browsers, however, were a browser other than the one controlled by Selenium to break, I do not know how we would capture that information.
TL;DR: Is there any way for me to run BDD (preferably Behat) tests across multiple browsers in parallel, and capture information from any browser that fails?
This is what multi-configuration jobs (or matrix jobs) are designed for in Jenkins.
You specify your job configuration once, but add one or more variables that should change each time, building a matrix of combinations (in your case, the matrix has one dimension: browser).
Jenkins then runs one main build with multiple sub-builds in parallel — one for each combination in the matrix. You can then clearly see the results for each combination.
This requires that your test job can be parameterised, i.e. you can choose at runtime which browser should be run, rather than running all tests together in a single job.
The Jenkins wiki has minimal documentation on this feature, but there are a few good blog posts (and Stack Overflow questions) out there on how to set it up.
A matrix job will use all available "executors" in Jenkins, to run builds in parallel as much as possible.
In a default Jenkins installation, there are two executors available, but you can change this, or extend Jenkins by adding further build machines.
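As a rough illustration of the matrix idea, here is a minimal Python sketch. It is not Jenkins configuration: the axis values, the BROWSER variable name, and the assumption that behat.yml defines one profile per browser are all hypothetical. It shows how one job definition plus one axis expands into several sub-builds, and how a parameterised test command could pick its browser at runtime.

```python
import itertools

# Hypothetical axis; a Jenkins matrix job would define this in its configuration.
AXES = {"BROWSER": ["firefox", "chrome", "safari"]}


def combinations(axes):
    """Return one dict per cell of the matrix (the Cartesian product of all axes)."""
    names = sorted(axes)
    return [
        dict(zip(names, values))
        for values in itertools.product(*(axes[name] for name in names))
    ]


def behat_command(env):
    """Build the test command for one sub-build from its environment variables."""
    browser = env.get("BROWSER", "firefox")
    # Assumption: behat.yml defines one profile per browser, named after it.
    return ["vendor/bin/behat", "--profile=" + browser]


if __name__ == "__main__":
    for combo in combinations(AXES):
        print(combo, "->", " ".join(behat_command(combo)))
```

With a second axis (for example an operating system), the same product would yield one sub-build per browser and operating system pair.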

How to display a short test report/counters in travis-ci?

I mean, it would be very useful if I can see how many tests passed/failed just by one line, without reading build logs.
I use Karma as my test runner. It has a lot of reporters, but which one should I use?
Example from TeamCity:
This seems like a useful feature but the current user interface doesn't seem to support it.
You can file it as a feature request on Travis CI's GitHub page using the link below:
https://github.com/travis-ci/travis-ci/issues
Although Travis CI doesn't have its own interface for counting the number of tests passed, it does work with CodeClimate, which has its own interface and metrics for test coverage. It shows overall test coverage for the whole project and coverage for each file. There's some more info on that here, though it looks like their free version allows local testing only.
There are other tools for tracking and analyzing coverage as well, including Coveralls, which is pretty good. They have a free version for open source, like Travis CI, so that can be a plus. They also show coverage as a percentage and file by file.

BDD with Manual Tests?

We are switching from a classic 'Waterfall' model to a more Agile-oriented philosophy. We decided to give BDD a try (Cucumber), but we have some issues with migrating some of our 'old' methodologies. The biggest question mark is how manual tests integrate into the cycle.
Let's say the Project Manager defined the Feature and some basic Scenario Outlines. With the test team, we defined around 40 Scenarios for this feature. Some cannot be tested automatically, which means they will have to be tested manually. Executing manual tests when all you have is the feature file feels wrong. We want to be able to see the past failure rate of tests, for example. Most test-case managers support such features, but they can't work with Feature files. Maintaining the manual test cases in an external test-case manager will cause never-ending synchronization issues between the Feature file and the test-case manager.
I'm interested to hear if anyone is able to cover this 'mid-ground' and how.
This is not a very unusual case. Even in Agile it may not be possible to automate every scenario. The scrum teams I am working with usually tag them as @manual scenarios in the feature file. We have configured our automation suite (Cucumber - Ruby) to ignore scenarios with these tags while running nightly jobs. One problem with this is, as you have mentioned, that we won't know the outcome of the manual tests, as the testers document the results locally.
My suggestion for this was to document the results of each iteration in a YML file or any other file format that suits the purpose. This file should be part of the automation suite and should be checked into the repository. So, to start with, you have the results documented along with the automation suite. Later, when you have the resources and time, you can add functionality to your automation suite to read this file and generate a report, either together with the other automation results or separately. Until then, your version control should help you track all previous results.
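A minimal sketch of such a results file, under a few assumptions: it uses JSON instead of YML so that it runs with the Python standard library alone, and the file layout, field names, and function names are invented for illustration. It records one entry per manual run and computes the past failure rate the question asks about.

```python
import json
from pathlib import Path


def record_result(path, iteration, scenario, outcome):
    """Append one manual test result to the results file kept in the repository."""
    if outcome not in ("pass", "fail"):
        raise ValueError("outcome must be 'pass' or 'fail'")
    path = Path(path)
    results = json.loads(path.read_text()) if path.exists() else []
    results.append({"iteration": iteration, "scenario": scenario, "outcome": outcome})
    path.write_text(json.dumps(results, indent=2))


def failure_rate(path, scenario):
    """Fraction of recorded runs of a scenario that failed, or None if never run."""
    results = json.loads(Path(path).read_text())
    runs = [r for r in results if r["scenario"] == scenario]
    if not runs:
        return None
    return sum(r["outcome"] == "fail" for r in runs) / len(runs)
```

A tester would call record_result after each manual run and commit the file, so version control keeps the full history even before any reporting is built on top of it.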
Hope this helps.
To add to @Eswar's answer, if you're using Cucumber (or one of its siblings), one option would be to execute the test runner manually and include prompts for the tester to check certain aspects. They then pass/fail the test according to their judgement.
This is often useful for aesthetic aspects e.g. cross-browser rendering, element alignment, correct images used, etc.
As @Eswar mentioned, you can exclude these tests from your automated runs by tagging them.
See this article for an example.
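To make the tagging idea concrete, here is a small Python sketch that splits scenarios into automated and manual ones. It assumes the tag is written as @manual on the line directly above a scenario; the feature text and function names are made up, and the parser is deliberately simplified (it ignores feature-level tags and Scenario Outlines). Cucumber itself normally does this filtering through its tag options rather than custom code.

```python
FEATURE = """\
Feature: Invoice export

  Scenario: Export as CSV
    Given an invoice exists

  @manual
  Scenario: Exported PDF looks right
    Given an invoice exists
"""


def scenarios_with_tags(feature_text):
    """Yield (scenario_name, tags); tags are the @-lines directly above a scenario."""
    pending = set()
    for line in feature_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@"):
            pending.update(stripped.split())
        elif stripped.startswith("Scenario:"):
            yield stripped[len("Scenario:"):].strip(), pending
            pending = set()
        elif stripped:
            # Any other non-blank line ends a run of tag lines.
            pending = set()


def automated(feature_text, excluded=frozenset({"@manual"})):
    """Names of scenarios that carry none of the excluded tags."""
    return [
        name
        for name, tags in scenarios_with_tags(feature_text)
        if not tags & excluded
    ]
```

Calling automated(FEATURE) returns only the CSV scenario; the scenario tagged @manual is left for a human tester.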
Test cases that cannot be automated are a poor fit for a cucumber test. We have a bunch of these edge cases. It is nigh impossible to get Selenium to verify PDF documents well. Same thing for CSV downloads (not impossible, but not worth the effort). Look and feel tests simply require human eyes at this point. Accessibility testing with screen readers is best done manually as well.
For that, be sure to record the acceptance criteria in the user story in whichever tool you use to track work items. Write a manual test case. The likes of Azure DevOps, Jira, IBM Rational Team Concert and their ilk have ways to record manual test plans, link them to stories, and record the results of executing a manual test.
I would remove the manual test cases from the cucumber tests, and rely on the acceptance criteria for the story, and link the story to some sort of manual test case, be it in a tool or a spreadsheet.
Sometimes you just need to compromise.
We use Azure DevOps with Test Plans + some custom code to synchronize cucumber tests to ADO. I can describe how we’ve realized it in our projects:
We start with the cucumber file first. Each User Story has its own Feature file. The scenarios in the Feature are the acceptance criteria for the story. We end up with lots of Feature files, so we use naming conventions and folders to organize them.
We annotate the top of the Feature file with a tag for the User Story, e.g. @Story-1234.
We've written a command line utility that reads the cucumber files with these tags. It then fetches all the Test Suites in the Test Plan that are linked to Stories. In ADO, a story can only be linked to a single test suite. If a Test Suite doesn’t exist for that Story, our tool creates one.
For each Scenario, the tool creates an ADO Test Case and then annotates the Scenario with the Test Case ID. This creates amazing traceability for each User Story, as the related Test Cases are automatically linked to the Story in the Azure DevOps UI.
Although we don’t do this, we could populate the TestCase with the step definitions from our cucumber Scenario. It’s a basic XML structure that describes the steps to take. This would be useful if we wanted to manually execute the test case using the Azure DevOps Test Case UI. Since we focus primarily on automation, we rely on the steps in our Feature files and our ADO Test Cases end up being symbolic links back to cucumber Scenarios.
Because our cucumber tests are written in C# (SpecFlow), we can get the full class name and method for the cucumber test code. Our tool is able to update the Azure DevOps Test Case with the automation details.
For any test case that isn’t ready for automation or must be done manually, we annotate the Scenario with an @ignore or @manual tag.
Using Azure DevOps Pipelines, we use the Visual Studio Test task to run our tests. The important point here is that we execute the Test Plan option. This option fetches the Test Cases in the Test Plan that have automation and then executes the specific cucumber tests. The out-of-the-box functionality updates the Test Case statuses with the test results.
After running through automation, we use the Test Plan Report in Azure DevOps, which shows the Test Case execution status over time and can distinguish between automated and manual test cases.
We execute any remaining manual test cases to complete the Test Plan.
For us, we often found that the manual cases that cannot be automated are exception cases, or cases that depend on external environment (for example malformed data, network connection not available, maintenance, first time guide...). These cases require special setup to simulate the environment when they happen.
Ideally, I believe it is possible to cover everything, provided you are prepared to go as far as it takes. In reality, though, it most often takes so much effort that we prefer the hybrid approach of mixed manual and automated test cases. We do, however, try to convert those exception cases to automated ones over time, by setting up a separate environment to simulate the exception cases and writing automated tests against it.
Nevertheless, even with that effort, there would be cases when it's impossible to simulate, and I believe they should be covered by technical tests from engineers.
You could use an approach similar to the following example:
http://concordion.org/Example.html
When you use a build or continuous integration system to track your test runs, you could add simple specifications / tests for your manual cases that contain a text comparison (e.g. "pass" or "fail"). Then you would need to update the spec after each manual test run, check it in, and start the tests in your build / continuous integration system. Then the manual results would be recorded together with the results of the automated test execution.
If you use a tool like Concordion+ (https://code.google.com/p/concordion-plus/), you could even write a summary specification, which could contain scenarios for each of your manual tests. Each one would be reported as an individual test result in your test execution environment.
Cheers
Taking screenshots seems to be a good idea. You can still automate the verification, but you will need to go the extra mile. For instance, when using Selenium you can add Sikuli to compare the resulting images (note that you can't run the tests headless), or take a screenshot with Robot (java.awt), use OCR to read the text, and assert or verify it with TestNG.

How to perform integration tests with multiple steps

I've read a lot of questions about multiple asserts in tests. Some are against it and some think it's OK. But I'm starting to wonder how I should do it with longer tests that have many steps.
For example this test with an Android device:
Start wifi
Install app
Uninstall app
Stop wifi
run test a couple of times
As I want to run it multiple times and always in this order it has to be a single test(?). So then I'm forced to do four asserts on the way:
Check that wifi is on.
Check that the app got installed.
Check that the app got uninstalled.
Check that wifi is off.
test is OK
Is this wrong or ugly? I don't see how I could get away from it without splitting up the test and as I see it as a single test case it also seems wrong.
From what I understand from the description: yes, this is wrong because of this part
always in this order
A good unit test is isolated (not dependent on other tests) and its results do not depend on a particular order of execution. This is important because many frameworks make no guarantee about the order of execution.
I think you can split that test up in multiple tests. Keep in mind that in order to test something you might have to change the state prior to it (which is what you do with starting/stopping WIFI) so this is something hard to overcome.
This could be your layout of tests:
StartWifi
StopWifi
InstallApp_WithWifiStarted_InstallsSuccessfully
InstallApp_WithoutWifiStarted_AbortsInstallation
and continue like this for uninstall (I'm not sure what the requirements for that are).
With these tests you will now have knowledge of the following:
The wifi service can be started
The wifi service can be stopped
Installing the app with wifi works
Installing the app without wifi doesn't work
Whereas with your single test you could only deduce from a failure that something went wrong throughout the line but it's unclear where. The problem could have been located at
Starting wifi
Installing app
Uninstalling app
Stopping wifi
With separate, smaller tests you can rule out the ones that aren't applicable because they work themselves.
[At this point I notice you changed the tag from unit-testing to integration-testing]
It's important to note though that what you do isn't bad per se: larger units are good to test as well although, as you indicate yourself, this is where you're getting close to integration testing.
It's important that you use unit-testing and integration-testing as a complementary testing method: by having these smaller unit tests and your bigger integration test, you can verify that the smaller parts work and that the combination of them works.
Conclusion: yes, having several asserts in your test is okay but make sure you also have smaller tests to test the independent units.
Yes, it's fine to use multiple asserts in a single test. Your test is an integration test and it looks like an acceptance test, and it is normal for those (which exercise a big part of the system) to have many assertions. There should only be one block of assertions, however.
To illustrate that, here are the four tests I think you need to test the functionality you're testing (considering only happy paths):
Test that the wifi can be turned on:
  1. Turn the wifi on.
  2. Assert that the wifi is on.
  3. Turn the wifi off.
Test that the wifi can be turned off:
  1. Turn the wifi on.
  2. Turn the wifi off.
  3. Assert that the wifi is off.
Test that the application can be installed:
  1. Turn the wifi on.
  2. Install the application.
  3. Assert that the application is installed.
  4. Turn the wifi off.
  5. Uninstall the application (if you need to do that to clean up).
Test that the application can be uninstalled:
  1. Turn the wifi on.
  2. Install the application.
  3. Uninstall the application.
  4. Assert that the application is uninstalled.
  5. Turn the wifi off.
Each test tests only one action. It might take multiple language-level assertions to test that that action did everything it was supposed to; that's fine. The point is that there's only one block of assertions, and it's at the end of the test (not counting cleanup steps). Tests that need setup code don't need to assert anything about whether that setup code succeeded; that was already done in another test. Likewise, actions that are used in cleanup steps (the steps which follow the assertions) are tested in one place and don't need to be tested again when they're used for cleanup. Each action is tested in one place. The result is that you only need to read one test to find out how a piece of functionality should behave, and you're more likely to need to change only one test if the way that functionality should behave changes.
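The install and uninstall tests could be sketched in Python's unittest roughly as follows. FakeDevice is a hypothetical stand-in for whatever drives the real Android device, and the app name is invented; the point is only the shape: setup without assertions, one block of assertions at the end, and cleanup registered separately.

```python
import unittest


class FakeDevice:
    """Hypothetical stand-in for a real Android device driver."""

    def __init__(self):
        self.wifi_on = False
        self.installed = set()

    def start_wifi(self):
        self.wifi_on = True

    def stop_wifi(self):
        self.wifi_on = False

    def install(self, app):
        if not self.wifi_on:
            raise RuntimeError("wifi is required to install")
        self.installed.add(app)

    def uninstall(self, app):
        self.installed.discard(app)


class AppInstallTests(unittest.TestCase):
    def setUp(self):
        # Setup is not asserted here; turning wifi on has its own test elsewhere.
        self.device = FakeDevice()
        self.device.start_wifi()
        self.addCleanup(self.device.stop_wifi)

    def test_app_can_be_installed(self):
        self.device.install("demo.apk")
        self.addCleanup(self.device.uninstall, "demo.apk")
        # The single block of assertions, at the end of the test.
        self.assertIn("demo.apk", self.device.installed)

    def test_app_can_be_uninstalled(self):
        self.device.install("demo.apk")
        self.device.uninstall("demo.apk")
        self.assertNotIn("demo.apk", self.device.installed)
```

Saved as a module, this can be run with python -m unittest. Each test builds its own device state, so the tests do not depend on execution order.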

How should feature toggles be set in tests run in continuous integration?

How does one go about testing when using feature toggles? You want your development computer to be as close to production as possible. From videos I watched, feature toggles are implemented in a way to allow certain people to "use" the feature (i.e., 0 to 100 % of users, or selected users, etc.).
To do continuous integration correctly, would you have to use the same feature toggle settings as the production servers when it comes to testing? Or better yet, if the feature is not off on production, make sure it's on when it comes to running automated tests in the build pipeline? Do you end up putting feature toggles in your testing code, or write tests in a new file? When is the new feature a mandatory step in a process that must occur for system tests?
In a team of more than a few people that uses feature toggles routinely, it's impractical to do combinatorial testing of all toggles or even to plan testing of combinations of toggles that are expected to interact. A practical strategy for testing toggled code has to work for a single toggle without considering the states of the other toggles. I've seen the following process work fairly well:
Because we move all code to production as soon as possible, when a toggle is initially introduced into a project, new tests are written to cover all toggled code with the toggle on. Because we test thoroughly, tests for that code with the toggle off already exist; those tests are changed so that the toggle is explicitly off. Toggled code can be developed behind the toggle as long as is necessary.
Immediately before the toggle is turned on in production, all tests (not just the tests of the toggled code, but the application's entire test suite) are run with the toggle on. This catches any breakage due to unforeseen interactions with other features.
The toggle is turned on in production
The toggle is removed from the code (along with any code that is active only when the toggle is off) and the code is deployed to production
This process applies both to cases where a toggle only hides completely new functionality (so that there is no code that runs only when the toggle is off) and to cases where a toggle selects between two or more versions of the code, like a split test.
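As a minimal sketch of tests that pin a toggle to an explicit state, consider the following Python example. The in-memory TOGGLES store, the toggle name, and the checkout logic are all hypothetical; a real project would read toggles from its own configuration or toggle service.

```python
import contextlib
import unittest

# Hypothetical in-memory toggle store; a real project would read this from config.
TOGGLES = {"new_checkout": False}


@contextlib.contextmanager
def toggle(name, state):
    """Force a toggle to a known state for the duration of a test, then restore it."""
    previous = TOGGLES[name]
    TOGGLES[name] = state
    try:
        yield
    finally:
        TOGGLES[name] = previous


def checkout_total(prices):
    """Toggled code: the new path takes 10 off orders over 100."""
    total = sum(prices)
    if TOGGLES["new_checkout"] and total > 100:
        return total - 10
    return total


class CheckoutTests(unittest.TestCase):
    def test_total_with_toggle_off(self):
        # Existing behaviour, now with the toggle explicitly off.
        with toggle("new_checkout", False):
            self.assertEqual(checkout_total([60, 60]), 120)

    def test_total_with_toggle_on(self):
        # New test covering the toggled code with the toggle on.
        with toggle("new_checkout", True):
            self.assertEqual(checkout_total([60, 60]), 110)
```

Because every test states the toggle state it relies on, the suite gives the same result regardless of the default. Running the whole suite with the toggle on before release would then mean changing only that default, for instance through an environment variable, which is an assumption here and not something the process above prescribes.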
To answer a couple of your specific points:
Whether the tests of different toggled states go in the same file or a different file depends on the size of the toggled feature. If it's a small change, it's easiest to keep both versions in the same file. If it's a complete rewrite of a major feature, it's probably easier to have one or more new test files devoted to the new state of the toggle. The number of files affected by the toggle also depends on your test architecture. For example, in a Rails project that uses RSpec and Cucumber a toggle might require new tests in Cucumber features (acceptance/integration tests), routing specs, controller specs, and model specs, and, again, tests of the toggle at each level might be in the same file or a different file depending on the size of the toggled feature.
By "system tests" I think you mean manual tests. I prefer to not have those. Instead, I automate all tests, including acceptance tests, so what I wrote above applies to all tests. Leaving that aside, the new state of the toggle becomes law once temporarily when we run all the tests with the toggle on before turning it on in production, and then permanently when we remove the toggle.