Executing Feature in isolation, but contained Scenarios in parallel - karate

I have a large-ish and rapidly growing body of karate tests and use the parallel execution to speed up processing, which basically works great in different configurations:
Normal parallel execution (vast majority of tests)
Sequential execution of Scenarios within a Feature (parallel=false) for very few special cases
Completely sequential execution (via a separate single-threaded runner, triggered by a custom #sequential tag) for things that modify configuration settings, global lookups, etc.
There's however also a parameterized (Scenario Outline) feature for the basic functionality of many types of global lookups. Currently it runs in the "completely sequential" mode because it affects other tests. But actually the scenarios inside that feature could be executed in parallel (they don't affect each other) as long as the Feature as a whole is executed in isolation (because the tests do affect other Features).
So - is there a way to implement "sequential Features with parallel Scenarios" execution? I admit that this is likely a niche case, but it would speed up tests execution quite a bit in my case.

... and posting this question already got the ideas flowing and pointed me to a possible way to implement this:
private static void runLocalParallel(Builder<?> builder) {
    final List<Feature> features = builder.tags("#local_parallel").resolveAll();
    for (Feature feature : features) {
        builder.features(feature).parallel(8);
    }
}
This identifies all features tagged with #local_parallel, iterates over them, and executes a parallel runner for each one individually. Result handling, report output, etc. still need to be implemented in an elegant manner, but that's doable as well.
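For the result handling, one rough option (assuming the builder methods from the snippet above, plus Karate's Results object that parallel() returns, as I understand the API) is to accumulate the fail counts of the isolated runs and assert on the total at the end:

import com.intuit.karate.Results;
import com.intuit.karate.core.Feature;
import java.util.List;
import static org.junit.jupiter.api.Assertions.assertEquals;

// same structure as above, but keeping the Results of each isolated run
private static void runLocalParallel(Builder<?> builder) {
    final List<Feature> features = builder.tags("#local_parallel").resolveAll();
    int failCount = 0;
    for (Feature feature : features) {
        Results results = builder.features(feature).parallel(8);
        failCount += results.getFailCount();
    }
    // fail the surrounding JUnit 5 test if any of the isolated runs reported failures
    assertEquals(0, failCount, "failures in #local_parallel features");
}

Merging the per-run HTML reports into a single report would still need extra work on top of this.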

Yes, indeed an edge case - but it has come up a few times. We've wondered about a way to "bucketize" threads, which means we could do things like say that certain tags have to be run only on a particular thread. Come to think of it, that's a good feature request, so I opened one; feel free to comment. https://github.com/karatelabs/karate/issues/2235
In theory if you write some Java glue code that holds a lock, you can call that code before entering any "critical" feature. I haven't tried it, but may be worth experimenting.
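A minimal sketch of that lock idea, assuming a hypothetical GlobalLock helper class; a feature could grab it via Karate's Java interop (Java.type) in its Background and release it once the feature is done, though I haven't verified how this plays with Karate's thread pool:

import java.util.concurrent.Semaphore;

public class GlobalLock {
    // a Semaphore rather than a ReentrantLock, so release() may happen on a different thread
    private static final Semaphore LOCK = new Semaphore(1);

    // call before entering a "critical" feature
    public static void acquire() {
        LOCK.acquireUninterruptibly();
    }

    // call once the feature is done so other features can proceed
    public static void release() {
        LOCK.release();
    }
}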

Related

Test-Automation using MetaProgramming

I want to learn test automation using metaprogramming. I googled it but could not find anything. Can anybody suggest some resources where I can get info about how to use metaprogramming to make test automation easier?
That's a broad topic and not a lot has been written about it, because of the "dark corners" of metaprogramming.
What do you mean by "metaprogramming"?
As background, I consider metaprogramming to be any activity in which a tool (which we call a "metaprogramming tool") is used to inspect or modify the application software to achieve some effect.
Many people consider "reflection" to be a kind of metaprogramming; other consider (C++-style) templates to be metaprogramming; some suggest aspect-oriented programming.
I sort of agree, but think these are weak versions of what you want, because each has severe limits on what it can see or do to source code. What you really want is a metaprogramming tool that has access to everything in your source program (yes, comments too!). Such tools are called Program Transformation Systems (PTS); they work by parsing the source code and operating on the parsed representation of the program. (I happen to build one of these; see my bio.) PTSes can then analyze the code accurately, and/or make reliable changes to the code and regenerate valid source with the changes. PS: a PTS can implement all those other metaprogramming techniques as special cases, so it is strictly more general.
Where can you use metaprogramming for testing?
There are at least three areas in which metaprogramming might play a role:
1) Collection of information from tests
2) Generation of tests
3) Avoidance of tests
Collection.
Collection of test results depends on the nature of the tests. Many tests are focused on the question "is this white/black box functioning correctly?" Assuming the tests are written somehow, they have to have access to the box under test, be able to invoke that box in realistic ways, determine whether the result is correct, and often tabulate the results so that post-testing quality assessments can be made.
Access is the first problem. The black box to be tested may not be easily accessible to a testing framework: driven by a UI event, in a non-public routine, or buried deep inside another function where it is hard to get at.
You may need metaprogramming to "temporarily" modify the program to provide access to the box that needs testing (e.g., change a Private method to Public so it can be called from outside). Such changes exist only for the duration of the test project; you throw the modified program away because nobody wants it for anything but the test results. Yes, you have to ensure that the code transformations applied to make things visible don't change the program functionality.
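As a lightweight illustration of the access problem, reflection (one of the weaker forms of metaprogramming mentioned above, not a full PTS) can temporarily open up a private method for a test harness; the class and method names here are made up:

import java.lang.reflect.Method;

public class PrivateAccessExample {
    public static void main(String[] args) throws Exception {
        OrderValidator validator = new OrderValidator();
        // make the non-public routine callable from the test harness
        Method check = OrderValidator.class.getDeclaredMethod("isWithinCreditLimit", int.class);
        check.setAccessible(true);
        boolean ok = (boolean) check.invoke(validator, 500);
        System.out.println("isWithinCreditLimit(500) = " + ok);
    }
}

// hypothetical class under test with a non-public routine
class OrderValidator {
    private boolean isWithinCreditLimit(int amount) {
        return amount < 1000;
    }
}

A PTS could instead rewrite the declaration itself (Private to Public) and regenerate the source, which also works for languages without runtime reflection.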
The second problem is exercising the targeted black box in a realistic environment. Each code module runs in a world in which it assumes data and the environment are "properly" configured. The test program can set up that world explicitly by making calls on lots of the program elements or using its own custom code; this is usually the bulk of a test routine, and this code is hard to write and fragile (the application under test keeps changing; so do its assumptions about the world). One might use metaprogramming to instrument the application to collect the environment under which a test might need to run, thus avoiding the problem of writing all the setup code.
Finally, one might want to record more than just "test failed/passed". Often it is useful to know exactly what code got tested ("test coverage"). One can instrument the application to collect what-got-executed data; here's how to do it for code blocks using a PTS: http://www.semdesigns.com/Company/Publications/TestCoverage.pdf More sophisticated instrumentation might be used to capture information about which paths through the code have been executed. Uncovered code and/or uncovered paths show where tests have not been applied; for that code you arguably know nothing about what it does, let alone whether it is buggy.
Generation of tests
Someone/thing has to produce tests; we've already discussed how to produce the set-up-the-environment part. What about the functional part?
Under the assumption that the program has been debugged (e.g., already tested by hand and fixed), one could use metaprogramming to instrument the code to capture the results of executing a black box (e.g., instance execution post-conditions). By exercising the program, one can then produce (by definition) correct results which can be transformed into a test. In this way, one might construct a huge variety of regression tests for an existing program; these will be valuable in verifying that further enhancements to the program don't break most of its functionality.
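A very simplified illustration of that idea, using a hand-written driver instead of automated instrumentation (a PTS would inject the equivalent recording code for you); the Pricer class is hypothetical:

// hypothetical, already-debugged box under test
class Pricer {
    static int price(int quantity) {
        return quantity < 10 ? quantity + 1 : quantity * quantity;
    }
}

public class RecordingHarness {
    public static void main(String[] args) {
        // exercise the code and capture input/output pairs
        for (int input : new int[] {3, 9, 10, 25}) {
            int observed = Pricer.price(input);
            // emit a regression test that pins down today's (assumed correct) behaviour
            System.out.printf("assertEquals(%d, Pricer.price(%d));%n", observed, input);
        }
    }
}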
Often a function has qualitatively different behaviors on different ranges of input (e.g., for x<10 it produces x+1, else it produces x*x). Ideally one would like to provide a test for each qualitatively different result (e.g., x<10, x>=10), which means one would like to partition the input ranges. Metaprogramming can help here, too, by enumerating all (partial) paths through a module, and providing the predicate that controls each path.
The separate predicates each represent the input space partition of interest.
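For the x<10 / x>=10 example, one test per path predicate might look like this (JUnit 5 assumed, reusing the hypothetical Pricer from the sketch above):

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class PricerPartitionTest {
    // partition 1: the x < 10 path
    @Test
    void smallInputTakesAdditionBranch() {
        assertEquals(6, Pricer.price(5));
    }

    // partition 2: the x >= 10 path, exercised at the boundary
    @Test
    void boundaryInputTakesSquareBranch() {
        assertEquals(100, Pricer.price(10));
    }
}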
Avoidance of Tests
One only tests code one does not trust (surely you aren't testing the JDK?). Any code constructed by a reliable method doesn't need tests (the JDK was constructed this way, or at least Oracle is happy to have you believe it).
Metaprogramming can be used to automatically generate code from specifications or DSLs, in reliable ways. Such generated code is correct-by-construction (we can argue about the degree of rigour), and doesn't need tests. You might need to test that the DSL expression achieves the functionality you desired, but you don't have to worry about whether the generated code is right.

Design pattern for dealing with reuse of data in scenarios (BDD)

I would like your suggestion for my scenario:
I am implementing automated tests using bdd technique with Cucumber and Selenium WebDriver tools, what is currently happening is: a lot of scenarios depend of data of each other, so right now I am storing these data in the class I define the steps, so I can use in other scenarios.
But, as the application grows, and the more scenario I get, the more mess my application gets.
Do you have any design pattern, or solution I could use in this case?
As you say, scenarios that depend on data from other scenarios get complicated and messy. The order of execution becomes important.
What would happen if you executed the scenarios in a random order? How would that affect you?
My approach would be to work hard on making each scenario independent of the others. If you have a flow like placing an order, which is required for preparing a shipment, which is required for creating an invoice, and so on, then I would make sure that the state of the application was set correctly before each scenario. That is, executing code that creates the desired state.
That was a complicated way to say that, in order to create an invoice, I must first set the application state so that it has prepared a shipment. And possibly other things as well.
I would work hard on setting the application to a known state before any scenario is executed. If that means cleaning the database, then I would do that. The goal would be to be able to execute each scenario in isolation.
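A minimal sketch of that reset, assuming Cucumber-JVM hooks and a hypothetical TestDatabase helper:

import io.cucumber.java.Before;

public class ResetHooks {
    @Before
    public void resetApplicationState() {
        // bring the system to a known state before every scenario
        TestDatabase.truncateAllTables();    // hypothetical helper
        TestDatabase.loadBaselineFixtures(); // reference data every scenario can rely on
    }
}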
Different pieces of functionality in your system may build on each other. That doesn't mean that the scenarios you use to check that your application still works should build on each other during their execution.
Not sure if this qualifies as a pattern, but it may be a direction to strive for.
Overall you want to be looking at Separation of Concerns and the Single Responsibility Principle.
At the Cucumber level, have two 'layers' of responsibility: the Test Script (Feature File + Step Implementation) and a model of the system under test. The Step Implementation maps straight onto the model; its single purpose is binding feature steps to methods. The Model's purpose is to represent the state of the system under test, which includes state persistence. The model should expose its interface in a declarative style rather than an imperative one, so that we see fooPage.login(); in preference to page.click('login');
On the Selenium WebDriver side of things, use the Page Object Model. It is these reusable objects that understand the semantics of representing a page, and they would be a third layer.
Layers
- Test Script (Feature File + Java Steps)
- Model of SUT (which persists the state)
- Page Object Model -> WebDriver/Browser
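A rough sketch of that layering, with hypothetical names (a LoginSteps class binding to a SutModel, which drives a LoginPage page object):

import io.cucumber.java.en.Given;

// layer 1: step implementation; its only job is binding feature steps to the model
public class LoginSteps {
    private final SutModel model = new SutModel();

    @Given("I am logged in as {string}")
    public void iAmLoggedInAs(String user) {
        model.loginAs(user);
    }
}

// layer 2: model of the system under test; declarative interface, holds state
class SutModel {
    private String currentUser;

    void loginAs(String user) {
        new LoginPage().login(user, "secret"); // delegates to layer 3
        this.currentUser = user;
    }
}

// layer 3: page object; the only place that knows about WebDriver details
class LoginPage {
    void login(String user, String password) {
        // driver.findElement(...) calls would live here
    }
}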
As already pointed out, try to isolate test scenarios from each other regarding data.
Just a few approaches:
Either cleaning the database or restoring the original data before each test scenario is executed will do the job; however, that can slow the tests down significantly. If the cleaning action takes around 10 seconds, that adds ~15 minutes for 100 tests, and ~3 hours for 1000 tests.
Alternatively, each test could generate and use its own data. The problem here is that many tests could really use the same data, in which case it makes little sense to create those data over and over again, not to mention that this also takes time.
Yet another option is to distinguish between read-only tests and read-write tests. The former can use the default data, as they are not affected by data dependencies. The latter should deal with specific data to avoid running into conflicts with other test scenarios.
Still, step definitions within a test scenario are likely to depend on the state left by the previous step definitions executed as part of that test scenario, so some state management is still required. You may need some helper object in your Model.
Keep in mind that Steps classes are instantiated for every test scenario, along with the objects created from them. Thus, private instance attributes won't work unless all the steps used by a test scenario are implemented in the same Steps class. Otherwise, think about static variables or try a dependency injection framework.
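For example, with the cucumber-picocontainer module (an assumption about your setup), a shared context object is constructor-injected into every Steps class, and a fresh instance is created for each scenario:

// plain holder for state shared between step classes within one scenario
public class ScenarioContext {
    public String createdOrderId;
}

public class OrderSteps {
    private final ScenarioContext context;

    // picocontainer injects the same ScenarioContext instance for the whole scenario
    public OrderSteps(ScenarioContext context) {
        this.context = context;
    }
    // ... steps that create an order store context.createdOrderId here ...
}

public class InvoiceSteps {
    private final ScenarioContext context;

    public InvoiceSteps(ScenarioContext context) {
        this.context = context;
    }
    // ... steps that build the invoice read context.createdOrderId here ...
}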

What is the practical benefit to putting t.Parallel() at the top of my tests?

The Go testing package defines a Parallel() function:
Parallel signals that this test is to be run in parallel with (and only with) other parallel tests.
However when I searched the tests written for the standard library, I found only a few uses of this function.
My tests are pretty fast, and generally don't rely on mutating shared state, so I've been adding this, figuring it would lead to a speedup. But the fact it's not used in the standard library gives me pause. What is the practical benefit to adding t.Parallel() to your tests?
This thread (in which t.Parallel was conceived and discussed) indicates that t.Parallel() is intended to be used for slow tests only; average tests are so fast that any gain from parallel execution would be negligible.
Here are some quotes (only from Russ, but there wasn't much opposition):
Russ Cox [link]:
There is some question about what the right default is. Most of our tests run so fast that parallelizing them is unnecessary. I think that's probably the right model, which would suggest parallelizing is the exception, not the rule. As an exception, this can be accommodated by having a t.Parallel() method that a test can call to declare that it is okay to run in parallel with other tests.
Russ Cox again [link]:
the non-parallel tests should be fast. they're inconsequential.
if two tests are slow, then you can t.Parallel them if it bothers you.
if one test is slow, well, one test is slow.
This seems to have been brought up first on the golang-dev group.
The initial request states:
"I'd like to add an option to run tests in parallel with gotest.
My motivation comes from running selenium tests where each test is pretty much independent from each other but they take time."
The thread contains the discussion of the practical benefits.
It's really just for allowing you to run unrelated, long-running tests at the same time. It's not really used in the standard library as almost all of the functionality needs to be as fast as possible (some crypto exceptions etc.)
There was further discussion here and the commit is here

Handling test data when going from running Selenium tests in series to parallel

I'd like to start running my existing Selenium tests in parallel, but I'm having trouble deciding on the best approach due to the way my current tests are written.
The first step of most of my tests is to get the DB into a clean state and then populate it with the data needed for the rest of the test. While this is great for isolating tests from each other, if I start running these same Selenium tests in parallel on the same SUT, they'll end up erasing other tests' data.
After much digging, I haven't been able to find any guidance or best-practices on how to deal with this situation. I've thought of a few ideas, but none have struck me as particularly awesome:
Rewrite the tests to not overwrite other tests' data, i.e. only add test data, never erase -- I could see this potentially leading to unexpected failures due to the variability of the database when each test is run. Anything from a different ordering of tests to an ill-placed failure could throw off the other tests. This just feels wrong.
Don't pre-populate the database -- Instead, create all needed data via Selenium itself. This would most replicate real-world usage, but would also take significantly longer than loading data directly into the database. This would probably negate any benefits from parallelization depending on how much test data each test case needs.
Have each Selenium node test a different copy of the SUT -- This way, each test would be free to do as it pleases with the database, since we can assume that no other test is touching it at the same time. The downside is that I'd need to have multiple databases set up and, at the start of each test case, figure out how to coordinate which database to initialize and how to signal to the node and SUT that this particular test case should be using this particular database. Not awful, but not what I would love to do if there's a better way.
Have each Selenium node test a different copy of the SUT, but break up the tests into distinct suites, one suite per node, before run-time -- Also viable, but not as flexible, since over time you'd want to keep going back and evening out the length of each suite as much as possible.
All in all, none of these seem like clear winners. Option 3 seems the most reasonable, but I also have doubts about whether that is even a feasible approach. After researching a bit, it looks like I'll need to write a custom test runner to facilitate running the tests in parallel anyways, but the parts regarding the initial test data still have me looking for a better way.
Anyone have any better ways of handling database initialization when running Selenium tests in parallel?
FWIW, the app and tests suite is in PHP/PHPUnit.
Update
Since it sounds like the answer I'm looking for is very project-dependent, I'm at least going to attempt to come up with my own solution and report back with my findings.
There's no easy answer and it looks like you've thought out most of it. Also worth considering is to rewrite the tests to use separately partitioned data - this may or may not work depending on your domain (e.g. a separate bank account per node, if it's a banking app). Your pre-population of the DB could be restricted to static reference data, or you could pre-populate the data for each separate 'account'. Again, depends on how easy this is to do for your data.
I'm inclined to vote for 3, though, because database setup is relatively easy to script these days and the hardware requirements probably aren't too high for a small test data suite.
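If the partitioned-data route fits your domain, the core idea is just to give every test (and every node) its own slice of data so nothing collides. A sketch of that idea, shown in Java only for consistency with the other snippets here (your suite being PHP/PHPUnit, the equivalent would be a small factory class there), with a hypothetical NODE_ID environment variable set per Selenium node:

import java.util.UUID;

public class TestDataFactory {
    // every test gets an identifier no other test (or node) will touch
    public static String uniqueAccountName(String testName) {
        String node = System.getenv().getOrDefault("NODE_ID", "local");
        return testName + "-" + node + "-" + UUID.randomUUID();
    }
}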

New WebDriver instance per test method?

What's the best practice for creating WebDriver instances in Selenium WebDriver? Once per test method, per test class, or per test run?
They seem to be rather (very!) expensive to spin up, but keeping it open between tests risks leaking information between test methods.
Or is there an alternative - is a single webdriver instance a single browser window (excluding popups), or is there a method for starting a new window/session from a given driver instance?
Thanks
Matt
I've found that reusing browser instances between test methods has been a huge time saver when using real browsers, e.g. Firefox. When running tests with HtmlUnitDriver, there is very little benefit.
Regarding the danger of indeterministic tests, it's a trade-off between totally deterministic tests and your time. Integration tests often involve trade-offs like these. If you want totally deterministic integration tests you should also be worrying about clearing the database/server state in between test runs.
One thing that you definitely should do if you are going to reuse browser instances is to clear/store the cookies between runs.
driver.manage().deleteAllCookies();
I do that in a tearDown() method. Also, if your application stores any data on the client side, you'd need to clear that too (maybe via JavascriptExecutor). To the application under test, it should look like a completely unrelated request after doing this, which really minimizes the risk of indeterministic behaviour.
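A sketch of such a tearDown, assuming JUnit 4 and that the application only keeps client-side state in cookies and web storage (anything else, e.g. IndexedDB, would need its own clean-up):

import org.junit.After;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;

public class BrowserReuseTest {
    private static WebDriver driver; // created once and reused across test methods

    @After
    public void tearDown() {
        // drop the cookies so the server forgets who we were
        driver.manage().deleteAllCookies();
        // wipe client-side storage so the next test looks like a fresh visitor
        ((JavascriptExecutor) driver).executeScript(
                "window.localStorage.clear(); window.sessionStorage.clear();");
    }
}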
If your goal of automated integration testing is to have reproducible tests, then I would recommend a new webdriver instance for every test execution.
Each test should stand alone, independent from any other test or side-effects.
Personally the only thing I find more frustrating than a hard to reproduce bug, is a non-deterministic test that you don't trust.
(This becomes even more crucial for managing the test data itself, particularly when you look at tests which can modify persistent application state, like CRUD operations.)
Yes, additional test execution time is costly, but it is better than spending the time debugging your tests.
One way to help offset this penalty is to roll your testing directly into your build process, going beyond Continuous Build to a Continuous Integration approach.
Also try to limit the scope of your integration tests. If you have a lot of heavy integration tests eating up execution time, try to refactor: increase the coverage of your more lightweight unit tests of the underlying service calls (where your business logic is) instead.