Test Case Optimization - minimizing the number of tests based on similarity

I have several test cases that I want to optimize with a similarity-based test case selection method using the Jaccard matrix. The first step is to choose the pair with the highest similarity index, keep one as a candidate, and remove the other.
My question is: based on which strategy do you choose which of the two most similar test cases to remove? Size? Test coverage? Or something else? For example, here TC1 and TC10 have the highest similarity; which one would you remove, and why?

It depends on why you're doing this, and a static code metric can only give you suggestions.
If you're trying to make the tests more maintainable, look for repeated test code and extract it into shared code. Two big examples are test setup and expectations.
For example, if you tend to do the same setup over and over again, you can extract it into fixtures or test factories. You can share the setup using something like setup/teardown methods and shared contexts.
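A minimal sketch of that idea in pytest terms (the Order class and the order fixture are invented for illustration):
import pytest

class Order:
    """Invented object under test."""
    def __init__(self, customer):
        self.customer = customer
        self.items = []

    def add_item(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)

# Shared setup extracted into one fixture instead of repeating it in every test.
@pytest.fixture
def order():
    return Order(customer="alice")

def test_total_starts_at_zero(order):
    assert order.total() == 0

def test_total_sums_items(order):
    order.add_item("book", 10)
    order.add_item("pen", 2)
    assert order.total() == 12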
If you find yourself writing the same sort of code over and over again to test a condition, extract that into something shared: a test method, a matcher, or a shared example. If possible, replace custom test code with assertions from an existing library.
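A tiny sketch of that kind of shared expectation, reusing the order fixture from the sketch above (the helper name is made up):
# Repeated expectations pulled into one shared helper.
def assert_valid_order(order):
    assert order.customer
    assert order.total() >= 0
    assert all(price >= 0 for _, price in order.items)

def test_new_order_is_valid(order):
    assert_valid_order(order)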
Another metric is to find tests which are testing too much. If unit A calls units B and C, the test might do the setup for and testing of B and C. This can be a lot of extra work, makes the test more complex, and makes the units interdependent.
Since all unit A cares about is whether its particular calls to B and C work, consider replacing the calls to B and C with mocks. This can greatly simplify test setup, improve test performance, and reduce the scope of what you're testing.
However, be careful. If B or C changes, A might break, but the unit test for A won't catch that. You need to add integration tests for A, B, and C together. Fortunately, this integration test can be very simple: it can trust that A, B, and C work individually (they've been unit tested); it only has to test that they work together.
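A hedged sketch of both halves, using Python's unittest.mock (A, B, and C here are invented stand-ins, not your actual units):
from unittest.mock import Mock

class B:
    def fetch(self, key):
        return len(key)          # stands in for a slow lookup in the real thing

class C:
    def __init__(self):
        self.saved = []
    def store(self, value):
        self.saved.append(value)  # stands in for a slow write in the real thing

class A:
    def __init__(self, b, c):
        self.b, self.c = b, c
    def process(self, key):
        value = self.b.fetch(key)
        self.c.store(value)
        return value

# Unit test for A: B and C are replaced with mocks, so only A's own logic
# and its calls into B and C are exercised.
def test_process_fetches_then_stores():
    b, c = Mock(), Mock()
    b.fetch.return_value = 42
    a = A(b, c)
    assert a.process("some-key") == 42
    b.fetch.assert_called_once_with("some-key")
    c.store.assert_called_once_with(42)

# Simple integration test: trusts the unit tests and only checks the wiring.
def test_a_b_c_work_together():
    c = C()
    assert A(B(), c).process("key") == 3
    assert c.saved == [3]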
If you're doing it to make the tests faster, first profile the tests to determine why they're slow.
Consider if parallelization would help. Consider if the code itself is too slow. Consider if the code is too interdependent and has to do too much work just to test one thing. Consider if you really have to write everything to a slow resource such as a disk or database, or if doing it in memory sometimes would be ok.
It's deceptive to consider tests redundant because they have similar code. Similar tests might test very, very different branches of the code. For example, something as simple as passing in 1 vs 1.1 to the same function might call totally different classes, one for integers and one for floats.
Instead, find redundancy by looking for similarity in what the tests cover. There's no magic percentage of similarity that makes tests redundant; you'll have to judge for yourself.
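As a rough illustration, comparing what two tests cover could be as simple as a Jaccard index over per-test coverage sets (the coverage data below is invented; it would come from whatever coverage tool you use):
def jaccard(cov_a, cov_b):
    """Jaccard index of two coverage sets, e.g. sets of (file, line) pairs."""
    a, b = set(cov_a), set(cov_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical coverage data: which lines each test executed.
coverage = {
    "TC1":  {("calc.py", 10), ("calc.py", 11), ("calc.py", 12)},
    "TC10": {("calc.py", 10), ("calc.py", 11), ("calc.py", 13)},
}
print(jaccard(coverage["TC1"], coverage["TC10"]))  # 0.5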
Part of why there's no magic number: just because a line of code is covered doesn't mean it is tested. For example...
def test_method
  call_a                    # executed, so coverage marks it as "covered"
  assert_equal(42, call_b)  # only call_b's result is actually asserted
end
Here call_a is covered, but it is not tested. Test coverage only tells you what has NOT been tested; it CANNOT tell you what HAS been tested.
Finally, test coverage redundancy is good. Unit tests, integration tests, acceptance tests, and regression tests can all cover the same code, but from different points of view.
All static analysis can offer you is candidates for redundancy. Remove redundant tests only if they're truly redundant, and only with a purpose. Tests which simply overlap might serve as regression tests. Many a time I've been saved when the unit tests passed but some incidental integration test failed. Overlap is good.

Related

How should an automation script for 100 test cases be written using Selenium WebDriver?

Kindly explain whether I should write only one Java file for all the test cases, or an individual Java file for each test case.
You don't give enough details to give a specific answer, so I'll try to put down some guiding principles. Those principles are just software design 101, so you might want to do some learning and reading in that direction.
The key question really is: how similar are your tests?
they vary just in the values
You really have just one test that you put in a loop to iterate through all the values (see the sketch below). Note that a behavior can also be a value; in that case you could use the Strategy Pattern.
they are variations of the same test idea
You probably want some classes representing the elements of the tests, for example TestSteps, which then get combined into tests. If the combining is really simple, it might be feasible to put it all in one class, but with 100 tests that is unlikely.
completely independent tests
You are probably better off putting them in different classes/files. But you'll probably still find lots of stuff to reuse (for example PageObjects), which should go into separate classes.
In the end, for 100 tests I would expect maybe 50 classes: many test classes containing 1-20 tests each that have a lot of stuff they share, plus a healthy dose of classes that encapsulate common functionality (PageObjects, Matchers, predefined TestSteps and so on).
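For the "they vary just in the values" case, here is the sketch mentioned above, in Python/pytest terms rather than Java purely to show the shape (shipping_cost is an invented example):
import pytest

def shipping_cost(weight_kg):
    """Invented function under test."""
    return 5 if weight_kg <= 1 else 5 + 2 * (weight_kg - 1)

# One test, driven by a table of values, instead of many near-identical tests.
@pytest.mark.parametrize("weight, expected", [
    (1, 5),
    (2, 7),
    (5, 13),
])
def test_shipping_cost(weight, expected):
    assert shipping_cost(weight) == expected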
According to one source, you should use one class per test case, but use inheritance between classes:
Think of each class as a test case, and focus it on a particular aspect (or component) of the system you're testing. This provides an easy way to add new test cases (simply create a new class) and modify and update existing tests (by removing/disabling test methods within a class). It can greatly help organize your test suites by allowing existing tests (e.g., individual methods) to be easily combined together.

What tools exist for managing a large suite of test programs?

I apologize if this has been answered before, but I'm having trouble finding a tool that fits my needs.
I have a few dozen test programs, but each one can be run with a large number of parameters. I need to be able to automatically run sweeps of many of the parameters across all or some of the test programs. I have my own set of tools for running an individual test, which I can't really change, but I'm looking for a tool that would manage the entire suite.
Thus far, I've used a home-grown script for this. The main problem I run across is that an individual test program might take 5-10 parameters, each with several values. Although it would be easy to write something that would just do a nested for loop and sweep over every parameter combination, the difficulty is that not every combination of parameters makes sense, and not every parameter makes sense for every test program. There is no general way (i.e., that works for all parameters) to codify what makes sense and what doesn't, so the solutions I've tried before involve enumerating each sensible case. Although the enumeration is done with a script, it still leads to a huge cross-product of test cases which is cumbersome to maintain. We also don't want to run the giant cross-product of cases every time, so I have other mechanisms to select subsets of it, which gets even more cumbersome to deal with.
I'm sure I'm not the first person to run into a problem like this. Are there any tools out there that could help with this kind of thing? Or even ideas for writing one?
Thanks.
Adding a clarification ---
For instance, if I have parameters A, B, and C that each represent a range of values from 1 to 10, I might have a restriction like: if A=3, then only odd values of B are relevant and C must be 7. The restrictions can generally be codified, but I haven't found a tool where I could specify something like that. As for a home-grown tool, I'd either have to enumerate the tuples of parameters (which is what I'm doing) or implement something quite sophisticated to be able to specify and understand constraints like that.
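To illustrate, that kind of restriction could be written as a predicate filtering the full cross-product; a rough sketch of the A=3 rule above:
from itertools import product

A_values = B_values = C_values = range(1, 11)

# Each constraint is a predicate over one combination; a combination is kept
# only if every constraint holds.
constraints = [
    lambda a, b, c: a != 3 or (b % 2 == 1 and c == 7),  # the A=3 rule above
]

combinations = [
    (a, b, c)
    for a, b, c in product(A_values, B_values, C_values)
    if all(rule(a, b, c) for rule in constraints)
]
print(len(combinations))  # 905 of the 1000 raw combinations survive this one rule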
We rolled our own; we have a whole test infrastructure. It manages the tests and has a number of built-in features that allow the tests to log results; the logs are managed by the test infrastructure and go into a searchable database for all kinds of report generation.
Each test has a class/structure with information about the test: the name of the test, the author, and a variety of other tags. When running a test suite you can run everything, or run only the tests with a certain tag. So if you want to test only SRAM, you can easily run just the tests tagged sram.
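Not our actual infrastructure, just a rough sketch of the tagging idea with invented names:
TESTS = []

def register(name, tags):
    """Attach metadata to a test and put it in the suite registry."""
    def wrap(func):
        TESTS.append({"name": name, "tags": set(tags), "run": func})
        return func
    return wrap

@register("sram_walking_ones", tags=["sram", "memory"])
def test_sram_walking_ones():
    return "pass"

@register("uart_loopback", tags=["uart"])
def test_uart_loopback():
    return "pass"

def run_suite(tag=None):
    for test in TESTS:
        if tag is None or tag in test["tags"]:
            print(test["name"], test["run"]())

run_suite("sram")   # runs only the tests tagged "sram"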
Our tests are all considered either pass or fail. The pass/fail criteria are determined by the author of the individual test, but the infrastructure wants to see either pass or fail. You need to define what your possible results are: as simple as pass/fail, or you might want to add pass-and-keep-going, pass-but-stop-testing, fail-but-keep-going, and fail-and-stop-testing. "Stop testing" means that if there are 20 tests scheduled and test 5 fails, then you stop; you don't go on to test 6.
You need a mechanism to order the tests. It could be alphabetical, but it might benefit from a priority scheme (you must perform the power-on test before performing a test that requires the power to be on). It may also benefit from a random ordering: some tests may be passing due to dumb luck because a test before them made something work; remove that prior test and this test fails. Or vice versa: this test passes until it is preceded by a specific test, and those two don't get along in that order.
To shorten my answer: I don't know of an existing infrastructure, but I have built my own and worked with home-built ones that were tailored to our business/lab/process. You won't hit a home run the first time; don't expect to. But try to predict a manageable set of rules for individual tests: how many types of pass/fail return values a test can return, the types of filters you want to put in place, and the type of logging you may wish to do and where you want to store that data. Then create the infrastructure and the mandatory shell/frame for each test; individual testers then have to work within that shell.
Our current infrastructure is in Python, which lent itself to this nicely, and we are not restricted to only Python-based tests; we can use C or Python, and the target can run whatever languages/programs it can run. Abstraction layers are good: we use a simple read/write of an address to access the unit under test, and with that we can test against a simulation of the target or against real hardware when the hardware arrives. We can access the hardware through a serial debugger, JTAG, or PCIe, and the majority of the tests don't know or care, because they are on the other side of the abstraction.

How to run NUnit tests ordered by category in the same namespace, without mixing tests?

My question is as follows:
I have a number of test cases in the same namespace. These test cases are categorized into categories A and B, and I need to run category A first and then category B.
The problem is that when I select categories (in the NUnit GUI), the tool runs the tests in the order they appear, not by category as I'm expecting.
For example, I have the test cases:
WhenUserTriesToAddLineItemGroup [Category B]
WhenUserTriesToCreateTopic [Category A]
WhenUserTriesToCreateAreas [Category A]
I need to run first:
WhenUserTriesToCreateTopic and WhenUserTriesToCreateAreas
and second:
WhenUserTriesToAddLineItemGroup
but they are running in the order they appear in the list. Please, how can I run the tests in the way I need?
Thanks!!
NUnit is built on the notion that any single test is completely autonomous. If you have things that need to be done before the test can be run, they might be put in a SetUp method, and if your test has any side-effects you might have a TearDown method to undo those side-effects, but otherwise it assumes that any single test can be run at any time, and that test order does not matter. This has several advantages for developers, since you can run any set of your tests without worrying about their prerequisites, and the pass/fail condition of one test does not depend on the pass/fail condition of a different test.
I would suggest that you reconsider why your tests have these restrictions in the first place, and try to rewrite them so that they test more isolated pieces of functionality, and do not have a prerequisite state of your system. In principle you could combine these tests into a single test, so that you can handle the dependencies, but that's not usually a great solution, since it will often lead to very large tests (whereas unit tests are usually testing a very small condition or piece of functionality). It would be better to take a look at what conditions you need to test, and write each test from the ground up to test just that condition, without assuming that your system is in any state or that any other tests have already been run.
Without more specifics about what you are trying to test it's hard to give a concrete suggestion for how to improve your system, but in general you will get the most out of unit testing if each test is completely autonomous.
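As a rough illustration of that shape (sketched in Python/pytest rather than NUnit/C#, with an invented FakeSystem standing in for your real system): each test creates the state it needs itself, so run order stops mattering.
import pytest

class FakeSystem:
    """Invented stand-in for the system under test."""
    def __init__(self):
        self.topics = []

    def create_topic(self, name):
        self.topics.append(name)

    def add_line_item_group(self, topic):
        if topic not in self.topics:
            raise ValueError("topic must exist first")
        return f"group-for-{topic}"

# Each test builds its own prerequisite state instead of depending on
# another test having run first.
@pytest.fixture
def system():
    return FakeSystem()

def test_create_topic(system):
    system.create_topic("billing")
    assert "billing" in system.topics

def test_add_line_item_group(system):
    system.create_topic("billing")   # prerequisite created here, not in another test
    assert system.add_line_item_group("billing") == "group-for-billing"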

Functional integration testing

Is software testing done in the following order?
Unit testing
Integration testing
Functional testing
I want to confirm if Functional testing is done after Integration testing or not.
Thanx
That is a logical ordering, yes. Often followed by User Acceptance Testing and then any form of public alpha/beta testing before release if appropriate.
In a TDD coding environment, the order in which these tests are made to pass generally follows your order; however, they are often WRITTEN in the reverse order.
When a team gets a set of requirements, one of the first things they should do is turn those requirements into one or more automated acceptance tests, which prove that the system meets all functional requirements defined. When this test passes, you're done (IF you wrote it properly). The test, when first written, obviously shouldn't pass. It may not even compile, if it references new object types that you haven't defined yet.
Once the test is written, the team can usually see what is required to make it pass at a high level, and break up development along these lines. Along the way, integration tests (which test the interaction of objects with each other) and unit tests (which test small, atomic pieces of functionality in near-complete isolation from other code) are written. Using refactoring tools like ReSharper, the code of these tests can be used to create the objects and even the logic of the functionality being tested. If you're testing that the output of A+B is C, then assert that A+B == C, then extract a method from that logic in the test fixture, then extract a class containing that method. You now have an object with a method you can call that produces the right answer.
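A very small illustration of that extract-as-you-go flow, in Python rather than C#/ReSharper, with an invented add() function:
# Step 1: write the assertion first; running it fails (NameError) because add() does not exist yet.
def test_sum_of_a_and_b():
    assert add(2, 3) == 5

# Step 2: extract the logic into production code (and eventually into its own class/module) until the test passes.
def add(a, b):
    return a + b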
Along the way, you have also tested the requirements: if the requirements assert that an answer, given A and B, must be the logical equivalent of 1+2==5, then the requirements have an inconsistency indicating a need for further clarification (i.e. somebody forgot to mention that D=2 should be added to B before A+B == C) or a technical impossibility (i.e. the calculation requires there to be 25 hours in a day or 9 bits in a byte). It may be impossible (and is generally considered infeasible by Agile methodologies) to guarantee without a doubt that you have removed all of these inconsistencies from the requirements before any development begins.

How to use automation for testing application involving highly complex calculations?

I want to know the following things about testing an application involving complex calculations:
How to use test automation tools for testing calculations (using automation tools like QTP or open-source tools)?
How to decide coverage while testing calculations, and how to design the test cases?
Thanks in advance,
Testmann
We had to test some really complex calculations in an application we built. To do this we used a tool called FitNesse, which is a wiki test harness (and open source too). It works really well when you provide it data in a table-style format.
We had some code in C# that performed some VERY complex calculations. So what we did was write a test harness in FitNesse and then supply it with a lot of test data. We worked very hard to cover all cases, so we used a sort of internal truth table to ensure we were getting every possible combination of data inputs.
The FitNesse test harness has been invaluable to us as the complexity of the calculations has changed over time due to changing requirements. We've been able to ensure the correctness of the calculations because our FitNesse tests act as a very nice regression suite.
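FitNesse specifics aside, the table-driven shape looks roughly like this when sketched in Python (the discount rule here is invented and stands in for the real, much more complex calculation):
import pytest

def discount(total, is_member):
    """Invented calculation standing in for the real one."""
    if is_member and total >= 100:
        return total * 0.10
    if total >= 200:
        return total * 0.05
    return 0.0

# A small truth table of inputs and expected outputs, the same idea a
# FitNesse decision table expresses in wiki form.
CASES = [
    (50,  False, 0.0),
    (50,  True,  0.0),
    (150, False, 0.0),
    (150, True,  15.0),
    (250, False, 12.5),
    (250, True,  25.0),
]

@pytest.mark.parametrize("total, is_member, expected", CASES)
def test_discount(total, is_member, expected):
    assert discount(total, is_member) == pytest.approx(expected)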
Sometimes, you have to estimate the expected conclusion, and then populate the test case from a run of the program.
It's not so much of a mortal sin, as long as you're convinced it's correct. Those tests will then let you know immediately if a code change breaks the code. Also, if you're testing a subset, it's not that big of a stretch of trust.
And for coverage? Cover every branch at least once (that is, any if or loop statements). Cover every threshold, both sides of it (for integer division that would be -1, 0, and 1 as denominators). Then add a few more in for good measure.
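For example, a quick sketch of covering one branch and both sides of one threshold (safe_divide is an invented function):
def safe_divide(a, b):
    """Invented example with a branch and a threshold at b == 0."""
    if b == 0:
        return None
    return a // b

# Both branches, plus both sides of the threshold at zero.
def test_safe_divide_around_zero():
    assert safe_divide(10, -1) == -10
    assert safe_divide(10, 0) is None
    assert safe_divide(10, 1) == 10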
To test existing code, you should assume that the code is (mostly) correct. So you just give it some data, run it and record the result. Then use that recorded result in a test case.
When you do the next change, your output should change too and the test will fail. Compare the new result with what you'd have expected. If there is a discrepancy, then you're missing something -> write another test to figure out what is going on.
This way, you can build expertise about an unknown system.
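A minimal sketch of that record-then-compare idea (often called a characterization or golden-master test); legacy_calculation and the golden file name are invented:
import json, os

def legacy_calculation(x):
    """Stand-in for the existing code whose behaviour is being pinned down."""
    return round(x * 1.21 + 0.5, 2)

GOLDEN_FILE = "golden_results.json"
INPUTS = [0, 1, 2.5, 100]

def test_matches_recorded_results():
    # First run: record the current behaviour. Later runs: compare against it,
    # so any change in behaviour shows up as a failing test to investigate.
    results = {str(x): legacy_calculation(x) for x in INPUTS}
    if not os.path.exists(GOLDEN_FILE):
        with open(GOLDEN_FILE, "w") as f:
            json.dump(results, f)
    with open(GOLDEN_FILE) as f:
        assert results == json.load(f)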
When you ask about coverage, I assume that you can't create coverage data for the actual calculations. In this case, just make sure that all calculations are executed, and feed them with several inputs. That should give you an idea of how to proceed.