How to run NUnit tests ordered by categories in the same namespace, without mixing tests?

My question is as follows:
I have a number of test cases in the same namespace. These test cases are categorized as A and B, and I need to run category A first and then category B.
The problem is that when I select categories (in the NUnit GUI), the tool runs the tests in the order they appear, not by category as I'm expecting.
For example, I have the test cases:
WhenUserTriesToAddLineItemGroup [Category B]
WhenUserTriesToCreateTopic [Category A]
WhenUserTriesToCreateAreas [Category A]
I need to run first:
WhenUserTriesToCreateTopic and WhenUserTriesToCreateAreas
and second:
WhenUserTriesToAddLineItemGroup
but they run in the order they appear in the list. How can I run the tests in the order I need?
Thanks!!

NUnit is built on the notion that any single test is completely autonomous. If you have things that need to be done before the test can be run, they might be put in a SetUp method, and if your test has any side-effects you might have a TearDown method to undo those side-effects, but otherwise it assumes that any single test can be run at any time, and that test order does not matter. This has several advantages for developers, since you can run any set of your tests without worrying about their prerequisites, and the pass/fail condition of one test does not depend on the pass/fail condition of a different test.
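A minimal sketch of that pattern, shown here in Python's unittest purely for illustration (NUnit's SetUp and TearDown attributes play the same role in C#); the test body itself is a trivial placeholder:

import shutil
import tempfile
import unittest

class WhenUserTriesToCreateTopic(unittest.TestCase):
    def setUp(self):
        # Build whatever state this one test needs, from scratch, every run.
        self.workdir = tempfile.mkdtemp()

    def tearDown(self):
        # Undo the side effects so no later test depends on what ran before.
        shutil.rmtree(self.workdir)

    def test_topic_can_be_created_in_an_empty_workspace(self):
        # The test assumes only what setUp provided, nothing from other tests.
        self.assertTrue(self.workdir)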
I would suggest that you reconsider why your tests have these restrictions in the first place, and try to rewrite them so that they test more isolated pieces of functionality, and do not have a prerequisite state of your system. In principle you could combine these tests into a single test, so that you can handle the dependencies, but that's not usually a great solution, since it will often lead to very large tests (whereas unit tests are usually testing a very small condition or piece of functionality). It would be better to take a look at what conditions you need to test, and write each test from the ground up to test just that condition, without assuming that your system is in any state or that any other tests have already been run.
Without more specifics about what you are trying to test it's hard to give a concrete suggestion for how to improve your system, but in general you will get the most out of unit testing if each test is completely autonomous.

Related

Multiple asserts and multiple actions on integration test

I have an integration test that should test the creation of a new account in a CRM system.
The account creation triggers several things:
Creates the basic profile of the company
Creates every user (you can define the number of users during registration)
Initialize the basic configuration of the account
Sends a welcome email with the starting information
etc
The test checks every aspect with several asserts, but I don't know if this is correct or if I should do a separate test for every one.
If I go for separate tests, the setup would be the same for all, so I feel like it would be a waste of time.
What you explain there sounds more like an end-to-end test. It's ok to have some end-to-end tests, but they are usually very expensive to write and maintain, and they tend to be brittle.
For me, the tests in a service should give you confidence that the software you are delivering will work in production. So maybe it's ok to have a very small number of end-to-end tests that check that everything is glued together properly, but most of the actual functionality should be covered by normal tests. An example of what I would try to avoid is having an end-to-end test that checks what happens when a downstream service is down.
Another very important aspect is that tests are written for other developers, not for the compiler, so keeping tests simple is important for maintainability. I want to stress this because a test with 10 lines of assertions will be unreadable for most developers; even a test of 10 lines of code is difficult to grok.
Here's how I try to build services:
If you are familiar with ATDD and hexagonal architecture, most of the features should be tested by stubbing the adapters, which allows the tests to run super fast and lets you fiddle with the adapters using test doubles. These tests shouldn't interact with anything outside the JVM, and they give you a good level of confidence that the features will work. If the feature has too many side effects, I try to pick the assertions carefully. For example, if a feature is to create an account, I won't check that the account is actually in the DB (because the chances of that breaking are minuscule), but I will check that all the messages that need to be triggered are sent (sketched below). Sometimes I do create multiple tests if a single test starts to become unclear, for example one test that checks the returned value and another that verifies the side effects (e.g. messages being produced).
Having at minimum good coverage of the critical code with unit tests and integration tests (here I mean test classes that interact with external services) builds up the confidence that the classes work as expected, so the end-to-end tests don't need to cover the myriad of combinations.
And last, a very small number of end-to-end tests to ensure everything is glued together nicely.
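As a minimal sketch of that first level (all names hypothetical, and shown in Python rather than on the JVM): the outgoing adapter is replaced with an in-memory test double, and the test asserts on the messages that must go out rather than on database contents.

import unittest

class FakeMessageBus:
    # In-memory test double standing in for the real message-bus adapter.
    def __init__(self):
        self.published = []

    def publish(self, topic, payload):
        self.published.append((topic, payload))

class AccountService:
    # Hypothetical core service that talks to the outside world only through its ports.
    def __init__(self, bus):
        self.bus = bus

    def create_account(self, name):
        account = {"name": name, "status": "active"}
        self.bus.publish("account.created", account)
        self.bus.publish("welcome.email", {"to": name})
        return account

class CreateAccountTests(unittest.TestCase):
    def test_creating_an_account_sends_the_expected_messages(self):
        bus = FakeMessageBus()
        AccountService(bus).create_account("acme")
        # Assert on the triggered messages, not on what ended up in the DB.
        self.assertEqual([topic for topic, _ in bus.published],
                         ["account.created", "welcome.email"])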
Bottom line: create multiple tests with the same setup if it helps with understanding the code.
edit
About integration tests: it's just terminology. I call an integration test a class or small group of classes that interact with an external service (database, queue, files, etc.); a component test something that verifies a single service or module; and an end-to-end test something that tests all the services or modules working together.
What you mentioned about stored procs changes the approach. Do you have unit tests for them? Otherwise you could write some sort of integration tests that verify the stored procs work as expected.
About readability of the test: for me, the real check is to ask someone from another team, or a product owner, whether the test name, the setup, what is asserted, and the relationship between those things are clear. If they struggle, it means the test should be simplified.

Test Case Optimization - minimizing # of tests based on similarity

I have several test cases that I want to optimize by a similarity-based test case selection method using the Jaccard matrix. The first step is to choose a pair with the highest similarity index and then keep one as a candidate and remove the other one.
My question is: based on which strategy do you choose which of the two most similar test cases to remove? Size? Test coverage? Or something else? For example, here TC1 and TC10 have the highest similarity. Which one would you remove, and why?
It depends on why you're doing this, and a static code metric can only give you suggestions.
If you're trying to make the tests more maintainable, look for repeated test code and extract it into shared code. Two big examples are test setup and expectations.
For example, if you tend to do the same setup over and over again you can extract it into fixtures or test factories. You can share the setup using something like setup/teardown methods and shared contexts.
If you find yourself writing the same sort of code over and over again to test a condition, extract that into something shared, such as a helper test method, a matcher, or a shared example. If possible, replace custom test code with assertions from an existing library.
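A minimal sketch of both ideas in Python; the factory and the shared assertion helper are hypothetical names, not from any particular library:

import unittest

def make_order(total=100, currency="USD", **overrides):
    # Test factory: one place that knows how to build a valid order for tests.
    order = {"total": total, "currency": currency, "status": "new"}
    order.update(overrides)
    return order

def assert_valid_order(test, order):
    # Shared assertion helper instead of repeating the same checks in every test.
    test.assertGreater(order["total"], 0)
    test.assertIn(order["status"], {"new", "paid", "shipped"})

class OrderTests(unittest.TestCase):
    def test_default_order_is_valid(self):
        assert_valid_order(self, make_order())

    def test_discounted_order_is_still_valid(self):
        assert_valid_order(self, make_order(total=10, status="paid"))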
Another metric is to find tests which are testing too much. If unit A calls units B and C, the test might do the setup for and testing of B and C. This can be a lot of extra work, makes the test more complex, and makes the units interdependent.
Since all unit A cares about is whether its particular calls to B and C work, consider replacing the calls to B and C with mocks. This can greatly simplify test setup, improve test performance, and reduce the scope of what you're testing.
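For example, a sketch in Python with unittest.mock, where A, B, and C are hypothetical units:

import unittest
from unittest.mock import Mock

class A:
    # Hypothetical unit under test that depends on collaborators B and C.
    def __init__(self, b, c):
        self.b, self.c = b, c

    def process(self, value):
        return self.c.save(self.b.transform(value))

class ATests(unittest.TestCase):
    def test_process_transforms_then_saves(self):
        b, c = Mock(), Mock()
        b.transform.return_value = "transformed"
        c.save.return_value = True

        self.assertTrue(A(b, c).process("raw"))
        # The test only cares that A makes the right calls to B and C.
        b.transform.assert_called_once_with("raw")
        c.save.assert_called_once_with("transformed")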
However, be careful. If B or C changes, A might break, but the unit test for A won't catch that. You need to add integration tests for A, B, and C together. Fortunately, this integration test can be very simple: it can trust that A, B, and C work individually (they've been unit tested); it only has to test that they work together.
If you're doing it to make the tests faster, first profile the tests to determine why they're slow.
Consider if parallelization would help. Consider if the code itself is too slow. Consider if the code is too interdependent and has to do too much work just to test one thing. Consider if you really have to write everything to a slow resource such as a disk or database, or if doing it in memory sometimes would be ok.
It's deceptive to consider tests redundant because they have similar code. Similar tests might test very, very different branches of the code. For example, something as simple as passing in 1 vs 1.1 to the same function might call totally different classes, one for integers and one for floats.
Instead, find redundancy by looking for similarity in what the tests cover. There's no magic percentage of similarity that determines whether tests are redundant; you'll have to judge for yourself.
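A sketch of that idea in Python: compare tests by the Jaccard similarity of their coverage sets rather than of their code. The per-test coverage data here is made up for illustration:

def jaccard(a, b):
    # Jaccard similarity of two coverage sets (e.g. sets of "file:line" strings).
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical per-test coverage: test name -> lines it executed.
coverage = {
    "TC1":  {"billing.py:10", "billing.py:11", "billing.py:12"},
    "TC10": {"billing.py:10", "billing.py:11", "billing.py:30"},
}
print(jaccard(coverage["TC1"], coverage["TC10"]))  # 0.5 - they share half their coverage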
That's because a line of code being covered doesn't mean it has been tested. For example...
def test_method
  call_a                    # executed, so the coverage report marks it covered...
  assert_equal(42, call_b)  # ...but only call_b's result is actually asserted
end
Here call_a is covered, but it is not tested. Test coverage only tells you what HAS NOT been tested, it CANNOT tell you what HAS been tested.
Finally, test coverage redundancy is good. Unit tests, integration tests, acceptance tests, and regression tests can all cover the same code, but from different points of view.
All static analysis can offer you is candidates for redundancy. Remove redundant tests only if they're truly redundant, and only with a purpose. Tests which simply overlap might serve as regression tests. Many a time I've been saved when the unit tests passed but some incidental integration test failed. Overlap is good.

How to include *_test.go files in HTML coverage reports

I would like to know if there is a way that I can generate an HTML coverage report that also includes the statements covered in the tests themselves.
Regarding the merits of doing such a thing: I would like to see that my tests are as useful as the rest of my code. I've become accustomed to including my test code in coverage reports in Python, and this is something I find helpful.
Update for clarification:
People seem to think I'm talking about testing my tests. I'm not. I just want to see that the statements in my tests are definitely being hit in the HTML coverage report. For example, code coverage on a function in my application might show me that everything's been hit, but it won't necessarily show me that every boundary has been tested. Seeing statements lit up in my test sources shows me that I wrote my tests well enough. Yes, better-factored code shouldn't be so complex as to need that assurance, but sometimes things just aren't better.
I'm not sure I understand the reasoning behind this.
Unit tests, especially in Go, should be simple and straight-forward enough that by just reading them you should be able to spot if a statement is useless.
If that is not the case, maybe you are implementing your unit tests in a way that is too complicated?
If that is the case, I can recommend looking into table-driven tests for most cases (though they're not suited to most concurrency-heavy code or to methods that depend heavily on manipulating state), as well as trying out TDD (test-driven development).
By using TDD, instead of building your tests in order to try to cover all of your code, you would be writing simple tests that simply validate the specs of your code.
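If it helps, a table-driven test is just a list of cases driven through one assertion. This sketch is in Python with subTest and a made-up function under test; the same shape in Go is a slice of case structs iterated with t.Run:

import unittest

def classify(n):
    # Hypothetical function under test.
    return "even" if n % 2 == 0 else "odd"

class ClassifyTests(unittest.TestCase):
    def test_classify(self):
        cases = [          # each row is one self-contained case: input and expected output
            (0, "even"),
            (1, "odd"),
            (2, "even"),
            (-3, "odd"),
        ]
        for n, expected in cases:
            with self.subTest(n=n):
                self.assertEqual(classify(n), expected)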
You don't write tests for your tests. If you do, where does it end? Those tests for tests aren't covered, so you'll need to write tests for your tests for your tests. But wait! Those tests for your tests for your tests don't have coverage, so you'd better write tests for your tests for your tests for your tests.

Can it be considered bad practice to rely on automated tests running in order?

In my acceptance test suites specifically I see a lot of tests designed to run in a particular order (top to bottom) which in some ways makes sense for testing a particular flow, but I've also heard this is bad practice. Can anyone shed some light on the advantages and drawbacks here?
In the majority of situations, if you rely on the order, there is something wrong. It's better to fix this because:
Tests should be independent so that you can run them separately (you should be able to run just one test).
Test-running tools often don't guarantee the order. Even if today it's a particular sequence, tomorrow you could add some configuration to the runner and the order could change.
It's hard to determine what's wrong from the test reports, since you see a lot of failures while only one test actually failed.
Likewise, it's not easy to track the steps of a scenario from the reporting tools, because those steps are assigned to different tests.
You won't be able to run them in parallel if you need to (hopefully you don't).
If you want to share logic, create reusable classes or methods, for example:
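A minimal sketch of what that can look like in Python (the client and flow classes are hypothetical stand-ins): each test drives the shared flow itself, so no test depends on another having run first.

import unittest

class FakeShopClient:
    # Minimal in-memory stand-in for the system under test (hypothetical).
    def __init__(self):
        self.users, self.carts = set(), {}

    def register(self, user):
        self.users.add(user)
        self.carts[user] = []

    def add_to_cart(self, user, item):
        self.carts[user].append(item)
        return True

    def cart_items(self, user):
        return list(self.carts[user])

class ShopFlows:
    # Shared flow logic lives in a reusable helper, not in a "previous test".
    def __init__(self, client):
        self.client = client

    def registered_user(self, name="alice"):
        self.client.register(name)
        return name

class PurchaseTests(unittest.TestCase):
    def setUp(self):
        self.client = FakeShopClient()
        self.flows = ShopFlows(self.client)

    def test_cart_starts_empty_for_a_fresh_user(self):
        user = self.flows.registered_user()
        self.assertEqual(self.client.cart_items(user), [])

    def test_fresh_user_can_add_an_item(self):
        user = self.flows.registered_user()
        self.assertTrue(self.client.add_to_cart(user, "book"))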
PS: I'd call these System Tests, not Acceptance Tests - you can write acceptance tests at the unit or component level too.

What tools exist for managing a large suite of test programs?

I apologize if this has been answered before, but I'm having trouble finding a tool that fits my needs.
I have a few dozen test programs, but each one can be run with a large number of parameters. I need to be able to automatically run sweeps of many of the parameters across all or some of the test programs. I have my own set of tools for running an individual test, which I can't really change, but I'm looking for a tool that would manage the entire suite.
Thus far, I've used a home-grown script for this. The main problem I run across is that an individual test program might take 5-10 parameters, each with several values. Although it would be easy to write something that would just do a nested for loop and sweep over every parameter combination, the difficulty is that not every combination of parameters makes sense, and not every parameter makes sense for every test program. There is no general way (i.e., that works for all parameters) to codify what makes sense and what doesn't, so the solutions I've tried before involve enumerating each sensible case. Although the enumeration is done with a script, it still leads to a huge cross-product of test cases which is cumbersome to maintain. We also don't want to run the giant cross-product of cases every time, so I have other mechanisms to select subsets of it, which gets even more cumbersome to deal with.
I'm sure I'm not the first person to run into a problem like this. Are there any tools out there that could help with this kind of thing? Or even ideas for writing one?
Thanks.
Adding a clarification ---
For instance, if I have parameters A, B, and C that each represent a range of values from 1 to 10, I might have a restriction like: if A=3, then only odd values of B are relevant and C must be 7. The restrictions can generally be codified, but I haven't found a tool where I could specify something like that. As for a home-grown tool, I'd either have to enumerate the tuples of parameters (which is what I'm doing) or implement something quite sophisticated to be able to specify and understand constraints like that.
We rolled our own; we have a whole test infrastructure. It manages the tests and has a number of built-in features that allow the tests to log results; the logs are managed by the test infrastructure and go into a searchable database for all kinds of report generation.
Each test has a class/structure with information about the test: name, author, and a variety of other tags. When running a test suite you can run everything, or run everything with a certain tag. So if you only want to test SRAM, you can easily run only the tests tagged sram.
Our tests are all considered either pass or fail; the pass/fail criteria are determined by the author of the individual test, but the infrastructure wants to see either pass or fail. You need to define what your possible results are, as simple as pass/fail, or you might want to add pass and keep going, pass but stop testing, fail but keep going, and fail and stop testing. Stop testing means that if there are 20 tests scheduled and test 5 fails, you stop; you don't go on to test 6.
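As a sketch of the kind of shell I mean (Python, with every name made up): each test carries its metadata and tags, and returns one of a small fixed set of results that the runner knows how to act on.

from dataclasses import dataclass
from enum import Enum
from typing import Callable, Set

class Result(Enum):
    PASS = "pass"            # pass and keep going
    FAIL = "fail"            # fail and keep going
    PASS_STOP = "pass_stop"  # pass but stop testing
    FAIL_STOP = "fail_stop"  # fail and stop testing

@dataclass
class TestEntry:
    name: str
    author: str
    tags: Set[str]
    run: Callable[[], Result]

def run_suite(tests, tag=None):
    # Run everything, or only the tests carrying the given tag; honour the *_STOP results.
    results = {}
    for test in tests:
        if tag and tag not in test.tags:
            continue
        results[test.name] = test.run()
        if results[test.name] in (Result.PASS_STOP, Result.FAIL_STOP):
            break
    return results

suite = [
    TestEntry("sram_walking_ones", "jd", {"sram", "memory"}, lambda: Result.PASS),
    TestEntry("uart_loopback", "jd", {"uart"}, lambda: Result.PASS),
]
print(run_suite(suite, tag="sram"))  # only the test tagged "sram" runs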
You need a mechanism to order the tests, which could be alphabetical, but it might benefit from a priority scheme (you must perform the power-on test before performing a test that requires the power to be on). It may also benefit from a random ordering: some tests may be passing due to dumb luck because a test before them made something work; remove that prior test and this test fails. Or vice versa: this test passes until it is preceded by a specific test, and those two don't get along in that order.
To shorten my answer: I don't know of an existing infrastructure, but I have built my own and worked with home-built ones that were tailored to our business/lab/process. You won't hit a home run the first time; don't expect to. But try to predict a manageable set of rules for individual tests: how many types of pass/fail return values a test can return, the types of filters you want to put in place, and the type of logging you may wish to do and where you want to store that data. Then create the infrastructure and the mandatory shell/frame for each test, and individual testers have to work within that shell.
Our current infrastructure is in Python, which lent itself to this nicely, and we are not restricted to Python-based tests; we can use C or Python, and the target can run whatever languages/programs it can run. Abstraction layers are good: we use a simple read/write of an address to access the unit under test, and with that we can test against a simulation of the target or against real hardware when the hardware arrives. We can access the hardware through a serial debugger, JTAG, or PCIe, and the majority of the tests don't know or care because they are on the other side of the abstraction.
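For what it's worth, the restriction in the clarification above (if A=3, then only odd B and C must be 7) is the kind of thing a home-grown Python harness can express as a predicate that prunes the full cross product, instead of enumerating the sensible tuples by hand. A rough sketch, using only the parameter names from that example:

from itertools import product

# Parameter ranges from the example in the clarification: A, B, C each run 1..10.
params = {"A": range(1, 11), "B": range(1, 11), "C": range(1, 11)}

def is_sensible(a, b, c):
    # The restriction from the example: if A=3, only odd B is relevant and C must be 7.
    if a == 3:
        return b % 2 == 1 and c == 7
    return True

cases = [combo for combo in product(*params.values()) if is_sensible(*combo)]
print(len(cases))  # the pruned sweep, instead of a hand-maintained enumeration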