How do you compare the results of two NUnit test runs?

We currently have a situation where several tests are failing. Someone is working on this, but it is not me; I have been tasked with other work. So I plan on running the tests in NUnit before I begin my work, so that I have a baseline of which tests fail and what their failure messages are. I would like to use this baseline to verify that those tests fail with the exact same result while I am testing my own code. Are there any resources that would allow me to do this?
Update
I'm aware of the ExpectedException attribute. However, that will not work for the tests that are failing on an assertion rather than on a thrown exception. Also, there are thousands of tests, of which only about 100 are failing. I was hoping for something that would compare the two test runs and show me the differences.

I'd throw an
[Ignore("SomeCustomStringICanFindLater")]
attribute on the failing tests until they are fixed.
See IgnoreAttribute.
And try to convince your manager that a broken build should be everyone's top priority.
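For illustration, a minimal sketch of what that could look like (the fixture name, test name, and marker string are placeholders):

using NUnit.Framework;

[TestFixture]
public class OrderProcessingTests            // hypothetical fixture
{
    [Test]
    [Ignore("KNOWN-FAILURE-BASELINE")]       // custom string you can search for later
    public void Order_total_includes_tax()   // hypothetical failing test
    {
        // existing test body stays untouched
    }
}

NUnit then reports these tests as Ignored rather than Failed, which keeps the failure count meaningful while you work.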

After doing some research while waiting on an answer, I found that the NUnit console runner produces XML output. I can use a diff tool to compare the two test runs and see which tests failed differently than they did in the baseline test run.
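If a plain diff turns out to be too noisy, the comparison can also be scripted. A rough sketch, assuming NUnit 2.x-style result XML where each test-case element carries name/success/result attributes and an optional failure/message child (names may differ between NUnit versions, so check your own output first):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class CompareNUnitRuns
{
    // Map "test name -> failure message" for every failing test-case in a result file.
    static Dictionary<string, string> LoadFailures(string path)
    {
        return XDocument.Load(path)
            .Descendants("test-case")
            .Where(tc => (string)tc.Attribute("success") == "False"
                      || (string)tc.Attribute("result") == "Failure"
                      || (string)tc.Attribute("result") == "Error")
            .ToDictionary(
                tc => (string)tc.Attribute("name"),
                tc => tc.Descendants("message").Select(m => m.Value).FirstOrDefault() ?? "");
    }

    static void Main(string[] args)
    {
        var baseline = LoadFailures(args[0]); // e.g. baseline-results.xml
        var current = LoadFailures(args[1]);  // e.g. current-results.xml

        foreach (var name in current.Keys.Except(baseline.Keys))
            Console.WriteLine("NEW FAILURE: " + name);

        foreach (var name in baseline.Keys.Except(current.Keys))
            Console.WriteLine("NO LONGER FAILING: " + name);

        foreach (var name in current.Keys.Intersect(baseline.Keys).Where(n => current[n] != baseline[n]))
            Console.WriteLine("SAME TEST, DIFFERENT MESSAGE: " + name);
    }
}

Anything not printed failed (or passed) exactly as it did in the baseline run.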

Related

Selenium Best Practice: One Long Test or Several Successively Long Tests?

In Selenium I often find myself making tests like ...
// Test #1
login();
// Test #2
login();
goToPageFoo();
// Test #3
login();
goToPageFoo();
doSomethingOnPageFoo();
// ...
In a unit testing environment, you'd want separate tests for each piece (i.e. one for login, one for goToPageFoo, etc.) so that when a test fails you know exactly what went wrong. However, I'm not sure this is a good practice in Selenium.
It seems to result in a lot of redundant tests, and the "know what went wrong" problem doesn't seem so bad, since it's usually clear what went wrong by looking at what step the test was on. And it certainly takes longer to run a bunch of "build up" tests than it takes to run just the last ("built up") test.
Am I missing anything, or should I just have a single long test and skip all the shorter ones building up to it?
I have built a large test suite in Selenium using a lot of smaller tests (like in your code example). I did it for exactly the same reasons you did. To know "what went wrong" on a test failure.
This is a common best practice for standard unit tests, but if I had to do it over again, I would go mostly with the second approach: larger built-up tests, with some smaller tests where needed.
The reason is that Selenium tests take an order of magnitude longer than standard unit tests to run, particularly on longer scenarios. This makes the whole test suite unbearably long with most of the time being spent on running the same redundant code over and over again.
When you do get an error, say in a step that is repeated at the beginning of 20+ different tests, it does not really help to know you got the same error 20+ times. My test runner runs my tests out of order, so my first error isn't even on the first incremental test of the "build-up" series; I end up looking at the first test failure and its error message to see where the failure came from, which is exactly what I would do if I had used larger "built-up" tests.
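If it helps, here's roughly what I mean by a built-up test, sketched with NUnit and the Selenium WebDriver C# bindings (the URL, element ids, and page names are made up):

using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

[TestFixture]
public class FooPageScenarioTests
{
    private IWebDriver driver;

    [SetUp]
    public void StartBrowser() { driver = new FirefoxDriver(); }

    [TearDown]
    public void StopBrowser() { driver.Quit(); }

    // One built-up scenario instead of three incremental tests.
    // When it fails, the exception or assertion message points at the step that broke.
    [Test]
    public void Login_then_go_to_foo_and_act()
    {
        Login();                 // step 1
        GoToPageFoo();           // step 2
        DoSomethingOnPageFoo();  // step 3
        StringAssert.Contains("Foo", driver.Title); // hypothetical final check
    }

    private void Login()
    {
        driver.Navigate().GoToUrl("http://example.com/login");      // placeholder URL
        driver.FindElement(By.Id("username")).SendKeys("testuser"); // placeholder ids
        driver.FindElement(By.Id("password")).SendKeys("secret");
        driver.FindElement(By.Id("submit")).Click();
    }

    private void GoToPageFoo()
    {
        driver.FindElement(By.LinkText("Foo")).Click();
    }

    private void DoSomethingOnPageFoo()
    {
        driver.FindElement(By.Id("doSomething")).Click();
    }
}

The shared steps live in private helpers rather than in separate tests, so each step still shows up clearly in a failure stack trace without the suite re-running login() dozens of times.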

Running test coverage separately from test execution

We're configuring build steps in TeamCity. Since we have huge problems with test coverage reports (they were there and then inexplicably vanished), we're trying to find a workaround (asking, and putting a bounty on, a question directly related to our issue yielded a very cold response).
Please note that I'm not looking for an opinion, but rather technical knowledge to support (or kill) our choice. And yes, I've checked the build logs; these are posted in the other thread. This question is about the (in?)sanity of trying an alternative approach. :)
Is it recommended to run a build step for test and then another build step for test coverage?
Does it even make sense to run these in separate build steps?!
What gains and disadvantages are there to running coverage bundled with/separately from the tests themselves?
Test coverage reports are generated during unit test runs. Unless your problem is with reading the generated reports, it doesn't make sense to "run them in separate build steps". Test coverage tells you what parts of your code were run WHILE the tests were running; I don't see how they could be independent.
It may make more sense to ask for help with the test coverage reports no longer being generated...

Ordering tests in TFS 2012

There are a few tests in my testing solution that must be run first or else later tests will fail. I want a way to ensure these are run first and in a specific order. Is there any way of doing this other than using a .orderedtest file?
Some problems with the .orderedtest:
Certain tests should be run in a random order after the "set up" tests are finished
Ordered test does not seem to call the ClassInitialize method
Isn't an orderedtest a form of test list, which is deprecated in VS/TFS 2012?
My advice would be to fix your tests to remove the dependencies (i.e. make them proper "unit" tests) - otherwise they are bound to cause problems later, e.g.:
causing a simple failure to cascade so that hundreds of tests fail and make it hard to find the root cause
failing unexpectedly because someone has inadvertently modified the execution order
reporting passes when in fact they should be failing, just because the initial state is not as they required
You could try approaches like:
keep the tests separate but make each of them set up and tear down the test environment that they require (a shared class to provide the initial state would be helpful here; see the sketch below)
merge the related tests into a single one, so that you can control the setup, execution, and close-down in a robust way.
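For the first option, a rough sketch of what I mean in MSTest, with a hypothetical TestDatabase helper standing in for whatever shared state your tests currently rely on:

using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical shared helper that puts the environment into a known state.
public static class TestDatabase
{
    public static void ResetToKnownState()
    {
        // e.g. recreate test data, clear caches, reset configuration
    }
}

[TestClass]
public class CustomerTests                 // hypothetical test class
{
    [TestInitialize]
    public void SetUp()
    {
        // Every test starts from the same known state,
        // so execution order no longer matters.
        TestDatabase.ResetToKnownState();
    }

    [TestMethod]
    public void Can_create_customer()      // hypothetical test
    {
        // arrange / act / assert against the freshly reset environment
    }
}

The same idea works with NUnit's SetUp/TearDown attributes if your suite is not tied to MSTest.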

Grails integration tests failing in a (seemingly) random and non-repeatable way

We are writing integration tests for our Grails 2.0.0 application with the help of the Fixtures and Build-Test-Data plugins.
During testing, it was discovered that the integration tests fail at certain times and pass at other times. Running 'test-app' sometimes results in all tests passing, and sometimes results in some of our tests failing.
When the tests fail, they are caused by a unique constraint being violated during the insert of an instance of a domain class. This would indicate that there are still records in the test DB. I am running the H2 db, and have definitely got 'dbCreate = "create-drop"' in my DataSource.groovy.
Grails 2.0 integration test pollution? seems to indicate there is a significant test-pollution problem in Grails. Are there any solutions to this? Have I hit Grails-8530?
[Edit] The test pollution seems to be caused by the unit tests. We have sort of proved this by deleting the unit tests and successfully running 'test-app' repeatedly.
When I run into errors like this I like to try and find the unit test(s) that is causing the problem. This might be kinda tricky since yours seem to only be failing on occasion.
1) I'd look at unit tests that were recently added. If this problem just started happening then that's a good place to look.
2) Metaclassing seems to be good at causing this type of error, so I'd look for metaclassing that isn't set up/torn down properly. It's not as much of an issue with 2.0 as with <= 1.3.7, but it could be the problem.
3) I wrote a plugin that executes your tests in a random order ( http://grails.org/plugin/random-test-order), which might not help you solve your current problem directly. But what might help you is that it prints out all of your tests, so you can take what it gives you, run grails test-app <pasted list of unit tests> IntegrationTestThatIsFailing, and then start removing unit tests to find the culprit(s). I found a bug in this with 2.0 that I haven't had time to fix yet (integration tests fail when asserting on rendered view name), but it should still print out your test names for you (which is better than doing it yourself :)
The fact that the integration tests fail with a constraint violation due to existing records reminds me of a situation I once encountered with functional tests (Selenium) executing in unpredictable order, some of them not cleaning up the database properly. Sure, the situation with functional tests is different, since it is more difficult to restore the database state (the test case cannot roll back a transaction in another JVM).
Although integration tests usually roll back transactions, it is still possible to break this behavior if your code controls transactions (commits) explicitly.
First, I would try forcing execution order as mentioned by Jarred in 3). Assuming you can then reproduce the behavior, I would then check transactional behaviour next. Setting the logging level of org.hibernate.transaction to debug should show you where transaction boundaries are.
Sorry, I don't yet have a good explanation of why wiping out the unit tests gets rid of the symptoms, besides a general "possibly metaclassing issues". :)

Grails, Hudson, and Cobertura, which tests are covering my code?

I just started working on an existing Grails project where there is a lot of code written and not much is covered by tests. The project is using Hudson with the Cobertura plugin, which is nice. As I'm going through things, I'm noticing that even though there are no specific test classes written for some of the code, it is still being covered. Is there any easy way to see which tests are covering the code? It would save me a bit of time if I was able to know that information.
Thanks
What you want to do is collect test coverage data per test. Then when some block of code isn't exercised by a test, you can trace it back to the test.
You need a test coverage tool which will do that; AFAIK, this is straightforward to organize. Just run one test and collect test coverage data.
However, most folks also want to know: what is the coverage of the application given all the tests? You could run the tests twice, once to get what-does-this-test-cover information, and then the whole batch to get what-does-the-batch-cover. Some tools (ours included) will let you combine the coverage from the individual tests to produce coverage for the set, so you don't have to run them twice.
Our tools have one nice extra: if you collect test-specific coverage, when you modify the code, the tool can tell which individual tests need to be re-run. You need a bit of straightforward scripting for this, to compare the results of the instrumentation data for the changed sources to the results for each test.
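If it helps, here's a tool-agnostic sketch of the bookkeeping involved, assuming you have already exported per-test coverage as "test name -> set of covered file:line locations" in whatever format your coverage tool produces (the data layout and names here are placeholders, not any particular tool's API):

using System;
using System.Collections.Generic;
using System.Linq;

class PerTestCoverage
{
    // testName -> set of "file:line" strings covered by that test.
    // How you obtain this map depends entirely on your coverage tool.
    static Dictionary<string, HashSet<string>> coverageByTest =
        new Dictionary<string, HashSet<string>>();

    // Which tests exercise a given line? (the "trace it back to the test" step)
    static IEnumerable<string> TestsCovering(string fileAndLine)
    {
        return coverageByTest.Where(kv => kv.Value.Contains(fileAndLine)).Select(kv => kv.Key);
    }

    // Combined coverage of the whole batch, without re-running the tests.
    static HashSet<string> CombinedCoverage()
    {
        return new HashSet<string>(coverageByTest.Values.SelectMany(s => s));
    }

    // Which tests need re-running after an edit?
    static IEnumerable<string> TestsToRerun(IEnumerable<string> changedLines)
    {
        var changed = new HashSet<string>(changedLines);
        return coverageByTest.Where(kv => kv.Value.Overlaps(changed)).Select(kv => kv.Key);
    }

    static void Main()
    {
        // Placeholder sample data for illustration only.
        coverageByTest["LoginTests.Can_log_in"] = new HashSet<string> { "Login.cs:42", "Session.cs:10" };
        coverageByTest["CartTests.Can_add_item"] = new HashSet<string> { "Cart.cs:7", "Session.cs:10" };

        Console.WriteLine(string.Join(", ", TestsCovering("Session.cs:10")));      // both tests
        Console.WriteLine(CombinedCoverage().Count);                               // 3 locations
        Console.WriteLine(string.Join(", ", TestsToRerun(new[] { "Cart.cs:7" }))); // CartTests.Can_add_item
    }
}

The same map answers all three questions: which tests cover a given line, what the batch covers in total, and which tests to re-run after a change.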