Grails integration tests failing in a (seemingly) random and non-repeatable way - testing

We are writing integration tests for our Grails 2.0.0 application with the help of the Fixtures and Buid-Test-Data plugins.
During testing, it was discovered that the integration test fail at certain times, and pass at other times. Running 'test-app' sometimes results in all tests passing, and sometimes results in some of our tests failing.
When the tests fail, they are caused by a unique constraint being violated during the insert of an instance of a domain class. This would indicate that there are still records in the test DB. I am running the H2 db, and have definitely got 'dbCreate = "create-drop"' in my DataSource.groovy.
Grails 2.0 integration test pollution? seems to indicate there is a significant test-pollution problem in Grails. Are there any solutions to this? Have I hit Grails-8530?
[Edit] the test-pollution seems to be caused by the unit tests. We have sort-of proved this by deleting the unit tests and successfully running 'test-app' repeatedly.

When I run into errors like this I like to try and find the unit test(s) that is causing the problem. This might be kinda tricky since yours seem to only be failing on occasion.
1) I'd look at unit tests that were recently added. If this problem just started happening then that's a good place to look.
2) Metaclassing seems to be good at causing these type of errors so I'd look for metaclassing that isn't setup/torn down properly. Not as much of an issue with 2.0 as with <= 1.3.7 but could be the problem.
3) I wrote a plugin that executes your tests in a random order. Which might not help you solve your current problem. But what might help you is it prints out all of your tests so you can take what it gives you and run grails test-app <pasted list of unit tests> IntegrationTestThatIsFailing then start removing unit tests to find the culprit(s). ( http://grails.org/plugin/random-test-order). I found a bug in this with 2.0 that I haven't had time to fix yet (integration tests fail when asserting on rendered view name) but it should still print out your test names for you (which is better than doing it yourself :)

The fact integration tests fail with a constraint violation due to existing records reminds me of a situation I once encountered with functional tests (selenium) executing in unpredictable order, some of them not cleaning up the database properly. Sure, the situation with functional tests is different, since it is more difficult to restore the database state (The testcase cannot rollback a transaction in another jvm).
Although integration tests usually roll back transactions, it is still possible to break this behavior if your code controls transactions (commits) explicitly.
First, I would try forcing execution order as mentioned by Jarred in 3). Assuming you can then reproduce the behavior, I would then check transactional behaviour next. Setting the logging level of org.hibernate.transaction to debug should show you where transaction boundaries are.
Sorry, don't yet have a good explanation why wiping out the unit tests helps getting rid of the symptoms besides a general "possibly metaclassing issues". :)

Related

Best practice for writing tests that reproduce bugs

I am struggling a bit with the way how to write tests that reproduce an issue that has not been yet fixed.
Should one write the test and use wrong expectations and once the bug is fixed the developer will see the failure and adjust the expectations or should one just write the test with correct expectations and disable it. Once it is fixed you have to enable it again.
I would prefer the way to define wrong expectations and add the correct ones in comments and once I fix an issue I will immediately get a notification that it fails. If I disable it I won't see it failing and it will probably stay disabled until one will discover this test.
Are there any other ways doing this?
Thanks for your comments.
Martin
Ideally you would write a test that reproduces the bug and then fix said bug.
If for whatever reason that is not currently an option I would say that your approach of having the wrong expectations would be better than having an ignored test. Assuming that you use some clear variable name/ method name / comments that the test is more a placeholder and not the desired outcome.
One thing that I've done is write a test that is a "time bomb" reminder. I pick a date that is a few weeks/months out from now that I expect to be able to get back to it or have it fixed by. If I end up having to push the date out 2 or 3 times I end up deleting the test because it must not be that important.
as #Jarred said, best way is to write a test that express the correct expectations, check if it fails, then fix production code and see the test passes.
if it's not an option then remember that tests are not only to test but also to document. so write a test that document how your program does actually work. if necessary add a comment to the test. and don't write tests that are ignored - it's pointless. in future you can refactor your code many times, you could accidentally fix this test or introduce even more error in this area. writing tests that are intended to be long term ignored is just a waste of time.
don't be afraid that you will forget about that particular bug/test, just create a ticket in your issue tracking system - that's what it's made for.
if you use a testing framework that supports groups, you can add all those tests to be able to instantly exclude those test if needed.
also i really don't like the concept of 'time bomb tests'. your build MUST be reproducible - that's the fundamental assumption of release management, continuous integration, ability to pass your code to another team etc. tests are not meant to track and remind about the issues, it's the job of the issue tracking system. seriously, don't do it
Actually I thought about this again. We are using JUnit and it supports defining expectations on exceptions via #Test(expected=Exception.class).
So what one can do is write the test with the desired expectations and define the test with #Test(expected=AssertionError.class). Once the test will be fixed the test starts failing and the developer has to remove the expectation.

Ordering tests in TFS 2012

There are a few tests in my testing solution that must be run first or else later tests will fail. I want a way to ensure these are run first and in a specific order. Is there any way of doing this other than using a .orderedtest file?
Some problems with the .orderedtest:
Certain tests should be run in a random order after the "set up" tests are finished
Ordered test does not seem to call the ClassInitialize method
Isn't an orderedtest a form or test list that is deprecated in VS/TFS 2012?
My advice would be to fix your tests to remove the dependencies (i.e. make them proper "unit" tests) - otherwise they are bound to cause problems later, e.g.:
causing a simple failure to cascade so that hundreds of tests fail and make it hard to find the root cause
failing unexpectedly because someone has inadvertently modified the execution order
reporting passes when in fact they should be failing, just because the initial state is not as they required
You could try approaches like:
keep the tests separate but make each of them set up and tear down the test environment that they require. (A shared class to provide the initial state would be helpful here)
merge the related tests into a single one, so that you can control the setup, execution, and close-down in a robust way.

Should I commit having test still to pass(failing)?

Our rails development team tries to follow Continuous Integration. We have decided to adopt a policy of only committing features whose tests pass. Is that a good way to go on? Should I delay integrating with other one's features until my tests pass(Even if the partial part of the feature works ok)? Thanks in advance
The tests should pass--if you're running a CI server it'll just spam people with emails until they do. Without a CI server everyone else will have to figure out if those tests are "supposed" to fail. Boo.
Another option is to only check in tests for actually-written features; if you're using tests as an executable specification they wouldn't all pass until the entire app was done and nobody would be able to check anything in ever.
You may also be able to mark tests as "pending" or indicate they should be skipped, but remembering to un–pend/-skip them is often problematic.
The tests SHOULD PASS that's the reason why you are writing them in the first place, if for some reason one or more tests do not pass, it indicates that something went wrong (obviously) and you and your team should be working on the solution.
If the code were committed with test failures, spam mails blaming the programmer who did it, this way the next time he will pay more attention before committing code
I have heard one way to avoid committing code with test failures but I have not personally tested, it involves to have two repositories (it could be a branch), the theory behind is:
The developers commits will target a branch, the purpose of this branch is just to guarantee that all tests pass, you should configure your CI server to build and run tests from this branch
When all the tests pass in the branch, a merge should be done to the trunk, since everyone should be working on this branch the merge should be transparent and automatic
I repeat I have not tested this approach and in my opinion it involves more problems than it solves
Another alternative could be to add a hook to the commit event in your VCS and force to run all tests but this could be time consuming just to perform a single commit
As additional info you could check this response
https://stackoverflow.com/a/7110774/1268570
I would wait personally to the test passes before I intergrate other features.

How do you compare the results of two nunit test runs?

We currently have a situation where several tests are failing. Someone is working on this, but it is not me. I have been tasked with other work. So I plan on running the tests in NUnit before I begin my work so I have a base line of failing tests and what the failure message is. I would like to use this result to verify that those tests fail with the exact same failure result while testing my own code. are there any resources that would allow me to do this?
update
I'm aware of the ExpectedException attribute. However that will not work for the tests that are failing the test condition. Also there are thousands of tests of which only about 100 tests are failing. I was hoping for something that would compare the two test runs and show me the differences.
I'd throw an
[Ignore("SomeCustomStringICanFindLater")]
attribute on the failing tests until they are fixed.
See IgnoreAttribute.
And try to convince your manager that a broken build should be everyone's top priority.
After doing some research while waiting on an answer. I found that the console runner produces xml output. I can use a diff tool to compare the two test runs and see which tests failed differently than the baseline test run.

How to protect yourself when refactoring non-regression tests?

Are there specific techniques to consider when refactoring the non-regression tests? The code is usually pretty simple, but it's obviously not included into the safety net of a test suite...
When building a non-regression test, I first ensure that it really exhibits the issue that I want to correct, of course. But if I come back later to this test because I want to refactor it (e.g. I just added another very similar test), I usually can't put the code-under-test back in a state where it was exhibiting the first issue. So I can't be sure that the test, once refactored, is still exercising the same paths in the code.
Are there specific techniques to deal with this issue, except being extra careful?
It's not a big problem. The tests test the code, and the code tests the tests. Although it's possible to make a clumsy mistake that causes the test to start passing under all circumstances, it's not likely. You'll be running the tests again and again, so the tests and the code they test gets a lot of exercise, and when things change for the worse, tests generally start failing.
Of course, be careful; of course, run the tests immediately before and after refactoring. If you're uncomfortable about your refactoring, do it in a way that allows you to see the test working (passing and failing). Find a reliable way to fail each test before the refactoring, and write it down. Get to green - all tests passing - then refactor the test. Run the tests; still green? Good. (If not, of course, get green, perhaps by starting over). Perform the changes that made the original unrefactored tests fail. Red? Same failure as before? Then reinstate the working code, and check for green again. Check it in and move onto your next task.
Try to include not only positive cases in your automated test, but also negative cases (and a proper handler for them).
Also, you can try to run your refactored automated test with breakpoints and supervise through the debugger that it keeps on exercising all the paths you intended it to exercise.