How can an uncalled test affect another in Go? - testing

I have a test function TestJobqueue() in https://github.com/VertebrateResequencing/wr/blob/develop/jobqueue/jobqueue_test.go that I can call in isolation: go test -tags netgo ./jobqueue -v -run 'TestJobqueue$'.
I recently started getting test failures related to boltdb (one of my dependencies): either panics with signal SIGBUS: bus error, or ordinary test failures because the database couldn't be opened. But only when working off an NFS-mounted directory. Fair enough; I or boltdb have some kind of NFS-related bug.
But the thing I can't wrap my head around is that I only get these errors when an entirely different test function exists.
As per the comments in TestREST() in https://github.com/VertebrateResequencing/wr/blob/92fb61ccd7819c8f1edfa8cce8468c4250d40ea7/jobqueue/rest_test.go, if that test function calls Serve(serverConfig) (a function in the package being tested, which is also called many times in TestJobqueue() and other test functions), TestJobqueue() fails. If it doesn't, TestJobqueue() passes.
In short, the failure of tests in one test function can be controlled by the value of a boolean in a test function that I'm not running.
How is this possible?
Edit: to address some points brought up by the first answer, TestJobqueue() is being run in isolation; no other test runs before or after it. If the database files already exist, Serve() deletes them first and then creates a new one to run the new set of tests against. The odd thing I'm seeking an answer for is how an unexecuted function can have this side effect. I can demonstrate that it really is unexecuted by beginning or ending TestREST() with a panic call: the output of that panic is never seen, yet the TestJobqueue() failure can still be controlled by the boolean in TestREST() (if the panic comes at the end).
Edit 2: this turns out to be caused by an unusual thing I do in TestJobqueue(): it calls go test on itself. Needless to say, if you do this, strange things can happen...

In short, the failure of tests in one test function can be controlled by the value of a boolean in a test function that I'm not running.
This is not a great summary. Your test starts a server. The other test also starts a server; clearly, that is where the problem lies. You appear to have commented out the bit of code that stops the server at the end of the test? You can't run two servers on the same port.
You probably have a port conflict or some network condition that is triggered by running the two servers at once, because they both appear to use a similar (identical?) config loaded like this:
config := internal.ConfigLoad("development", true)
Running with no config uses default values, avoiding the conflict; running with the config causes the conflict. So to pin it down, try creating a config with one setting at a time until you find the setting that causes the problem (most likely Port or WebPort). Alternatively, make sure the tests stop the server at the end.
[EDIT] Looks like you have narrowed it down to the DBFile config setting by changing one setting at a time. This implies the server starts a new db instance; if both tests try to use the same file for a new db, that would cause contention and the second test to run would fail.
It's not entirely clear from your description what you're doing or what the problem is, so try improving it to state the exact sequence of actions and the resulting failure. If, for example, you have previously run a test which creates a db, it could affect later test runs because of the presence of the db file, so your tests are not completely independent.
[EDIT 2 - after further edits to question]
If commenting out TestREST completely solves your problem (or panicking before it starts does), and given that changing it breaks the other test, you are executing TestREST somehow.
Looking at your code for jobqueue_test, it appears to invoke go test itself, so you might be running more tests than you assume. Given you don't see the panic output, I'd suspect your use of exec.Command in this big test. Try removing bits of the failing test until it works, to narrow down exactly which invocation is running the other test. Calling go test within a test is pretty unusual!
https://github.com/VertebrateResequencing/wr/blob/develop/jobqueue/jobqueue_test.go#L2445
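For illustration, here is a minimal sketch of that failure mode (hypothetical test and file names, not the actual jobqueue code): a test that shells out to go test on its own package re-runs every Test* function of that package in a child process, so code you believe is "not running" executes anyway, and its output and panics are buried in the child's captured output.

// sketch_test.go - hypothetical example, not the real jobqueue tests
package sketch

import (
    "os"
    "os/exec"
    "testing"
)

func TestOuter(t *testing.T) {
    if os.Getenv("SKETCH_CHILD") != "" {
        return // we are inside the child go test; don't recurse
    }
    // Without a -run filter, this child go test executes every Test* function
    // in the package, including TestOther below, even though the parent was
    // started with: go test -run 'TestOuter$'
    cmd := exec.Command("go", "test", "-v", ".")
    cmd.Env = append(os.Environ(), "SKETCH_CHILD=1")
    out, _ := cmd.CombinedOutput()
    t.Logf("child go test output:\n%s", out) // TestOther's output (and any panic) is buried here
}

func TestOther(t *testing.T) {
    // Never requested by the parent's -run pattern, but it still runs, and can
    // still fail or fight over shared files and ports, inside the child process.
}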

Related

go tests: clean up after panic

Suppose that I set up a Docker container with a DB for my tests, and I do that in TestMain because I want this to be done once and globally. I write a defer statement in this TestMain that does the cleanup (namely, removes the DB container).
Now, suppose that something goes wrong and my tests panic. This issue tells me that I can not write custom recover code to ensure that the container is removed. This is true: testing.M.Run() does its own recover() call, and it looks like there's no way to override its behaviour.
The question is: what should I do to make sure that my cleanup code is executed no matter what?
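For concreteness, the setup being described looks roughly like this (a hypothetical sketch; startDBContainer and removeDBContainer stand in for whatever shells out to Docker):

package mypkg_test

import "testing"

func startDBContainer()  { /* hypothetical: docker run ... for the test DB */ }
func removeDBContainer() { /* hypothetical: docker rm -f ... */ }

func TestMain(m *testing.M) {
    startDBContainer()
    defer removeDBContainer() // intended to run however the tests end

    // If a test panics, the process can die without unwinding this defer.
    // Note also that the common os.Exit(m.Run()) variant skips defers even on
    // a clean run; since Go 1.15 TestMain may simply return instead.
    m.Run()
}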
As noted in the issue you linked to:
The panic could come from a goroutine started by a test and the
testing package can't add a defer to those goroutines to catch the
panic.
Additionally, some panics are impossible to recover, e.g. out of
memory or runtime memory corruption.
In short, you cannot make sure that any code will be executed in all circumstances.
If your cleanup is non-critical, you can do it before & after (i.e. at the start of your tests, check if your container already exists and destroy it before creating a new one, then make a best effort to destroy it at the end). If your cleanup is critical, then wrap your go test call with something (e.g. a shell script or makefile) and make the wrapper responsible for the setup & teardown of external dependencies.
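A minimal sketch of that "before & after" approach in a TestMain (hypothetical helper names; the Docker commands are whatever you already use):

package mypkg_test

import (
    "os"
    "testing"
)

func removeDBContainerIfExists() { /* hypothetical: docker rm -f <name>, ignoring "not found" */ }
func startDBContainer()          { /* hypothetical: docker run ... */ }

func TestMain(m *testing.M) {
    // Clean up anything a previous crashed run left behind, then start fresh.
    removeDBContainerIfExists()
    startDBContainer()

    code := m.Run()

    // Best effort only: an unrecovered panic can still skip this, which is
    // why the next run also cleans up first.
    removeDBContainerIfExists()
    os.Exit(code)
}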

Selenium grid runs out of free slots

I have a large suite of SpecFlow tests executing against a selenium grid running locally. The grid has a single host configured for max 10 firefox instances. The tests are run from NUnit serially, so I would only expect to require a single session at a time.
However, when approximately half of the test cases have been run, the console window reporting output from the hub starts reporting
INFO: Node host [url] has no free slots
Why?
All the test cases are associated with a TearDown method that closes and disposes the WebDriver, although I haven't verified that absolutely every test gets to this method without failing. I would expect a maximum of one session to be active at once. How can I find out what is stopping the host from recycling those sessions?
edit #1:
I think I've narrowed down the cause of the issue - it is indeed to do with not closing the WebDriver. There are [AfterScenario] attributes on the teardown methods that are meant to do this, but they only match a subset of scenarios because they have parameters on them. Removing the parameter so that the teardown associates with every scenario fixes the session exhaustion (or seems to), but there are some tests that expect to reacquire an existing session, so I'll have to fix them separately.
A bit of background: This test suite was inherited as part of a 'complete' solution and it's been left untouched and never run since delivery. I'm putting it back into service and have had to discover its quirks as I go - I didn't write any of this. I've had brief encounters with both Selenium and SpecFlow but never used the two together.
The issue turned out to be a facepalm-level fail - mostly in the sense that I didn't spot it. Some logging code was trying to write to a file that wasn't there; the thrown exception bypassed the call to Dispose() on the WebDriver and was then swallowed with no error reporting, so the sessions were left hanging around. Removing the logging code fixed the session exhaustion.
Look on the node (remote desktop) and see what is happening on the box. It does sound like your test isn't closing out its session properly.

PHPUnit & Selenium code coverage - coverage metrics stop halfway through test

I'm just getting started with PHPUnit and Selenium, yet one problem has been bothering me: I can't seem to get correct coverage figures.
My app takes a user through a multi-step process that involves multiple pages, each of which is handled in PHP by a display function (to output HTML) and a processing function (to handle the results of POST operations). My baseline test runs through the entire process, and completes correctly having visited each of about seven pages. I've both verified this visually and through assertions in the testcase itself.
The issue is that the coverage report indicates that only the first couple of functions are executed and that the others are never visited (despite my visual and testcase checks). I thought the problem was a PHP Notice that occurred during the first function and might stop XDebug/PHPUnit from collecting stats, but I fixed this and the problem remains.
Is there anything that can stop collection of coverage statistics mid-way through a test? All the functions in question are in the same file and are called from a (different) central PHP script which chooses which function to call based on an incrementing session variable.

Entity Framework Code First - Tests Overlapping Each Other

My integration tests use a live DB that's generated using the EF initializers. When I run the tests individually they run as expected. However, when I run them all at once, I get a lot of failed tests.
I appear to have some overlapping going on. For example, I have two tests that use the same setup method, which builds and populates the DB. Both tests perform the same "act" step, which adds a handful of items (the same items) to the DB; what's unique is that each test checks different calculations (instead of one big test that does a lot of things).
One way I could solve this is to do some trickery in the setup that creates a unique DB for each test that's run, so that everything stays isolated. However, the EF initialization stuff isn't working when I do that, because it is creating a new DB rather than dropping and replacing it with a new one (the latter triggers the seeding).
Ideas on how to address this? It seems like a question of how I organize my tests... I'm just not sure how best to go about it and was looking for input. I really don't want to have to run each test manually.
Use the test setup and teardown methods provided by your test framework: start a transaction in the setup and roll it back in the teardown (example for NUnit). You can even put the setup and teardown methods in a base class for all tests; each test will then run in its own transaction, which rolls back at the end of the test and returns the database to its initial state.
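The answer above is about NUnit, but the idea translates to any stack; here is the same begin-in-setup, rollback-in-teardown shape as a sketch in Go (assuming database/sql and a db handle opened and seeded elsewhere), purely for illustration:

package orders_test

import (
    "database/sql"
    "testing"
)

// withTx begins a transaction for a single test and always rolls it back, so
// every test sees the seeded data but leaves the database untouched.
func withTx(t *testing.T, db *sql.DB, fn func(tx *sql.Tx)) {
    t.Helper()
    tx, err := db.Begin()
    if err != nil {
        t.Fatalf("begin transaction: %v", err)
    }
    defer tx.Rollback() // the "teardown": restore the initial state
    fn(tx)
}

A test then wraps its body in withTx(t, db, func(tx *sql.Tx) { ... }) and performs all of its inserts and queries through tx.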
In addition to what Ladislav mentioned, you can also use what's called a Delta Assertion.
For example, suppose you test adding a new Order to the SUT.
You could create a test that Asserts that there is exactly 1 Order in the database at the end of the test.
But you can also create a Delta Assertion by first checking how many Orders there are in the database at the start of the test method. Then, after adding an Order to the SUT, you assert that there are NumberOfOrdersAtStart + 1 Orders in the database.
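In sketch form (again in Go rather than EF, just to show the shape; db, countOrders and addOrderToSUT are hypothetical stand-ins):

package orders_test

import (
    "database/sql"
    "testing"
)

var db *sql.DB // opened and seeded elsewhere (e.g. in TestMain); omitted here

// countOrders stands in for a SELECT COUNT(*) against the table under test.
func countOrders(t *testing.T) int {
    t.Helper()
    var n int
    if err := db.QueryRow("SELECT COUNT(*) FROM orders").Scan(&n); err != nil {
        t.Fatalf("count orders: %v", err)
    }
    return n
}

// addOrderToSUT stands in for whatever exercises the system under test.
func addOrderToSUT(t *testing.T) { t.Helper() /* hypothetical */ }

func TestAddOrder(t *testing.T) {
    before := countOrders(t) // snapshot the starting state

    addOrderToSUT(t) // should add exactly one Order

    // Delta assertion: compare against the starting count rather than an
    // absolute number, so data left behind by other tests doesn't matter.
    if got, want := countOrders(t), before+1; got != want {
        t.Fatalf("order count = %d, want %d", got, want)
    }
}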

TeamCity: Managing deployment dependencies for acceptance tests?

I'm trying to configure a set of build configurations in TeamCity 6, modelling a specific requirement in the cleanest possible manner that TeamCity enables.
I have a set of acceptance tests (around 4-8 suites of tests grouped by the functional area of the system they pertain to) that I wish to run in parallel (I'll model them as build configurations so they can be distributed across a set of agents).
From my initial research, it seems that having an AcceptanceTests meta-build config that pulls in the set of individual acceptance test configs via snapshot dependencies should do the trick. Then all I have to do is say that my Commit build config should trigger AcceptanceTests and they'll all get pulled in. So, let's say I also have AcceptanceSuiteA, AcceptanceSuiteB and AcceptanceSuiteC.
So far, so good (I know I could also turn it around the other way and have the Commit config trigger AcceptanceSuiteA, AcceptanceSuiteB and AcceptanceSuiteC - the problem there is that I'd need to manually aggregate the results to determine the overall success of the acceptance tests as a whole).
The complicating bit is that while AcceptanceSuiteC just needs some Commit artifacts and can then live on its own, AcceptanceSuiteA and AcceptanceSuiteB need to:
DeploySite (let's say it takes 2 minutes and I can't afford to spin up a completely isolated one just for this run)
Run tests against the deployed site
The problem is that I need to be able to ensure that:
the website only gets configured once
The website does not get clobbered while the two suites are running
If I set up DeploySite as a build config and have AcceptanceSuiteA and AcceptanceSuiteB pull it in as a snapshot dependency, AFAICT:
a subsequent or parallel run of AcceptanceSuiteB could trigger another DeploySite which would clobber the deployment that AcceptanceSuiteA and/or AcceptanceSuiteB are in the middle of using.
While I can set Limit the number of simultaneously running builds to force only one to happen at a time, I need one at a time and not while the dependent pieces are still running.
Is there a way in TeamCity to model such a hierarchy?
EDIT: Ideas:
A crap solution is that DeploySite could set an 'in use' flag marker and then have the AcceptanceTests config clear that flag [after AcceptanceSuiteA and AcceptanceSuiteB have completed]. The problem then becomes one of having the next DeploySite down the pipeline wait until said gate has been opened again (doing a blocking wait within the build doesn't feel right - I want it to be flagged as 'not yet started' rather than looking like it's taking a long time to do something). However, this sort of "set a flag over here and have this bit check it" approach is exactly the kind of mutable state / flakiness smell I'm trying to get away from.
EDIT 2: if I could programmatically alter the agent configuration, I could set Agent Requirements to require InUse=false and then set the flag when a deploy starts and clear it after the tests have run
It seems you go and look on the JetBrains DevNet forums and the YouTrack tracker first, and remember to use the magic word clobber in your search.
Then you install the groovy-plug plugin and use its StartBuildPrecondition facility:
To use the feature, add system.locks.readLock. or system.locks.writeLock. property to the build configuration.
The build with writeLock will only start when there are no builds running with read or write locks of the same name.
The build with readLock will only start when there are no builds running with write lock of the same name.
Use those locks to manage the fact that the dependent configs 'read' and the DeploySite config 'writes' the shared item.
(This is not a fully productised solution, hence the tracker item remains open.)
EDIT: And I still don't know whether the lock should go under Build Parameters | System Properties, or what the exact name format should be: is it locks.writeLock.MYLOCKNAME (i.e., showing up in the config with the reference syntax %system.locks.writeLock.MYLOCKNAME%)?
Other puzzlers are: how does one manage giving read access to builds triggered by the completion of a writeLock task - does the lock get dropped until the next one picks it up (which would allow another writer in), or is it necessary to have something queue up the parent and child dependency at the same time?