CruiseControl.Net and NUnit - tests finish, but task doesn't? - msbuild

I've got a CruiseControl.Net setup using Nant to clean the previous logs, and then it kicks off a msbuild of a VS project, finally running nunit-console to execute the tests.
It seems to build for a few seconds (fine) and then hops on to running the 600 tests, which takes about a minute. However, even though the log files are there, it sits there doing 'nothing' for 10 minutes, at which point the build times out and the process exits. The CruiseControl.NET webpage then shows the result as failed, with an exception:
ThoughtWorks.CruiseControl.Core.Tasks.BuilderException: Command Line Build timed out (after 600 seconds)
at ThoughtWorks.CruiseControl.Core.Tasks.ExecutableTask.Execute(IIntegrationResult result)
at ThoughtWorks.CruiseControl.Core.Tasks.TaskBase.Run(IIntegrationResult result)
at ThoughtWorks.CruiseControl.Core.Project.RunTask(ITask task, IIntegrationResult result, Boolean isPublisher)
at ThoughtWorks.CruiseControl.Core.Project.RunTasks(IIntegrationResult result, IList tasksToRun, Dictionary`2 parameterValues)
at ThoughtWorks.CruiseControl.Core.Project.Run(IIntegrationResult result)
at ThoughtWorks.CruiseControl.Core.IntegrationRunner.Build(IIntegrationResult result)
at ThoughtWorks.CruiseControl.Core.IntegrationRunner.Integrate(IntegrationRequest request) BaseDirectory: , Executable: C:\Program Files\NUnit 2.5.8\bin\net-2.0\nunit-console.exe
The ccnet.config script is below. I've tried changing the timeout to 3 minutes just in case that was something to do with it, but even if that did work (it didn't) it's a dodgy hack, as by rights when the tests are finished running, they should exit gracefully!
I've run the command at the commandline and confirmed it only takes about a minute to run. Any theories?
<cruisecontrol xmlns:cb="urn:ccnet.config.builder">
<project name="CodeTests">
<workingDirectory>C:\Source\Wholesale\Comp.EventControl.TestingFramework\</workingDirectory>
<artifactDirectory>C:\Source\Wholesale\Comp.EventControl.TestingFramework\</artifactDirectory>
<prebuild>
<!-- clean nunit output to avoid CCNET reporting
about previous build tests if current build fails -->
<nant>
<executable>C:\Nant\bin\nant.exe
</executable>
<baseDirectory>C:\Source\Wholesale\Comp.EventControl.TestingFramework</baseDirectory>
<nologo>false</nologo>
<buildFile>nant.build</buildFile>
<targetList>
<target>cleanNunit</target>
</targetList>
</nant>
</prebuild>
<tasks>
<msbuild>
<executable>C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319\MSBuild.exe
</executable>
<workingDirectory>C:\Source\Wholesale\Comp.EventControl.TestingFramework\CodeReboot
</workingDirectory>
<projectFile>CodeReboot.sln</projectFile >
<buildArgs>/noconsolelogger
/v:quiet
/noconlog
/p:Configuration=Debug
/p:ReferencePath="C:\Program Files\NUnit 2.5.8\bin;C:\Program Files\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0"
/p:AdditionalReferencePath="C:\Program Files\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0"
</buildArgs>
<targets>ReBuild</targets >
<timeout>180</timeout >
<logger>c:\Program Files\CruiseControl.NET\server\Rodemeyer.MsBuildToCCNet.dll</logger>
</msbuild>
<exec>
<executable>C:\Program Files\NUnit 2.5.8\bin\net-2.0\nunit-console.exe
</executable >
<buildArgs>/xml:C:\Source\Wholesale\Comp.EventControl.TestingFramework\nunit-results.xml
/nologo C:\Source\Wholesale\Comp.EventControl.TestingFramework\CodeReboot\CodeReboot\bin\Debug\CodeReboot.dll
</buildArgs>
</exec>
</tasks>
<publishers>
<merge>
<files>
<file>C:\Source\Wholesale\Comp.EventControl.TestingFramework\nunit-results.xml
</file>
</files>
</merge>
<xmllogger />
<statistics />
<artifactcleanup cleanUpMethod="KeepLastXBuilds"
cleanUpValue="20" />
</publishers>
</project>
</cruisecontrol>

What version of NUnit are you using? There were some problems in versions 2.5.7/2.5.8 that cause nunit-agent to hang at the end of a test run. I had this problem, reverted to an older version of NUnit, and the hang went away. The release notes for 2.5.9 list "602761 nunit-agent hangs after tests complete" as fixed. I have not upgraded to 2.5.9 yet, but that may fix your problem.

The first thing you should do is try to replicate the problem locally. Do your tests run properly and exit cleanly? Run them locally and see if the nunit process hangs around.
If your tests run fine locally, try and figure out which test the build server is stalling on or whether it is completing all of the tests. If your build scripts cannot be run locally, it's a bit harder to diagnose.
Back in the day when our tests were riddled with threaded code, it was nearly always the case that a test (or the code under test) was creating threads and then failing to shut them down, causing NUnit to hang around. Threading issues are irritating because they're non-deterministic and may only be visible when run on machines with more cores (like build servers).
If you have tests with threading / external dependencies, I'd try disabling those first (the fact that 600 tests takes a minute to run indicates that external dependencies and/or threading is involved, as a unit test usually takes ~1ms to run).

Related

The mystery of stuck inactive msbuild.exe processes, locked Stylecop.dll, Nuget AccessViolationException and CI builds clashing with each other

Observations:
On our Jenkins build server, we were seeing lots of msbuild.exe processes (~100) hanging around after job completion with around 20mb memory usage and 0% CPU activity.
Builds using different versions of stylecop were intermittently failing:
workspace\packages\StyleCop.MSBuild.4.7.41.0\tools\StyleCop.targets(109,7):
error MSB4131: The "ViolationCount" parameter is not supported by the "StyleCopTask" task.
Verify the parameter exists on the task, and it is a gettable public instance property.
Nuget.exe was intermittently exiting with the following access violation error (0xC0000005):
".\workspace\.nuget\nuget install .\workspace\packages.config -o .\workspace\packages"
exited with code -1073741819.
MsBuild was launched in the following way via a Jenkins Matrix job, with 'BuildInParallel' enabled:
`msbuild /t:%Targets% /m
/p:Client=%Client%;LOCAL_BUILD=%LOCAL_BUILD%;BUILD_NUMBER=%BUILD_NUMBER%;
JOB_NAME=%JOB_NAME%;Env=%Env%;Configuration=%Configuration%;Platform=%Platform%;
Clean=%Clean%; %~dp0\_Jenkins\Build.proj`
After a lot of digging around and trying various things to no effect, I eventually ended up creating a new minimal solution which reproduced the issue with very little else going on. The issue turned out to be caused by msbuild's multi-core parallelisation - the 'm' parameter.
The 'm' parameter tells msbuild to spawn "nodes". These remain alive after the build has ended and are then re-used by new builds!
The StyleCop 'ViolationCount' error was caused by a given build re-using an old version of StyleCop.dll from another build's workspace, where ViolationCount was not supported. This was odd, because the CI workspace only contained the new version. It seems that once StyleCop.dll was loaded into a given MsBuild node, it would remain loaded for the next build. I can only assume this is because StyleCop loads some sort of singleton into the node's process? This also explains the file-locking between builds.
The nuget access violation crash has now gone (with no other changes), so is evidently related to the above node re-use issue.
As the 'm' parameter defaults to the number of cores - we were seeing 24 msbuild instances created on our build server for a given job.
The following posts were helpful:
msbuild.exe staying open, locking files
http://www.hanselman.com/blog/FasterBuildsWithMSBuildUsingParallelBuildsAndMulticoreCPUs.aspx
http://stylecop.codeplex.com/discussions/394606
https://github.com/Glimpse/Glimpse/issues/115
http://msdn.microsoft.com/en-us/library/vstudio/ms164311.aspx
The fix:
Add the line set MSBUILDDISABLENODEREUSE=1 to the batch file which launches msbuild.
Launch msbuild with /m:4 /nr:false
The 'nr' parameter tells msbuild not to use "Node Reuse", so msbuild instances are closed once the build completes and no longer clash with each other, which is what was causing the errors above.
The 'm' parameter is set to 4 to stop too many nodes spawning per job.
I had the same issue. One old reference I found was in csproj files
<PropertyGroup>
<StyleCopMSBuildTargetsFile>..\packages\StyleCop.MSBuild.4.7.48.0\tools\StyleCop.targets</StyleCopMSBuildTargetsFile>
Also, after closing Visual Studio, I deleted the entire "Packages" folder that sits in the same folder as the sln file. That made VS rebuild the folder and let go of the cached old version of StyleCop.
I had the same issue for a while: builds were taking over 6 minutes to finish. After some digging I found out that node reuse was at fault, and adding /m:4 /nr:false fixed it immediately.

Running Multiple Scripts in Sahi

I want to run all the scripts one by one sequentially. I've created a suite file and included scripts in suite. When I run a suite, the scripts run in parallel in multiple browsers. I would like to run them one after the another in a single browser.
You can run a suite file from a single browser by changing the Threads in the ant target.
<target name="runbrowsertests">
<sahi suite="../userdata/scripts/demo/demo.suite"
browserType="firefox"
baseurl="http://sahi.co.in/demo/"
sahihost="localhost"
sahiport="9999"
failureproperty="sahi.failed"
haltonfailure="false"
threads="1"
>
<report type="html" />
</sahi>
</target>
If it still doesn't work, edit browser_types.xml (Click "configure" link on dashboard). Change <capacity> to 1 for the browser that you want to run the tests with. Restart Sahi.
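For reference, the relevant part of a browser_types.xml entry might look roughly like the fragment below. This is a sketch of the usual layout, not a copy of any particular install: keep the other child elements of your existing entry exactly as they are and change only `<capacity>`.

```xml
<!-- Fragment of browser_types.xml; only <capacity> is changed. -->
<browserType>
  <name>firefox</name>
  <!-- ...other elements of the existing entry left unchanged... -->
  <!-- Upper limit on how many instances of this browser Sahi starts at once -->
  <capacity>1</capacity>
</browserType>
```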

MSTest can not publish results. Says that platform and flavour are not right and they are

We are trying to execute unit tests with MSTest from the command line and publish the results to the TFS server. The problem is that MSTest always returns:
Publishing results of test run buildmachine#XXX-XXXXXXX 2010-12-16 11:39:13_Release_Any
CPU to http://xxxx:8080/Build/v1.0/PublishTestResultsBuildService2.asmx...
.........................................................................................
Build 'xxx>xxx>x>x>x>xxxx>xxxx>x.x.x.xxx' does not include the specified
configuration ('Release/Any CPU').
The problem is that the specified configuration should exist. We built with the following MSBuild settings:
<ConfigurationToBuild Include="Release|Any CPU">
<FlavorToBuild>Release</FlavorToBuild>
<PlatformToBuild>Any CPU</PlatformToBuild>
</ConfigurationToBuild>
Any idea? I'm starting to be fed up with this.
The problem was that the build hadn't succeeded, so MSTest couldn't publish the results. The error message could be better ...

nant vs. msbuild: stopping a service

I'm trying to decide which side I'm on in the MsBuild vs. Nant war. I'm starting with: stop a service, deploy some files, restart the service. Just from looking at these two links, that is much easier to do in Nant.
MSBuild: Example of using Service Exists MSBuild task in Microsoft.Sdc.Tasks?
<target name="service_exists">
<script language="C#">
<references>
<include name="System.ServiceProcess.dll" />
</references>
<code><![CDATA[
public static void ScriptMain(Project project) {
String serviceName = project.Properties["service.name"];
project.Properties["service.exists"] = "false";
project.Properties["service.running"] = "false";
System.ServiceProcess.ServiceController[] scServices;
scServices = System.ServiceProcess.ServiceController.GetServices();
foreach (System.ServiceProcess.ServiceController scTemp in scServices)
{
etc...
Nant: http://ryepup.unwashedmeme.com/blog/2007/01/04/restart-a-windows-service-remotely/
<!-- Send the stop request -->
<exec program="sc.exe">
<arg line="\\server stop shibd_Default"/>
</exec>
<!-- Sleep a little bit, to give the service a chance to stop -->
<sleep seconds="5"/>
<!-- Send the start request -->
<exec program="sc.exe">
<arg line="\\server start shibd_Default"/>
</exec>
I wonder if the SO community agrees with me. Is it much easier to get basic things like this done in Nant? Sure looks that way. C# code in a CDATA block? WTF?
Our current build process is a) lots of bat files b) lots of cursing. I'd really like to find a good replacement, but that MsBuild stuff looks like a world of pain to my eyes. I'm thinking the way to go is to build scripts in Nant, then use MsBuild to do any .NET builds that need to be done.
One important question: which one is better at catching errors in the script before the script is run? I was thinking of rolling my own here and that was very important part of it: line up all your data and make sure that it makes sense before attempting to run.
In msbuild you could also use the ServiceController task that is packaged in the msbuild community tasks.
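A minimal sketch of that approach, assuming the MSBuild Community Tasks are installed in their default location under $(MSBuildExtensionsPath). The service name is the one from the question, the target name is made up for illustration, and this version acts on the local machine only:

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"
         DefaultTargets="RedeployService">
  <!-- Assumed install path of the MSBuild Community Tasks -->
  <Import Project="$(MSBuildExtensionsPath)\MSBuildCommunityTasks\MSBuild.Community.Tasks.Targets" />

  <!-- "RedeployService" is a hypothetical target name -->
  <Target Name="RedeployService">
    <ServiceController ServiceName="shibd_Default" Action="Stop" />
    <!-- copy/deploy files here -->
    <ServiceController ServiceName="shibd_Default" Action="Start" />
  </Target>
</Project>
```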
You can execute sc.exe using MSBuild every bit as easily ...
<Exec Command="sc.exe \\server stop shibd_Default" />
By default this will "fail" if the exit code (of sc.exe) is non-zero, but that can be customized.
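As a rough illustration of the "can be customized" part: the Exec task can be told to ignore the exit code and hand it back as a property, so the script decides what to do with it. The target and property names below are made up for the example.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"
         DefaultTargets="StopService">
  <Target Name="StopService">
    <!-- IgnoreExitCode keeps a non-zero exit code from failing the build -->
    <Exec Command="sc.exe \\server stop shibd_Default" IgnoreExitCode="true">
      <!-- Capture the exit code in a property (hypothetical name) -->
      <Output TaskParameter="ExitCode" PropertyName="ScExitCode" />
    </Exec>
    <Message Text="sc.exe exited with code $(ScExitCode)" />
  </Target>
</Project>
```

From there a later task can be conditioned on $(ScExitCode), for example to fail only on codes you consider real errors.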
With Nant, there are 2 other ways to stop a service, and one is able to track an error.
First one (using Net Stop):
<exec program="net" failonerror="false"><arg value="stop"/><arg value="${serviceName}"/></exec>
Second one (much cleaner):
<servicecontroller action="Stop" service="${serviceName}" if="${service::is-installed(serviceName,'.') and service::is-running(serviceName,'.')}" />
Note that the second one first checks that the service is installed and running, which makes it easier to track down any unexpected error.
In addition to @nulpptr's answer for MSBuild, if you don't have the option of using the community tasks, you might have to resort to a hack to wait for your service to stop before moving on. If you have the resource kit you can use the EXEC task with the sleep command.
No resource kit? Use the ping trick...
However, if you don't have the resource kit, you can use the ping trick to force a delay. For instance, the following will stop your service using the sc command, and then pause for about 5 seconds:
<Exec Command="sc.exe \\server stop shibd_Default" ContinueOnError="true" />
<Exec Command="ping 127.0.0.1 -n 5 > nul" ContinueOnError="true" />

Process timeout without showing any error in test execution using cc.net

NUnit tests fail when run through cc.net, reporting a process timeout: "Process has been killed".
Everything works fine when run through NUnit or VS.
Also, cc.net then shows the results of the previous build, even if the build is a clean one.
Any help, please?
The default timeout is 600 seconds. If your tests start to exceed that, the build will fail with no indication of why. You may need to raise the timeout on your cc.net nunit task.
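A sketch of where the timeout goes, assuming a ccnet.config that uses either the dedicated nunit task or a plain exec task. The paths and the 1200-second value are placeholders, not taken from the question; use one option or the other.

```xml
<tasks>
  <!-- Option 1: the nunit task; <timeout> is in seconds -->
  <nunit>
    <path>C:\Program Files\NUnit 2.5.8\bin\net-2.0\nunit-console.exe</path>
    <assemblies>
      <!-- hypothetical test assembly path -->
      <assembly>C:\build\MyTests\bin\Debug\MyTests.dll</assembly>
    </assemblies>
    <timeout>1200</timeout>
  </nunit>

  <!-- Option 2: nunit-console run through exec; the element has a different name here -->
  <exec>
    <executable>C:\Program Files\NUnit 2.5.8\bin\net-2.0\nunit-console.exe</executable>
    <buildArgs>/nologo C:\build\MyTests\bin\Debug\MyTests.dll</buildArgs>
    <buildTimeoutSeconds>1200</buildTimeoutSeconds>
  </exec>
</tasks>
```

Note that a timeout set on another task, such as msbuild, does not carry over to the task that actually runs the tests.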
If you are seeing the results from a previous build, it is probably because you are not deleting the results from your previous build.
For example, my NUnit test results are written to files with the name {foo}-results.xml:
<publishers>
<merge>
<files>
<file>bin\debug\*-results.xml</file>
</files>
</merge>
</publishers>
In my tasks, I have a step in my build file that deletes the entire "bin\debug" directory so that my results are always the current ones.
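Something along these lines in a NAnt build file would do it. This is only a sketch: the project and target names are made up, and the directory is assumed to be relative to the build file.

```xml
<project name="example" default="cleanTestResults">
  <!-- Hypothetical target: remove old output so stale *-results.xml files cannot be merged -->
  <target name="cleanTestResults">
    <!-- failonerror="false" so a missing directory on the first build is not an error -->
    <delete dir="bin/debug" failonerror="false" />
  </target>
</project>
```

Running such a target before the build (for example from a prebuild step) means a failed compile leaves no old result files behind for the merge publisher to pick up.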
One possibility is that you have a permission issue. CruiseControl is perhaps running under a service account and has different permissions than your user account (which I'm assuming you use to manually run the tests.) Try logging into the machine as the service account, then see if you can run the unit tests through VS or NUnit.
I've seen this happen if a test has an assertion, e.g. Debug.Assert(something here). When this happens to me in CC.Net, the CC.Net build pops up a message box for the assertion. Since no one closes out the message box on the build server, the NUnit test times out.
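If that turns out to be the cause, one way to stop the dialog from appearing is to turn off the assert UI in the application configuration. A minimal sketch follows; which config file is actually read depends on how the tests are hosted (for example the test assembly's own .config file or the runner's), so treat the location as an assumption to verify.

```xml
<configuration>
  <system.diagnostics>
    <!-- Do not show a message box when Debug.Assert or Trace.Assert fails -->
    <assert assertuienabled="false" />
  </system.diagnostics>
</configuration>
```

With the UI disabled, a failed Debug.Assert is only written to the trace output and execution continues, so it will not by itself fail the test.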