The mystery of stuck inactive msbuild.exe processes, locked Stylecop.dll, Nuget AccessViolationException and CI builds clashing with each other - msbuild

Observations:
On our Jenkins build server, we were seeing lots of msbuild.exe processes (~100) hanging around after job completion, with around 20 MB memory usage and 0% CPU activity.
Builds using different versions of stylecop were intermittently failing:
workspace\packages\StyleCop.MSBuild.4.7.41.0\tools\StyleCop.targets(109,7):
error MSB4131: The "ViolationCount" parameter is not supported by the "StyleCopTask" task.
Verify the parameter exists on the task, and it is a gettable public instance property.
NuGet.exe was intermittently exiting with the following access violation error (0xC0000005):
".\workspace\.nuget\nuget install .\workspace\packages.config -o .\workspace\packages"
exited with code -1073741819.
MsBuild was launched in the following way via a Jenkins Matrix job, with 'BuildInParallel' enabled:
`msbuild /t:%Targets% /m
/p:Client=%Client%;LOCAL_BUILD=%LOCAL_BUILD%;BUILD_NUMBER=%BUILD_NUMBER%;
JOB_NAME=%JOB_NAME%;Env=%Env%;Configuration=%Configuration%;Platform=%Platform%;
Clean=%Clean%; %~dp0\_Jenkins\Build.proj`
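
The NuGet exit code in the observations, -1073741819, is the signed 32-bit form of the Windows access violation status 0xC0000005. As a minimal sketch (the function name is made up for illustration), the conversion can be checked like this:

```python
def exit_code_to_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit process exit code as unsigned hex."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement unsigned 32-bit representation.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(exit_code_to_hex(-1073741819))  # 0xC0000005
```

Build logs often report the signed decimal value, so converting it to hex makes it easier to look up.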

After a lot of digging around and trying various things to no effect, I eventually ended up creating a new minimal solution which reproduced the issue with very little else going on. The issue turned out to be caused by msbuild's multi-core parallelisation - the 'm' parameter.
The 'm' parameter tells msbuild to spawn "nodes". These remain alive after the build has ended, and are then re-used by new builds!
The StyleCop 'ViolationCount' error was caused by a given build re-using an old version of the stylecop.dll from another build's workspace, where ViolationCount was not supported. This was odd, because the CI workspace only contained the new version. It seems that once the StyleCop.dll was loaded into a given MsBuild node, it would remain loaded for the next build. I can only assume this is because StyleCop loads some sort of singleton into the node's process? This also explains the file-locking between builds.
The NuGet access violation crash has now gone (with no other changes), so it was evidently related to the same node re-use issue.
As the 'm' parameter defaults to the number of cores, we were seeing 24 msbuild instances created on our build server for a given job.
The following posts were helpful:
msbuild.exe staying open, locking files
http://www.hanselman.com/blog/FasterBuildsWithMSBuildUsingParallelBuildsAndMulticoreCPUs.aspx
http://stylecop.codeplex.com/discussions/394606
https://github.com/Glimpse/Glimpse/issues/115
http://msdn.microsoft.com/en-us/library/vstudio/ms164311.aspx
The fix:
Add the line `set MSBUILDDISABLENODEREUSE=1` to the batch file which launches msbuild
Launch msbuild with /m:4 /nr:false
The 'nr' parameter tells msbuild not to use "Node Reuse", so msbuild instances are closed once the build completes and no longer clash with each other, which is what was causing the errors above.
The 'm' parameter is set to 4 to stop too many nodes spawning per job.
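
If the build is launched from a script rather than a batch file, the same fix can be sketched in Python. This is only an illustration: the function name, the project path and the target name are assumptions. The `/m:4` and `/nr:false` switches and the MSBUILDDISABLENODEREUSE variable are the ones described above.

```python
import os


def msbuild_invocation(project, targets, max_nodes=4, base_env=None):
    """Return (args, env) for an msbuild run with node reuse disabled."""
    # Copy the environment so the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["MSBUILDDISABLENODEREUSE"] = "1"
    args = [
        "msbuild",
        project,
        f"/t:{targets}",
        f"/m:{max_nodes}",  # cap the number of nodes spawned per job
        "/nr:false",        # do not keep nodes alive for later builds
    ]
    return args, env


args, env = msbuild_invocation(r"_Jenkins\Build.proj", "Build")
print(" ".join(args))
```

The result can be handed to `subprocess.run(args, env=env, check=True)` on a machine where msbuild is on the PATH.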

I had the same issue. One old reference I found was in the csproj files:
<PropertyGroup>
<StyleCopMSBuildTargetsFile>..\packages\StyleCop.MSBuild.4.7.48.0\tools\StyleCop.targets</StyleCopMSBuildTargetsFile>
</PropertyGroup>
Also, after closing Visual Studio, I deleted the entire "Packages" folder that sits in the same folder as the sln file. That made VS rebuild the folder and let go of its cached copy of the old version of StyleCop.

I've had the same issue for a while: builds were taking over 6 minutes to finish. After some digging I found out that node reuse was at fault, and adding /m:4 /nr:false fixed my issue immediately.

Related

console application is not building in vs 2019

I am constantly getting this error in debug mode.
Severity: Error
Code: MSB3027
Description: Could not copy "C:\Users\N3617\Source\Repos\Core\CoreData\ConsoleApp1\obj\Debug\netcoreapp3.1\ConsoleApp1.exe" to "bin\Debug\netcoreapp3.1\ConsoleApp1.exe". Exceeded retry count of 10. Failed. The file is locked by: "ConsoleApp1 (1080)"
Project: ConsoleApp1
File: C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets
Line: 4643
I had to restart the system to get rid of this error. Can anyone tell me why this is happening?
This looks like a long-standing problem: the output file is locked by another process (here, a ConsoleApp1 process that is still running). You can see this similar issue.
The cause is complex to explain, but you can try the following steps the next time you hit it:
Suggestion
1) Open Task Manager and end the ConsoleApp1.exe process, any dotnet process, .NET Core Host process or similar process whenever you hit this issue, then build your project again.
2) Close VS, delete the hidden .vs folder under the solution folder along with the bin and obj folders, then restart VS.
3) Go to Tools-->Options-->Projects and Solutions-->Build and Run and set the maximum number of parallel project builds to 1.
4) Uncheck the option Use Managed Compatibility Mode under Tools-->Options-->Debugging-->General.
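
Step 2 above can be scripted. The following is a rough sketch, assuming Visual Studio is already closed and that every folder named bin or obj under the solution is build output (which may not hold for every repository). The function name is made up for illustration.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(solution_dir):
    """Delete the .vs folder and all bin/obj folders under solution_dir."""
    root = Path(solution_dir)
    # Collect candidates first so we do not modify the tree while walking it.
    candidates = [root / ".vs"]
    candidates += [
        p for p in root.rglob("*")
        if p.is_dir() and p.name in ("bin", "obj")
    ]
    removed = []
    for path in candidates:
        # A nested candidate may already be gone with its parent.
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

Calling `clean_build_artifacts(r"C:\path\to\solution")` returns the list of folders it removed. If a file is still locked by a running process, `shutil.rmtree` will raise an error, which points back to step 1.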

TFS build server duplicate workspaces

Exception Message: Unable to create the workspace '9_20_NAME' due to a mapping conflict. You may need to manually delete an old workspace. You can get a list of workspaces on a computer with the command 'tf workspaces /computer:%COMPUTERNAME%'.
Details: The path D:\Builds\NAME is already mapped in workspace 9_22_NAME. (type MappingConflictException)
Exception Stack Trace: at Microsoft.TeamFoundation.Build.Workflow.Activities.TfCreateWorkspace.Execute(CodeActivityContext context)
at System.Activities.CodeActivity`1.InternalExecute(ActivityInstance instance, ActivityExecutor executor, BookmarkManager bookmarkManager)
at System.Activities.Runtime.ActivityExecutor.ExecuteActivityWorkItem.ExecuteBody(ActivityExecutor executor, BookmarkManager bookmarkManager, Location resultLocation)
So the above has been plaguing me for just over a week now. On the surface it seems like a simple issue: delete or rename the workspaces and move on. However, this issue won't shift that easily.
In short I have tried the following:
Cleared Workspaces
Created new build definitions
Moved the build folder location (e.g. D:\builds\name to D:\builds\name-2)
Build machine restart
Uninstalled / Reinstalled TFS (2013 update 3)
Rebuilt the build machine and restored the TFS database
I've pretty much narrowed down the issue to something within TFS itself, but try as I might I cannot find out what.
It's worth noting that when I delete the workspaces (using TFS Sidekicks) the builds will run up to a handful of times. I've not narrowed down exactly what causes the change from success to failure; however, I can delete all the workspaces and run the builds a couple of times without issue, and then suddenly this comes back (around 2-3 builds before constant recurring failure).
My solution was to edit my build definitions > Source Settings > Build Agent Folder and change this from a hard-coded value to $(SourceDir).
A team member pointed me to this answer, but I'm none the wiser as to why this setting would cause this behavior.
You will need to go to the build machine, search for the old workspace that uses the same build definition name, and delete it so that the build can create a new workspace with the same name again. Check this blog: https://mohamedradwan.wordpress.com/2015/08/25/unable-to-create-the-workspace-due-to-a-mapping-conflict/
Also, try renaming your build definition to something unique to see whether this fixes the issue. http://blog.casavian.eu/2014/04/02/build-workspace-issue/
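
To find which old workspace to delete, the two workspace names and the conflicting path can be pulled out of the exception text. This is a small sketch that assumes the message has the same shape as the one quoted in the question. The function and key names are made up for illustration.

```python
import re

# Matches the "Unable to create the workspace ..." message quoted above.
_CONFLICT = re.compile(
    r"Unable to create the workspace '(?P<requested>[^']+)'.*?"
    r"The path (?P<path>.+?) is already mapped in workspace "
    r"(?P<existing>[^.\s]+)\.",
    re.DOTALL,
)


def parse_mapping_conflict(message):
    """Return requested workspace, mapped path and existing workspace."""
    match = _CONFLICT.search(message)
    return match.groupdict() if match else None


message = (
    r"Unable to create the workspace '9_20_NAME' due to a mapping conflict. "
    r"Details: The path D:\Builds\NAME is already mapped in workspace "
    r"9_22_NAME. (type MappingConflictException)"
)
print(parse_mapping_conflict(message))
```

The `existing` value is the workspace to look for in the output of 'tf workspaces /computer:%COMPUTERNAME%'.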

dnu restore fails on mac

I downloaded Visual Studio Code for Mac today. I tried to create a simple ASP.NET 5 web application following these instructions: https://code.visualstudio.com/Docs/ASPnet5
When I open my web application folder in Visual Studio Code, it says I need to run a restore command.
I ran the dnu restore command just like the instructions tell me but it seems to always fail.
I receive different errors every time I run it. But most of them are like this one:
CACHE https://www.nuget.org/api/v2/package/System.Threading/4.0.10-beta-22816
SharpCompress.Common.ArchiveException: Could not find Zip file Directory at the end of the file. File may be corrupted.
Restore failed
There is a stack trace as well, but for brevity's sake I'll omit it for now.
Has anyone experienced this?
Try dnu restore --no-cache.
You may also need to remove previously downloaded files: check ~/.dnx/packages. I removed all files from that folder some time before trying the above. Also, as noted in the comments, if ~/.dnx/runtimes contains unexpected versions, removing them may also work. Note that the current runtime version can be controlled using dnvm.
I never saw the NullReference exception, but I was getting the SharpCompress.Common.ArchiveException. I suspect there was a mismatch between what dnu thought the cache state was and the actual cache state (maybe something timed out the first time).
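
The "Could not find Zip file Directory at the end of the file" message suggests a truncated or otherwise damaged package archive. As a rough sketch, the packages folder can be scanned for .nupkg files that are not valid zip archives before deleting anything. The function name is made up, and it is an assumption that the damaged files sit under ~/.dnx/packages rather than in a separate download cache.

```python
import zipfile
from pathlib import Path


def find_corrupt_packages(cache_dir):
    """Return .nupkg files under cache_dir that are not readable zip files."""
    root = Path(cache_dir).expanduser()
    # is_zipfile looks for the end-of-central-directory record, which is
    # exactly what a truncated download is missing.
    return sorted(
        p for p in root.rglob("*.nupkg") if not zipfile.is_zipfile(p)
    )
```

For example, `find_corrupt_packages("~/.dnx/packages")` lists the candidates to delete before running dnu restore --no-cache again.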

mule-deploy.properties over written when I choose Run As "Mule Application" Anypoint Studio July 2014 Release Build Id: 201407311443

Something strange is happening in a Mule project. My application XML is JPC.xml, which normally appears in mule-deploy.properties as follows:
redeployment.enabled=true
encoding=UTF-8
config.resources=JPC.xml
domain=default
When I choose Run As > Mule Application, a build kicks off in the background prior to the deploy and run. During that time mule-deploy.properties becomes:
redeployment.enabled=true
encoding=UTF-8
config.resources=
domain=default
And when the application runs, it says it is missing mule-config.xml.
What is erasing it?
I think I may have found the root of this issue.
It seems to be a bug in JDK 1.7.0_45 to do with XML parsing. See: What's causing these ParseError exceptions when reading off an AWS SQS queue in my Storm cluster
I noticed several errors logged in Eclipse/Anypoint as:
!ENTRY org.mule.tooling.core 4 0 2014-11-19 14:16:41.081
!MESSAGE Error opening resource measurement_scheduler.xml
!STACK 0
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.
I also noticed that after restarting Anypoint, I would be able to build with Maven successfully and my mule-deploy.properties file would again have content. Then, at some point after several edits in Anypoint, an mvn build would again wipe out the contents of mule-deploy.properties.
I further noticed that once this problem started happening in one project in Anypoint, it would ALSO start happening in ANY project I built in Anypoint, until Anypoint was restarted.
It seems this bug in JDK 1.7.0_45 mistakenly applies the XML parser's limit to all opened files cumulatively, instead of per file. I suspect this stops Anypoint from parsing all of the XML documents that make up my app, so it couldn't re-create mule-deploy.properties and left it blank.
Upgrading to a newer JDK would fix this.
Another way to work around it is to override the XML parser's limits by adding the following to ${JAVA_HOME}/jre/lib/jaxp.properties:
jdk.xml.entityExpansionLimit=0
jdk.xml.maxGeneralEntitySizeLimit=0
I am not certain that both limits need to be set for the workaround. Possibly only entityExpansionLimit is needed.
After making this change I am now happily able to use Anypoint again. Beware that this workaround possibly opens you up to a denial-of-service attack through the XML parser if the same JRE is used for other, less trusted processes.
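
To catch the blanked file before a deploy, a build step could check that config.resources still lists at least one file. This is only a sketch: the function names are made up, the parser handles just the simple key=value lines shown in the question, and treating the value as a comma-separated list is an assumption.

```python
def read_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def config_resources(text):
    """Return the config files named in config.resources, if any."""
    value = read_properties(text).get("config.resources", "")
    return [name.strip() for name in value.split(",") if name.strip()]


wiped = "redeployment.enabled=true\nencoding=UTF-8\nconfig.resources=\ndomain=default\n"
print(config_resources(wiped))  # []
```

An empty list means the file has been wiped and Anypoint probably needs a restart, or the JDK workaround above, before the next build.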

help building castle dynamic proxy

So I pulled the source from https://svn.castleproject.org/svn/castle/DynamicProxy/trunk/
and opened it in VS.NET 2008.
Problems:
VS.NET can't open the assembly.cs
assembly signing failed
What am I doing, or rather NOT doing?
Update
So I downloaded NAnt and set up the .bat file in my PATH so it works from the command prompt.
I ran:
nant default.build
Getting this error:
build failed, \buildscripts\common-project.xml (48,3)
invalid element . Unknown task or datatype.
How exactly do I build the dynamicProxy project now?
Update
This is what I did, see screenshot:
http://img697.imageshack.us/img697/5623/castlebuildscreenshot.png
Oh, and my nant .bat file is:
@echo off
"E:\dev\tools\nant-bin\nant-0.86-nightly-2009-05-05\bin\Nant.exe" %*
You can read the FM (how to build.txt). :)
You need to run the build script first using NAnt (http://nant.sf.net). This will generate the assembly.cs file. Take a look at the .build files in the tree to see what they are doing.
As for the assembly signing failing, check the project settings to get rid of references to CastleKey.snk. It should sign it using DynProxy.snk (in theory).
UPDATE:
The issue with NUnit is now fixed. Do a clean check out. I really have no idea why you're getting that error. Which version of NAnt are you using? Make sure you have the latest (earlier versions do not support .NET 3.5).
You should be able to just pull the source from the trunk, and build with nant (I just did that and it worked). Ok, I lied, looks like the reference to NUnit is wrong, so the unit test project will not build correctly:
BUILD FAILED - 0 non-fatal error(s), 1 warning(s)
D:\OLD\DynamicProxy\buildscripts\common-project.xml(295,5):
'nunit-console.exe' failed to start.
The system cannot find the file specified
Total time: 1.2 seconds.
BUILD FAILED
Nested build failed. Refer to build
log for exact reason.
Total time: 3.4 seconds.
However, the important stuff (assemblyinfo generation) will succeed and you should be able to just open Castle.DynamicProxy2-vs2008.sln, fix the reference to the NUnit assembly, hit F5 and build the code with no issues.
I just did it on a clean check out, and it worked.
Generally, if you're planning to make modifications to the DP codebase, it is advised to go to the Castle user group first and discuss them there.