NUnit async test causing AppDomainUnloadedException - wcf

I have a .NET 4.5 WCF service with async operations. I have integration tests which construct the service host using NetNamedPipeBinding and hit the operation via a client.
However, each test like this always causes NUnit to report the following:
System.AppDomainUnloadedException: Attempted to access an unloaded AppDomain.
This can happen if the test(s) started a thread but did not stop it.
Make sure that all the threads started by the test(s) are stopped before completion.
Everything looks ok to me. Can anyone see what might be causing this? I have a complete code sample on GitHub: https://github.com/devlife/codesamples

I'm having the same problem. It looks as if the issue is "lingering" completion port threads (in the ThreadPool) that have been used by WCF to handle async IO.
When ServiceHost.Close() is called, it signals all those threads that work is done, but they won't go away immediately; that is, they may outlive the end of the ServiceHost.Close() operation. Thus, the "shutdown" procedure races with the actual AppDomain unload that NUnit performs at the end of the test run.
Basically, a simple Thread.Sleep(<a couple of seconds>) after a ServiceHost.Close() "fixes" this :-)
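In code, that crude workaround looks something like this (the two-second grace period is as arbitrary as it sounds):

serviceHost.Close();
// Give WCF's IO completion port threads a moment to drain before NUnit
// unloads the test AppDomain. The delay is a guess, not a guarantee.
Thread.Sleep(TimeSpan.FromSeconds(2));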
After much searching around on the internet I couldn't find a robust solution for this issue, short of having some way to suppress the warning itself (for a selection of similar issues, not all due to the same cause, google "unit test appdomainunloadedexception").
I tried different bindings and transports (including the NullTransport), but to no avail.
In the end I settled with this "solution":
static void PreventPrematureAppDomainUnloadHack()
{
    //
    // When NUnit unloads the test AppDomain, the WCF-started IO completion port threads might
    // not have exited yet.
    // That leads to AppDomainUnloadedExceptions being raised after all is said and done.
    // While native NUnit, ReSharper or TestDriven.NET runners don't show these, VSTest (and
    // TFS-Build) does, resulting in very annoying noise in the form of build/test warnings.
    //
    // The following code _attempts_ to wait for all completion port threads to end. This is not
    // an exact thing one can do, however we mitigate the risk of going wrong by several factors:
    // (1) This code is only used during unit tests and not for production code.
    // (2) It is only called when the AppDomain in question is about to go away anyway,
    //     so the risk of someone starting new IO threads while we're waiting is very low.
    // (3) Finally, we have a timeout in place so that we don't wait forever if something
    //     goes wrong.
    //
    if (AppDomain.CurrentDomain.FriendlyName.StartsWith("test-domain-", StringComparison.Ordinal))
    {
        Console.WriteLine("AppDomainUnloadHack: enabled (use DbgView.exe for details).");
        Trace.WriteLine(string.Format("AppDomainUnloadHack: enabled for domain '{0}'.", AppDomain.CurrentDomain.FriendlyName));
        AppDomain.CurrentDomain.DomainUnload += (sender, args) =>
        {
            int activeIo;
            var sw = Stopwatch.StartNew();
            var timeout = TimeSpan.FromSeconds(3);
            do
            {
                if (sw.Elapsed > timeout)
                {
                    Trace.WriteLine("AppDomainUnloadHack: timeout waiting for threads to complete.");
                    sw.Stop();
                    break;
                }
                Thread.Sleep(5);
                int maxWorkers;
                int availWorkers;
                int maxIo;
                int availIo;
                ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
                ThreadPool.GetAvailableThreads(out availWorkers, out availIo);
                activeIo = maxIo - availIo;
                Trace.WriteLine(string.Format("AppDomainUnloadHack: active completion port threads: {0}", activeIo));
            } while (activeIo > 0);
            Trace.WriteLine(string.Format("AppDomainUnloadHack: complete after {0}", sw.Elapsed));
        };
    }
}
The timeout of 3 seconds is totally arbitrary and so is the wait of 5ms between each retry. Sometimes I do get a "timeout", but most of the time it works.
I make sure that this code is called once for every test assembly (e.g. through a static ctor of a referenced type).
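For example (a hedged sketch, NUnit 3 syntax; a static constructor on a referenced type works just as well), a [SetUpFixture] outside any namespace runs the hook exactly once per test assembly:

using NUnit.Framework;

[SetUpFixture]
public class AssemblyHooks
{
    [OneTimeSetUp]
    public void BeforeAnyTests()
    {
        // Registers the DomainUnload handler once for this test assembly.
        PreventPrematureAppDomainUnloadHack();
    }
}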
As usual in such cases YMMV.

Related

Kotlin coroutines execute sequentially, but only on the production machine

I have a bunch of network requests I want to conduct in parallel.
The following pseudo code should give a good idea of what I'm doing right now:
runBlocking {
    buildList {
        withContext(tracer.asContextElement()) {
            items.forEach { item ->
                add(
                    async {
                        // a few IO intensive operations (i.e. network requests)
                    }
                )
            }
        }
    }.awaitAll()
}
I have tracing tools set up and locally this seems to do the job. In my production infrastructure however the async tasks execute sequentially, i.e. the second one starts immediately after the first one finishes.
I have also tried using withContext(Dispatchers.IO.plus(tracer.asContextElement())) but I observe no difference.
The only thing I can say is that my development machine has multiple CPU cores, and my production machine will normally have 1. Regardless, due to the IO heavy nature of these processes I doubt this is the problem. I can't really explain what is causing this, but my gut feeling is that I'm fundamentally not understanding something about how Coroutines work in Kotlin.
As to the nature of the network request in question, I'm using a third party SDK that asynchronously executes the request, and seems to use ForkJoinPool.commonPool() under the hood as an executor.
If you don't switch dispatchers here, all those coroutines will run in the same thread - the one blocked by runBlocking. If the computation inside each coroutine is blocking, they will block the only thread one by one without any way to parallelize. This would explain what you're seeing (although it's strange that you don't reproduce locally).
I have also tried using withContext(Dispatchers.IO.plus(tracer.asContextElement())) but I observe no difference.
Your fix should work, unless the IO you're performing is actually managing threads itself and also confining the execution to a single thread no matter where it's called from. Maybe you should look into the actual IO then.
EDIT: you mentioned that you perform the IO operations via a third party SDK that uses the common ForkJoinPool - this one is backed by a single thread on a single-CPU machine, so this explains why the calls aren't parallelized in your single-CPU production machine. The only options to fix that would be:
check whether the SDK you're using allows to customize the backing pool of threads
customize the size of the ForkJoinPool using the JVM property java.util.concurrent.ForkJoinPool.common.parallelism (see the example below)
use another SDK :)
You still need to customize the dispatcher in addition to that if you're calling the library in a blocking way, but not if you're converting their async tasks into suspensions using Future.await() or similar.
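For illustration, raising the common pool's parallelism at JVM startup would look like this (the value 4 and the jar name are made-up examples; tune them to your setup):

java -Djava.util.concurrent.ForkJoinPool.common.parallelism=4 -jar app.jar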
Now, a few other things to note in this code:
you don't need buildList { .. }, you can just use map { thing } instead of forEach { add(thing) } and you'll get the resulting list as a return value (it also works across withContext, because it returns the lambda result)
withContext actually waits for all child coroutines to finish, so awaitAll() is misplaced here (it should rather be inside withContext)
actually, you probably don't need withContext at all, you can pass the custom context directly to runBlocking, unless you have other things in runBlocking that you don't want to run in this context
(optional) if the IO computations don't return results, you don't need awaitAll at all, and you could just use launch instead.
Assuming you do need the result, so ignoring the last point, your current code (with dispatcher fix) could be rewritten to:
val results = runBlocking(Dispatchers.IO + tracer.asContextElement()) {
    items.map { item ->
        async {
            performIO(item)
        }
    }.awaitAll()
}
Otherwise:
runBlocking(Dispatchers.IO + tracer.asContextElement()) {
    items.map { item ->
        launch {
            performIO(item)
        }
    }
}

IHostedService StartAsync called randomly and causes wrong Timer intervals

I've been using IHostedService since ASP.NET Core 2.1.
I noticed in my logs that StartAsync is sometimes called a lot, at very messy intervals ranging from 8 minutes to one hour, and each call invokes my DoWorkAsync.
I have a long-running process that I don't want re-invoked at those short intervals. In normal cases it runs on a Timer every two hours.
public Task StartAsync(CancellationToken cancellationToken)
{
    _logger.LogInformation("Timed Background Service is starting.");
    _timer = new Timer(DoWorkAsync, null, TimeSpan.FromMinutes(2),
        TimeSpan.FromHours(2));
    return Task.CompletedTask;
}
I'm considering using a lock statement, but if I take the lock on a private object, will it actually block execution when StartAsync is called again?
I'm concerned because the process calls a web service (WSI) on another server, and I'm afraid it might be called again before the previous call has been answered, crashing the other server.
My logging is simply a text file that records the times in StartAsync and DoWorkAsync.
I'm running this on an AWS Windows instance. If the problem is crashes or self-restarts, how would I find the cause? I don't think my simple text file would catch it.
In my opinion, StartAsync() is not supposed to hold a callback object, and that is probably causing your troubles. Please consider this example: https://learn.microsoft.com/en-us/dotnet/core/extensions/timer-service. It does what you want to do, but in the right manner.
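To sketch what that looks like (loosely following the linked Microsoft example; PeriodicTimer requires .NET 6+, and TimedWorker/DoWorkAsync are placeholder names): because the loop awaits each DoWorkAsync call before waiting for the next tick, two runs can never overlap, so no lock is needed.

using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public sealed class TimedWorker : BackgroundService
{
    private readonly ILogger<TimedWorker> _logger;

    public TimedWorker(ILogger<TimedWorker> logger) => _logger = logger;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        _logger.LogInformation("Timed Background Service is starting.");
        using var timer = new PeriodicTimer(TimeSpan.FromHours(2));
        try
        {
            // The next tick is not awaited until DoWorkAsync completes, so
            // overlapping calls to the remote web service are impossible.
            while (await timer.WaitForNextTickAsync(stoppingToken))
            {
                await DoWorkAsync(stoppingToken);
            }
        }
        catch (OperationCanceledException)
        {
            _logger.LogInformation("Timed Background Service is stopping.");
        }
    }

    private Task DoWorkAsync(CancellationToken ct)
    {
        // Call the remote web service (WSI) here.
        return Task.CompletedTask;
    }
}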

Service Fabric self-deleting service

I'd like to add a service that executes some initialization operations for the system when it's first created.
I'd imagine it would be a stateless service (with cluster admin rights) that should self-destruct when it's done its thing. I am under the impression that exiting the RunAsync function allows me to indicate that I'm finished (or in an error state). However, the service then still hangs around in the application's context, annoyingly looking like it's "active" when it's not really doing anything at all.
Is it possible for a service to remove itself?
I think maybe we could try using FabricClient.ServiceManager's DeleteServiceAsync (with parameters based on the service context) inside an OnCloseAsync override, but I haven't been able to prove that it would work, and it feels a little funky:
var client = new FabricClient();
await client.ServiceManager.DeleteServiceAsync(new DeleteServiceDescription(Context.ServiceName));
Is there a better way?
Returning from RunAsync ends the code in RunAsync (indicating completion), so SF won't start RunAsync again (it would if RunAsync threw an exception, for example). Completing RunAsync doesn't cause the service to be deleted; as mentioned, the service might be done with background work but still listening for incoming messages.
The best way to shut down a service is to call DeleteServiceAsync. This can be done by the service itself or another service, or from outside the cluster. Services can self-delete, so for services whose work is done we typically see await DeleteServiceAsync as the last line of RunAsync, after which the method just exits. Something like:
public async Task RunAsync(CancellationToken ct)
{
    bool workCompleted = false;
    while (!workCompleted && !ct.IsCancellationRequested)
    {
        if (!DoneWithWork())
        {
            DoWork();
        }
        if (DoneWithWork())
        {
            workCompleted = true;
            await DeleteServiceAsync(...);
        }
    }
}
The goal is to ensure that if your service is actually done doing the work it cleans itself up, but doesn't trigger its own deletion for the other reasons that a CancellationToken can get signaled, such as shutting down due to some upgrade or cluster resource balancing.
As mentioned already, returning from RunAsync will end this method only, but the service will continue to run and hence not be deleted.
DeleteServiceAsync certainly is the way to go; however, it's not quite as simple as just calling it, because if you're not careful it will deadlock on the current thread (especially in the local developer cluster). You would also likely get a few short-lived health warnings about RunAsync taking a long time to terminate and/or the target replica size not being met.
In any case, the solution is quite simple; just do this:
private async Task DeleteSelf(CancellationToken cancellationToken)
{
    using (var client = new FabricClient())
    {
        await client.ServiceManager.DeleteServiceAsync(new DeleteServiceDescription(this.Context.ServiceName), TimeSpan.FromMinutes(1), cancellationToken);
    }
}
Then, in the last line of my RunAsync method, I call:
await DeleteSelf(cancellationToken).ConfigureAwait(false);
The ConfigureAwait(false) helps with the deadlock issue, as the continuation will not try to resume on the caller's synchronization context.

How to unit test a method that waits for a response to an asynchronous message

I have a WCF service that sends a message to a remote device via an asynchronous messaging protocol (MQTT), then it has to wait for a response from the device in order to simulate a synchronous operation.
I'm doing this by creating a TaskCompletionSource (and a CancellationTokenSource to handle a timeout), storing it in a ConcurrentDictionary, then returning the TCS.Task.Result once it is set. Meanwhile, when the response comes in, a different method handles the response by looking up the TCS in the dictionary and setting its Result accordingly.
This all seems to work in practice, but I'm running into issues when trying to unit test this method. I'm trying to set up an asynchronous task that waits for the SendMessage method to generate the TCS and add it to the dictionary, then simulates a response by pulling it out of the dictionary and setting the result before the timeout elapses.
For the purposes of the unit test, I'm using a timeout period of 500 ms. I've tried this:
Task.Run(() =>
{
    Thread.Sleep(450);
    ctsDictionary.Values.Single().SetResult(theResponse);
});
MessageResponse response = service.SendMessage(...);
I've also tried this:
MessageResponse response = null;
Parallel.Invoke(
    async () =>
    {
        await Task.Delay(250);
        ctsDictionary.Values.Single().SetResult(theResponse);
    },
    () =>
    {
        response = service.SendMessage(...);
    }
);
Both of these strategies work fine when running just this one unit test or even when running all tests in this Unit Test class.
The problem comes when running all unit tests for the solution (2307 tests in total across a couple dozen UnitTest projects). This test consistently fails with the SendMessage method timing out before the response gets set by the asynchronous task, when part of a "Run All Tests" operation. Presumably this is because the scheduling of the tasks is thrown off by all the other unit tests that are being run in parallel, and the timing doesn't end up working out. I've tried playing around with the delay on the task as well as increasing the timeout period considerably, but I still can't get it to consistently pass when all tests are run.
So how can I fix this? Is there some way I can ensure that the SendMessage call and the task that sets the response are scheduled to run at the exact same time? Or is there some other strategy I can use to ensure the timing works out?
then it has to wait for a response from the device in order to simulate a synchronous operation.
That's hokey, man. Just have to say it - keep it async. Not only would it be more natural, it would be easier to unit test!
You can minimize the time SendMessage is waiting by first queueing up the SendMessage and then fast-polling for the request to hit the dictionary. That's as tight as you can get it without changing SendMessage (e.g., making it async):
// Start SendMessage
var sendTask = Task.Run(() => service.SendMessage(...));
// SendMessage may not actually be running yet, so we busy-wait for it.
while (ctsDictionary.Values.Count == 0) ;
// Let SendMessage know it can return.
ctsDictionary.Values.Single().SetResult(theResponse);
// Retrieve the result.
var result = await sendTask;
If you still have problems getting in before the timeout, you'll just have to throttle your unit tests (e.g., SemaphoreSlim).
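A hedged sketch of that throttling idea (TimingGate and the test name are illustrative): a static semaphore shared by the timing-sensitive tests, so only one of them runs at a time regardless of how the runner parallelizes the rest of the suite.

private static readonly SemaphoreSlim TimingGate = new SemaphoreSlim(1, 1);

[Test]
public async Task SendMessage_CompletesBeforeTimeout()
{
    await TimingGate.WaitAsync();
    try
    {
        // ... the arrange/act/assert shown above ...
    }
    finally
    {
        TimingGate.Release();
    }
}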
Again, this would be much easier if SendMessageAsync existed with the semantics that it synchronously adds to the dictionary before it awaits:
// Start SendMessage
var sendTask = service.SendMessageAsync(...);
// Let SendMessage know it can return.
ctsDictionary.Values.Single().SetResult(theResponse);
// Retrieve the result.
var result = await sendTask;
No busy-waiting, no delays, no extra threads. Much cleaner.
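For illustration, a minimal sketch of what such a SendMessageAsync could look like, assuming the TaskCompletionSource/ConcurrentDictionary design described in the question (DeviceService, pending, HandleResponse, and the bare MessageResponse type are illustrative names, not the poster's actual API):

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class MessageResponse { /* payload omitted */ }

public class DeviceService
{
    private readonly ConcurrentDictionary<string, TaskCompletionSource<MessageResponse>> pending =
        new ConcurrentDictionary<string, TaskCompletionSource<MessageResponse>>();

    // The TCS is registered synchronously, before anything is awaited, so a
    // test can complete it as soon as this method returns its Task.
    public Task<MessageResponse> SendMessageAsync(string deviceId, TimeSpan timeout)
    {
        var tcs = new TaskCompletionSource<MessageResponse>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        pending[deviceId] = tcs;

        var cts = new CancellationTokenSource(timeout);
        cts.Token.Register(() => tcs.TrySetException(new TimeoutException()));

        // Publish the MQTT request here (omitted); the handler below completes
        // the TCS when the device's response arrives.
        return tcs.Task;
    }

    // Called by the MQTT subscription when a response comes in.
    public void HandleResponse(string deviceId, MessageResponse response)
    {
        if (pending.TryRemove(deviceId, out var tcs))
            tcs.TrySetResult(response);
    }
}

The test then follows the second snippet above: call SendMessageAsync, complete the TCS via HandleResponse (or directly through the dictionary), and await the returned task; no sleeps or busy-waits involved.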

Using Rx to call a WCF service method Async is causing Closure problems

I am currently using this code to call a web service to get data for an application page.
Everything is fine until I try to call this method 10 times in a row without waiting for the first call to finish.
Doing so causes a problem with closures, and I get the same result object for all my calls.
Has anyone faced this with Rx.NET before? If so, does anyone have an idea or a lead so that I can resolve this issue?
public void GetPage(long pageId, Action<PageDTO> observer)
{
    Observable.FromEvent<GetPageCompletedEventArgs>(
            handler => Service.GetPageCompleted += handler,
            handler => Service.GetPageCompleted -= handler)
        .Select(eventHandler => eventHandler.EventArgs.Result)
        .Take(1) // necessary to ensure the observable unsubscribes
        .ObserveOnDispatcher() // controls which thread the observer runs on
        .Subscribe(observer, HandleError);

    Service.GetPageAsync(pageId);
}
Is Service always the same instance? If so, you're going to run into all kinds of craziness whereby GetPageCompleted events will be handled by the FromEvent created by a different call (with different arguments), which would explain why your results are the same for all methods that were called at the same time.
You can get around this specific issue by using the Begin/End methods, though you will still likely run into problems with contention on the underlying connection.
public void GetPage(long pageId, Action<PageDTO> observer)
{
    Observable.FromAsyncPattern<long, PageDTO>(
            Service.BeginGetPage, Service.EndGetPage)(pageId)
        .ObserveOnDispatcher() // controls which thread the observer runs on
        .Subscribe(observer, HandleError);
}