Unexpected results when using `Task.Run` to call a synchronous method - vb.net

I'm working on an application that at some point does a CPU-intensive calculation on the kinematics of an object. I use Task.Run() to call the function doing the work, and this leads to unexpected results; I'm not sure whether this would be considered a race condition or whether it has some other name. My real code is rather extensive, so I have reduced the issue to what I believe is a minimal working example, to be run in a .NET Framework console application.
For the MWE, consider the following class. It has 3 fields and a constructor that initializes them. It also has a Report() sub for easier debugging.
Public Class DebugClass
Public Variable1 As Double
Public Variable2 As Double
Public Variable3 As Double
Public Sub New(Variable1 As Double, Variable2 As Double, Variable3 As Double)
Me.Variable1 = Variable1
Me.Variable2 = Variable2
Me.Variable3 = Variable3
End Sub
Public Sub Report()
Console.WriteLine()
Console.WriteLine("Variable1: {0},Variable2: {1},Variable3: {2}", Variable1, Variable2, Variable3)
End Sub
End Class
I also have another helper function which replaces the CPU intensive work that my real application would have, with a random delay between 0 and 1 second :
Public Async Function RandomDelayAsync() As Task
Await Task.Delay(TimeSpan.FromSeconds(Rnd()))
End Function
For demonstration purposes I have 2 versions of my "work" function: an Async and a non-Async version. Each of these functions takes an instance of DebugClass as a parameter, pretends to do some work on it and then simply returns the same object that it got as input:
'Async version
Public Async Function DoSomeWorkAsync(WorkObject As DebugClass) As Task(Of DebugClass)
Await RandomDelayAsync()
Return WorkObject
End Function
'Synchronous version
Public Function DoSomeWork(WorkObject As DebugClass) As DebugClass
RandomDelayAsync.Wait()
Return WorkObject
End Function
Lastly, I have a WaiterLoop. This function awaits the created tasks until one completes, prints the returned object's fields to the console and removes that task from the list. It then waits for the next one to complete. In my real application I would do some more calculations here after I get the results from the individual tasks, to see which parameters give the best results.
Public Async Function WaiterLoop(TaskList As List(Of Task(Of DebugClass))) As Task
Dim Completed As Task(Of DebugClass)
Do Until TaskList.Count = 0
Completed = Await Task.WhenAny(TaskList)
Completed.Result.Report()
TaskList.Remove(Completed)
Loop
End Function
Now, first consider this version of my Main() function:
Sub Main()
Randomize()
Dim Tasklist As New List(Of Task(Of DebugClass))
Dim anInstance As DebugClass
For var1 As Double = 0 To 5 Step 0.5
For var2 As Double = 1 To 10 Step 1
For Var3 As Double = -5 To 0 Step 1
anInstance = New DebugClass(var1, var2, Var3)
'adding an Async task to the tasklist
Tasklist.Add(DoSomeWorkAsync(anInstance))
Next
Next
Next
WaiterLoop(Tasklist).Wait()
Console.ReadLine()
End Sub
The output here is exactly as I would expect: the tasks all complete, and a line is printed to the console for each parameter combination. All good so far. The problem I'm facing arises when this line:
Tasklist.Add(DoSomeWorkAsync(anInstance))
Is replaced with this line
Tasklist.Add(Task.Run(Function() DoSomeWork(anInstance)))
In this new version I don't call the Async version of the work function; instead I'm using Task.Run to run a normally synchronous function on a worker thread. This is where the s**t hits the fan.
Suddenly, the output is not as expected anymore;
'This is the type of output i now get:
Variable1: 1.5,Variable2: 7,Variable3: -1
Variable1: 5,Variable2: 10,Variable3: 0
Variable1: 5,Variable2: 10,Variable3: 0
Variable1: 5,Variable2: 10,Variable3: 0
Variable1: 5,Variable2: 10,Variable3: 0
Variable1: 5,Variable2: 10,Variable3: 0
Somehow all of the tasks I created now seem to refer to the same instance of DebugClass, as every time a task completes, the same output is printed. I don't understand why this happens, because I'm creating a new instance of DebugClass each time before I start a new task: anInstance = New DebugClass(var1, var2, Var3) followed by Tasklist.Add(Task.Run(Function() DoSomeWork(anInstance))). As soon as I assign a new instance of DebugClass to anInstance, it "forgets" the previous instance it was storing, right? And the instance referenced by each of the created tasks ought to be independent of the ones referred to by the other tasks?
Clearly, I am mistaken, but I would appreciate it if someone could explain to me what's going on here.
I also wonder why one is faster than the other, but I will save that for a separate question unless it's related to this issue.
Thank you for taking the time to read this.

Lambdas (i.e. Function() DoSomeWork(anInstance)) 'close'* over a reference to a variable, NOT over its current value.
Thus Function() DoSomeWork(anInstance) means 'when you eventually run, call the DoSomeWork method on whatever anInstance holds at that moment'.
You only have one anInstance variable because you declared it outside the loop, so every lambda shares it. By the time most of the tasks actually run, the loops have already moved on and the variable holds a later object, typically the last one.
Quick fix: Move the Dim anInstance As DebugClass statement inside the inner loop. This gives you one variable per loop iteration, which is what you want.
See also Captured variable in a loop in C#, which is basically the same question in C# and has some useful discussion/links in the comments.
*Closures are a big topic, I'd suggest reading https://en.wikipedia.org/wiki/Closure_(computer_programming). Happy to discuss further in comments.
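To make the difference concrete, here is a minimal sketch that leaves Task.Run out entirely and only stores lambdas in a list. The names ClosureDemo, sharedCaptures, freshCaptures, outer and inner are made up for this illustration and are not from the question's code.

```vb
Imports System
Imports System.Collections.Generic

Module ClosureDemo
    Sub Main()
        Dim sharedCaptures As New List(Of Func(Of Integer))
        Dim freshCaptures As New List(Of Func(Of Integer))

        ' Declared once, outside the loop: every lambda captures this same variable.
        Dim outer As Integer
        For i As Integer = 1 To 3
            outer = i
            sharedCaptures.Add(Function() outer)

            ' Declared inside the loop body: each iteration gets its own variable.
            Dim inner As Integer = i
            freshCaptures.Add(Function() inner)
        Next

        ' The lambdas only run now, after the loop has finished.
        For Each f As Func(Of Integer) In sharedCaptures
            Console.Write("{0} ", f())   ' prints 3 3 3
        Next
        Console.WriteLine()

        For Each f As Func(Of Integer) In freshCaptures
            Console.Write("{0} ", f())   ' prints 1 2 3
        Next
        Console.WriteLine()
    End Sub
End Module
```

In the question's code the lambdas run on worker threads while the loops are still going, so a few early tasks may see an earlier object, but most see whatever the shared variable holds once the loops have finished.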

Related

VB.Net Async - Check large list of string for a match

I need this function to run Async, but can't seem to figure a way to do it.
LIST1 is Public and contains a List(of String) with a few hundred entries. List Declaration:
Public LIST1 As New List(Of String)
Normally, I'd run the following code to retrieve the boolean of whether or not the list contains the entry:
Dim result = LIST1.Any(Function(s) value.ToLower.Contains(s))
Full non-Async function:
Function CheckValue(byval value As String) As Boolean
Dim result As Boolean = LIST1.Any(Function(s) value.ToLower.Contains(s))
Return result
End Function
That works well as expected.
How would I implement the same as an Async function? I've tried:
Async Function CheckValue(byval value as String) As Task(Of Boolean)
Dim result as Task(Of Boolean) = Await LIST1.Any(Function(s) value.ToLower.Contains(s))
Return result
End Function
I get the following error: 'Await' requires that the type 'Boolean' have a suitable GetAwaiter method.
Any thoughts?
Any does not return a task, so there is nothing to await. If your concern is that it is too slow, you can run any synchronous code on a thread-pool thread with Task.Run and then await its completion, like this:
Dim result As Boolean = Await Task.Run(Function() LIST1.Any(Function(s) value.ToLower.Contains(s)))
Or, as GSerg mentioned, though it technically doesn't make it awaitable, you can use AsParallel.Any:
Dim result As Boolean = LIST1.AsParallel.Any(Function(s) value.ToLower.Contains(s))
However, be aware that handing work off to other threads has a fair amount of overhead, so doing that just for a small amount of work may actually make it run slower.
In this particular case, if performance is key, I would recommend looking into various search/indexing algorithms. For instance, take a look at some of the ones mentioned here. I wouldn't be surprised if there are opensource .NET libraries for those kinds of algorithms.
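Putting the Task.Run suggestion into a complete function might look like the sketch below. The name CheckValueAsync, the sample list contents and the Main method are assumptions for illustration; the sketch also assumes LIST1 is not modified while the check is running, and it lowers the input once up front rather than once per list entry.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Threading.Tasks

Module CheckValueDemo
    ' Sample data; the real list would hold a few hundred entries.
    Public LIST1 As New List(Of String) From {"foo", "bar"}

    Async Function CheckValueAsync(ByVal value As String) As Task(Of Boolean)
        ' Lower the input once instead of once per list entry.
        Dim lowered As String = value.ToLower()
        ' Run the synchronous search on a thread-pool thread and await the result.
        Return Await Task.Run(Function() LIST1.Any(Function(s) lowered.Contains(s)))
    End Function

    Sub Main()
        ' A console Main cannot Await, so block here for the demo only.
        Dim found As Boolean = CheckValueAsync("Some FOO text").GetAwaiter().GetResult()
        Console.WriteLine(found)   ' True
    End Sub
End Module
```

The return type is Task(Of Boolean), but the value handed to Return is the plain Boolean produced by Await, which is the part the attempt in the question got wrong.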

Return once condition is true without hanging

So I have this computation job that requires starting 6 threads and waiting for those to finish. The threads change a "local" variable within the class.
I want to have the function return "True" once the local variable is a certain value. However, I want to do this in a fashion where it doesn't hang the thread. So a constant "Do Loop" is not gonna work. Are there any standard ways of doing this?
Public Function Start(ByVal Cores As Integer) As Boolean
For i = 0 To 10
' Heavy work
Task.Factory.StartNew(Sub() Compute(Core, StartInt, EndInt))
Next
Do ' <- How to avoid checking ThreadsTerminated = ThreadsStarted every clock cycle?
' Threading.Sleep hangs thread.
If ThreadsTerminated = ThreadsStarted Then
MergeResults(Cores)
Return True
End If
Loop
End Function
You can keep a list of your Tasks and use Task.WaitAll, which blocks the calling thread (without spinning) until all of them have finished:
Dim tasks As New List(Of Task)
For i = 0 To 10
' Heavy work
tasks.Add(Task.Factory.StartNew(Sub() Compute(Core, StartInt, EndInt)))
Next
Task.WaitAll(tasks.ToArray())
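Task.WaitAll still blocks the caller until everything is done. If the caller must not be blocked at all, one option is to make the method Async and await Task.WhenAll instead. The sketch below assumes that is acceptable for the calling code; StartAsync, the simplified Compute stand-in and completedCount are hypothetical names, not the question's real implementation.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Threading
Imports System.Threading.Tasks

Module WhenAllDemo
    Private completedCount As Integer = 0

    ' Hypothetical stand-in for the question's Compute method.
    Private Sub Compute(ByVal core As Integer)
        Thread.Sleep(100)                        ' pretend to do heavy work
        Interlocked.Increment(completedCount)    ' thread-safe counter update
    End Sub

    Public Async Function StartAsync(ByVal cores As Integer) As Task(Of Boolean)
        Dim tasks As New List(Of Task)
        For i As Integer = 0 To cores - 1
            Dim core As Integer = i              ' per-iteration copy for the lambda to capture
            tasks.Add(Task.Run(Sub() Compute(core)))
        Next
        ' Yields to the caller here instead of blocking or polling a counter.
        Await Task.WhenAll(tasks)
        ' The question's MergeResults(cores) call would go here.
        Return True
    End Function

    Sub Main()
        ' Blocking is only for the console demo; UI code would Await StartAsync instead.
        Dim ok As Boolean = StartAsync(6).GetAwaiter().GetResult()
        Console.WriteLine("{0}, completed: {1}", ok, completedCount)   ' True, completed: 6
    End Sub
End Module
```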

Execute a function in VB.NET without first declaring an instance of the class

Is there a way for me to make a function call on a new instance of a variable without first declaring it?
So for example in Java you could do:
new foo().bar(parameters);
I've tried something similar in Visual Basic, but it's a syntax error. For the moment I'm creating a variable and then running the function.
dim instance as new foo()
instance.bar(parameters)
Is there something I can do similarly to the Java code above?
Not exactly. You can do so in a larger expression by surrounding the instantiation in parentheses, for instance:
MessageBox.Show((New String("y"c, 1)).ToUpper())
Or, in fact, while I find it more confusing, you don't actually even need the parentheses around the instantiation:
MessageBox.Show(New String("y"c, 1).ToUpper())
However, if you want to just call a method like that as a statement of its own, the only way I know of is to wrap it in a CType operator. For instance, if you had a class like this:
Private Class Test
Public Sub Show()
MessageBox.Show("Hello")
End Sub
End Class
You could call the Show method like this:
CType(New Test(), Test).Show()
But, it is a bit clumsy.
Actually, SSS provided an even better answer since I posted this yesterday. Instead of wrapping it in a CType operator, you can use the Call keyword. For instance:
Call New Test().Show()
Yes you can, I use the following pattern often. The alternative is to add parameters to Sub New(). I use this pattern when initialisation can fail, and you want to return a null reference (Nothing) instead of throwing an exception.
Public Class Foo
Property Prop1 As String
Property Prop2 As String
Public Shared Function Bar(p1 As String, p2 As String) As Foo
Dim f As Foo
If p1 = "" Or p2 = "" Then
'validation check - if either parameter is an empty string, initialisation fails and we return a null reference
f = Nothing
Else
'parameters are valid
f = New Foo
f.Prop1 = p1
f.Prop2 = p2
End If
Return f
End Function
''' <summary>
''' Optional - marking the Sub New as "Private" forces the caller to use the Bar() function to instantiate
''' </summary>
''' <remarks></remarks>
Private Sub New()
End Sub
End Class
SSS has provided the approach that works best for classes. If you are not required to use classes, another option is to use a Module.
Modules give easy access to functions without creating any objects for them.

Assigning Function Value Instead of Using Return Keyword, and its Effect on Synchronised Code Execution

I am wondering if there is a difference between using
Public Function Foo() As Double
Return 3.0
End Function
and
Public Function Foo() As Double
Foo = 3.0
End Function
but specifically with respect to code execution.
I am attempting to manage a multithreaded application using synchronisation, and am not sure if I am capturing every lock and release correctly.
I understand that code lines after 'Return' are not executed because the function loses focus, but what if the 'Return' is wrapped in a SyncLock block?
Public Function Foo() As Double
SyncLock fooLock
Return 3.0
End SyncLock
End Function
Does the End SyncLock get called? Is the SyncLock block shorthand for:
Public Function Foo() As Double
Dim result as Double
Try
Threading.Monitor.Enter(fooLock)
result = 3.0
Finally
Threading.Monitor.Exit(fooLock)
End Try
Return result
End Function
If my understanding is correct then the Finally block comes before the function releases focus, but alternatively if the Finally waits on the Return and subsequent code, then it may be a while before the Finally gets a chance, i.e.
Public Sub DoSomething()
Dim a As Double = Foo
...Do other things
End Sub
Public Function Foo() As Double
Try
Threading.Monitor.Enter(fooLock)
Return 3.0
...code returned to executes, 'a' is assigned to the return value of Foo, then perhaps all of the other tasks on the thread are done, then
Finally
Threading.Monitor.Exit(fooLock)
End Try
End Function
In this case my lock may have been held for too long. For value types the first code would be acceptable, but for reference types the first would release the lock and then return a reference to the object, so the consumer would have non-synced access to the value. The second may or may not work, depending on how much code is executed in between the break in the function.
Could anyone help me straighten these concepts out?
There is definitely a difference between RETURN 3 and v = 3
Return X terminates the call right there, but it definitely runs through any try catch finally's you might have open.
v = 3 simply sets up the return value as 3, but does not return. Execution continues on in the function until the end of function or an exit function.
I'm not 100% sure about the SyncLock question, but I'd wager that RETURNing out of it would terminate it properly.
Please don't use the "assign function name a value and return" pattern anymore. It's up there with REM for some of us.
That said, if you look at the IL generated from code that uses the return vs assign function name you'll see that it is 100% the same.
As for your other question, according to MSDN:
SyncLock block guarantees release of the lock, no matter how you exit the block
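A quick way to check this yourself is to ask Monitor whether the current thread still holds the lock after the function has returned. The sketch below assumes .NET Framework 4.5 or later, where Monitor.IsEntered is available; the names SyncLockDemo, FooReturn and FooAssign are made up for the illustration.

```vb
Imports System
Imports System.Threading

Module SyncLockDemo
    Private ReadOnly fooLock As New Object()

    Function FooReturn() As Double
        SyncLock fooLock
            Return 3.0          ' leaves the function; the lock is released on the way out
        End SyncLock
    End Function

    Function FooAssign() As Double
        SyncLock fooLock
            FooAssign = 3.0     ' only sets the return value; execution continues
            Console.WriteLine("still inside FooAssign")
        End SyncLock
    End Function

    Sub Main()
        Dim a As Double = FooReturn()
        ' The lock was already released before the caller got the value back.
        Console.WriteLine(Monitor.IsEntered(fooLock))   ' False

        Dim b As Double = FooAssign()
        Console.WriteLine(Monitor.IsEntered(fooLock))   ' False

        Console.WriteLine(a = b)                        ' True
    End Sub
End Module
```

So the lock is not held while the caller assigns 'a' and carries on. For reference types the concern raised in the question still applies: once the lock is released, the caller holds an unsynchronized reference to the object.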

Faster way to convert from a String to generic type T when T is a valuetype?

Does anyone know of a fast way in VB to go from a string to a generic type T constrained to a valuetype (Of T as Structure), when I know that T will always be some number type?
This is too slow for my taste:
Return DirectCast(Convert.ChangeType(myStr, GetType(T)), T)
But it seems to be the only sane method of getting from a String --> T. I've tried using Reflector to see how Convert.ChangeType works, and while I can convert from the String to a given number type via a hacked-up version of that code, I have no idea how to jam that type back into T so it can be returned.
I'll add that part of the speed penalty I'm seeing (in a timing loop) is because the return value is getting assigned to a Nullable(Of T) value. If I strongly-type my class for a specific number type (i.e., UInt16), then I can vastly increase the performance, but then the class would need to be duplicated for each numeric type that I use.
It'd almost be nice if there was converter to/from T while working on it in a generic method/class. Maybe there is and I'm oblivious to its existence?
Conclusion:
Having tested the three implementations provided below against my original DirectCast/ChangeType form, @peenut's approach of using a prepared delegate to fetch the Parse method from a basic type is the fastest. No error checking is done, however, so implementors need to remember to only use this with value types that have a Parse method available, or extend the code below to do error checking.
All runs were done on a 32bit system running Windows Server 2003 R2 with 4GB of RAM. Each "run" is 1,000,000 executions (ops) of the method to be tested, timed with StopWatch and reported back in milliseconds.
Original DirectCast(Convert.ChangeType(myStr, GetType(T)), T):
1000000 ops: 597ms
Average of 1000000 ops over 10 runs: 472ms
Average of 1000000 ops over 10 runs: 458ms
Average of 1000000 ops over 10 runs: 453ms
Average of 1000000 ops over 10 runs: 466ms
Average of 1000000 ops over 10 runs: 462ms
Using System.Reflection and calling InvokeMember to get at the Parse method:
1000000 ops: 12213ms
Average of 1000000 ops over 10 runs: 11468ms
Average of 1000000 ops over 10 runs: 11509ms
Average of 1000000 ops over 10 runs: 11524ms
Average of 1000000 ops over 10 runs: 11509ms
Average of 1000000 ops over 10 runs: 11490ms
Konrad's approach to generate IL code to access the Parse method and store the call into a delegate:
1000000 ops: 352ms
Average of 1000000 ops over 10 runs: 316ms
Average of 1000000 ops over 10 runs: 315ms
Average of 1000000 ops over 10 runs: 314ms
Average of 1000000 ops over 10 runs: 314ms
Average of 1000000 ops over 10 runs: 314ms
peenut's approach of using a delegate to access the Parse method directly:
1000000 ops: 272ms
Average of 1000000 ops over 10 runs: 272ms
Average of 1000000 ops over 10 runs: 275ms
Average of 1000000 ops over 10 runs: 274ms
Average of 1000000 ops over 10 runs: 272ms
Average of 1000000 ops over 10 runs: 273ms
Comparatively, peenut's approach is almost 200ms faster when executed 1,000,000 times in a tight loop, so his approach wins out. Although, Konrad's wasn't far behind and is itself a fascinating study of things like ILGenerator. Props to all who contributed!
Yes, I know of a faster solution :-)
The faster solution is to use a prepared delegate for the given (generic) type T. If you are only interested in String -> (built-in numeric type), you can simply get the Parse method that takes one argument (String).
Below is a program to test the speed of the possibilities. Note that only the first two methods are generic; the 3rd and 4th methods are for comparison only.
Imports System.Reflection
Module Module1
Public Class Parser(Of T As Structure)
Delegate Function ParserFunction(ByVal value As String) As T
Public Shared ReadOnly Parse2 As ParserFunction = GetFunction()
Private Shared Function GetFunction() As ParserFunction
Dim t As Type = GetType(T)
Dim m As MethodInfo = t.GetMethod("Parse", New Type() {GetType(String)})
Dim d As ParserFunction = DirectCast( _
ParserFunction.CreateDelegate(GetType(ParserFunction), m), _
ParserFunction)
Return d
End Function
Public Shared Function Parse1(ByVal value As String) As T
Return DirectCast(Convert.ChangeType(value, GetType(T)), T)
End Function
End Class
Sub Main()
Dim w As New Stopwatch()
'test data:
Dim arrStr() As String = New String(12345678 - 1) {}
Dim r As New Random
For i As Integer = 0 To arrStr.Length - 1
arrStr(i) = r.Next().ToString()
Next
Dim arrInt1() As Integer = New Integer(arrStr.Length - 1) {}
Dim arrInt2() As Integer = New Integer(arrStr.Length - 1) {}
Console.WriteLine("1. method - Convert.ChangeType:")
w.Reset()
w.Start()
For i As Integer = 0 To arrStr.Length - 1
arrInt1(i) = Parser(Of Integer).Parse1(arrStr(i))
Next
w.Stop()
Console.WriteLine(w.Elapsed)
Console.WriteLine()
Console.WriteLine("2. method - prepared delegate:")
w.Reset()
w.Start()
For i As Integer = 0 To arrStr.Length - 1
arrInt2(i) = Parser(Of Integer).Parse2(arrStr(i))
Next
w.Stop()
Console.WriteLine(w.Elapsed)
Console.WriteLine()
Console.WriteLine("3. method - Integer.Parse:")
w.Reset()
w.Start()
For i As Integer = 0 To arrStr.Length - 1
arrInt2(i) = Integer.Parse(arrStr(i))
Next
w.Stop()
Console.WriteLine(w.Elapsed)
Console.WriteLine()
Console.WriteLine("4. method - CType:")
w.Reset()
w.Start()
For i As Integer = 0 To arrStr.Length - 1
arrInt2(i) = CType(arrStr(i), Integer)
Next
w.Stop()
Console.WriteLine(w.Elapsed)
Console.WriteLine()
End Sub
End Module
You can change number of tested elements, if you want. I used 12345678 random integers. Program outputs for me:
1. method - Convert.ChangeType:
00:00:03.5176071
2. method - prepared delegate:
00:00:02.9348792
3. method - Integer.Parse:
00:00:02.8427987
4. method - CType:
00:00:05.0542241
Ratio of times: 3.5176071 / 2.9348792 = 1.20
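The question's conclusion notes that no error checking is done. One possible way to add it, sketched under the assumption that failing early with a descriptive message is what is wanted, is to verify that a public Shared Parse(String) method returning T actually exists before creating the delegate. CheckedParser and CheckedParserDemo are hypothetical names, and the implicit line continuations need VB 10 or later.

```vb
Imports System
Imports System.Reflection

Public Class CheckedParser(Of T As Structure)
    Public Delegate Function ParserFunction(ByVal value As String) As T

    ' Built once per closed generic type, on first use.
    Public Shared ReadOnly Parse As ParserFunction = GetFunction()

    Private Shared Function GetFunction() As ParserFunction
        ' Look only for a public Shared Parse method taking a single String.
        Dim parseMethod As MethodInfo = GetType(T).GetMethod(
            "Parse",
            BindingFlags.Public Or BindingFlags.Static,
            Nothing,
            New Type() {GetType(String)},
            Nothing)
        If parseMethod Is Nothing OrElse parseMethod.ReturnType IsNot GetType(T) Then
            Throw New NotSupportedException(
                GetType(T).FullName & " has no public Shared Parse(String) method.")
        End If
        Return DirectCast(
            [Delegate].CreateDelegate(GetType(ParserFunction), parseMethod),
            ParserFunction)
    End Function
End Class

Module CheckedParserDemo
    Sub Main()
        Console.WriteLine(CheckedParser(Of Integer).Parse("42"))   ' 42

        Try
            ' DayOfWeek is a value type but declares no Parse(String) of its own.
            Dim day As DayOfWeek = CheckedParser(Of DayOfWeek).Parse("Monday")
        Catch ex As TypeInitializationException
            ' The exception is thrown from a Shared initializer, so it arrives wrapped.
            Console.WriteLine(ex.InnerException.Message)
        End Try
    End Sub
End Module
```

The check runs only once per type, so it should not affect the per-call cost measured above.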
Here’s a different approach that uses the DynamicMethod mentioned before.
Again, I couldn’t test the VB code (the Mono compiler chokes on the test call, though not on the code itself) but I believe it’s correct. Its C# equivalent works and the below code is a 1:1 translation:
imports System.Reflection
imports System.Reflection.Emit
public class Parser(of T as Structure)
delegate function ParserFunction(value as String) as T
private shared readonly m_parse as ParserFunction
shared sub new()
dim tt as Type = gettype(T)
dim argumentTypes as Type() = new Type() { gettype(String) }
dim typeDotParse as MethodInfo = tt.GetMethod("Parse", argumentTypes)
dim method as new DynamicMethod("Parse", tt, argumentTypes)
dim il as ILGenerator = method.GetILGenerator()
il.Emit(OpCodes.Ldarg_0)
il.Emit(OpCodes.Call, typeDotParse)
il.Emit(OpCodes.Ret)
m_parse = directcast( _
method.CreateDelegate(gettype(ParserFunction)), _
ParserFunction)
end sub
public shared function Parse(byval value As String) As T
return m_parse(value)
end function
end class
This code can be cleaned up if you have a recent version of VB. Again, the Mono compiler doesn’t yet know Option Infer and the like.
What this code does is compile a separate parsing method for each type it’s called for. This parsing method merely delegates the actual parsing to the shared T.Parse method of the type (e.g. Integer.Parse). Once compiled, this code doesn’t require any additional casting, no boxing and no Nullables.
The code is called as follows:
Dim i As Integer = Parser(Of Integer).Parse("42")
Apart from the one-time overhead for compilation, this method should be the fastest possible since there is no other overhead: just a function call to the actual parsing routine. It doesn’t get faster than that.
Probably not an answer, but another question instead: what will you achieve by having this method? Let's imagine you somehow implemented such a method (sorry for C# in a VB.NET post, but hopefully you'll get the idea):
T Convert<T>(string strInput) { ... }
and you will use this method only for limited range of types: double, int, Int16, etc. So you will use it like this:
double x = Convert<double>(myStr);
I do not see any benefit in such method, because for the same reason without that method you would write:
double x = double.Parse(myStr);
So what I'm trying to say is that without your magic method you will write the same amount of code to use it. I do not see any benefit of that method. Am I missing some use-case?
I don’t have a VB compiler so I can’t test it but the following works in C#. I doubt that it’s faster though, since it’s using reflection:
Public Shared Function Parse(Of T As Structure)(ByVal value As String) As T
Dim type = GetType(T)
Dim result = type.InvokeMember( _
"Parse", _
BindingFlags.Public Or BindingFlags.Static Or BindingFlags.InvokeMethod, _
Nothing, Nothing, new Object() { value })
Return DirectCast(result, T)
End Function
One way to speed this up is to create a dynamic method from the T.Parse member instead of InvokeMember and cache that dynamic method for each type. This would mean a bigger overhead for the first call (compiling a dynamic method) but subsequent runs would be faster.