Lazy<T>(bool) constructor documentation (.NET 4.0)

I'm confusing myself reading Microsoft's documentation on the Lazy<T>(bool) constructor.
The parameter is described as:
isThreadSafe: true to make this instance usable concurrently by multiple threads; false to make the instance usable by only one thread at a time.
If the code I would normally write in an accessor is:
If _rulesCache Is Nothing Then
    SyncLock _lockRulesCache
        If _rulesCache Is Nothing Then
            _rulesCache = New RulesCache()
        End If
    End SyncLock
End If
Return _rulesCache
Do I want to use True or False in the constructor of the Lazy type?
Private _rulesCache As New Lazy(Of RulesCache)(**?**)
So my accessor becomes:
Return _rulesCache.Value
1) Once the object is created, it can handle multiple thread access internally.
2) I just need to make sure that if there are multiple threads hitting the accessor close to simultaneously and the object doesn't exist, that it only gets created once.
According to the documentation, statement 1 implies that the parameter should be false. Statement 2 implies that the parameter should be true.
I feel like I'm over-thinking this and it's just making me more confused. Or are the two statements above actually at odds with each other, and I should just stick with the manual locking to manage the object instantiation?

Statement 2 is the desired interpretation. The parameter does not affect any behavior of the object after the lazy initialization is complete; it only prevents two threads from accidentally racing and instantiating it twice. You can verify that in Reflector if you're curious.
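Concretely, that means passing True. True maps to LazyThreadSafetyMode.ExecutionAndPublication: at most one thread ever runs the RulesCache constructor, and every caller of .Value sees that single instance. A minimal sketch of the field from the question (the commented line is the equivalent spelling using the mode enum):

Private _rulesCache As New Lazy(Of RulesCache)(True)
' Equivalent:
' Private _rulesCache As New Lazy(Of RulesCache)(Threading.LazyThreadSafetyMode.ExecutionAndPublication)

The manual double-checked locking and the _lockRulesCache field can then be dropped entirely; Return _rulesCache.Value is all the accessor needs.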

Overzealous null checking of backing field in my singleton?

The code below represents a singleton that I use in my application. Let's assume that _MyObject = New Object represents a very expensive database call that I do not want to make more than once under any circumstance. To ensure that this doesn't happen, I first check whether the _MyObject backing field is null. If it is, I enter a SyncLock so that only one thread can get in at a time. However, if two threads get past the first null check before the singleton is instantiated, thread B would end up waiting at the SyncLock while thread A creates the instance; after thread A exits the lock, thread B would enter it and recreate the instance, which would result in that expensive database call being made again. To prevent this, I added an additional null check of the backing field inside the lock. That way, if thread B ends up waiting at the lock, it will do one more null check once it gets in and avoid recreating the instance.
So is it really necessary to do two null checks? Would getting rid of the outer null check and starting straight away with the SyncLock be just the same? In other words, is taking the lock essentially as cheap as letting multiple threads read the backing field directly? If so, the outer null check is superfluous.
Private Shared synclocker As New Object
Private Shared _MyObject As Object = Nothing

Public Shared ReadOnly Property MyObject As Object
    Get
        If _MyObject Is Nothing Then 'superfluous null check?
            SyncLock synclocker
                If _MyObject Is Nothing Then _MyObject = New Object
            End SyncLock
        End If
        Return _MyObject
    End Get
End Property
This will probably work better as an answer than as a comment.
So, using Lazy to implement "do the expensive operation only once, then return a reference to the created instance":
Private Shared _MyObject As Lazy(Of Object) = New Lazy(Of Object)(AddressOf InitYourObject)

Private Shared Function InitYourObject() As Object
    Return New Object()
End Function

Public Shared ReadOnly Property MyObject As Object
    Get
        Return _MyObject.Value
    End Get
End Property
This is a very simple, thread-safe way of doing on-demand, one-time initialization. The InitYourObject method handles whatever initialization you need to do and returns an instance of the created class. The initialization method runs on the first call to _MyObject.Value; subsequent requests return the same instance.
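Note that the Lazy(Of T)(Func(Of T)) constructor used here is thread-safe by default; it is equivalent to specifying LazyThreadSafetyMode.ExecutionAndPublication, so only one thread will ever run InitYourObject. Spelling it out explicitly, as a sketch:

Private Shared _MyObject As New Lazy(Of Object)(AddressOf InitYourObject, Threading.LazyThreadSafetyMode.ExecutionAndPublication)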
You're absolutely right to have added the inner If statement (you would still have a race condition without it, as you correctly noted).
You are also correct that, from a purely-logical point of view, the outer check is superfluous. However, the outer null check avoids the relatively-expensive SyncLock operation.
Consider: if you've already created your singleton, and you happen to hit your property from 10 threads at once, the outer If is what prevents those 10 threads from queueing up to essentially do nothing. Synchronising threads isn't cheap, and so the added If is for performance rather than for functionality.

Synclock List or ListItem

MS reference: http://msdn.microsoft.com/en-us/library/3a86s51t(v=vs.71).aspx
"The type of the expression in a SyncLock statement must be a reference type, such as a class, a module, an interface, array or delegate."
Scenario: Multiple threads reading and editing a list.
I know this will avoid a race condition:
SyncLock TheList
    TheList.item(0) = "string"
End SyncLock
But will this?
SyncLock TheList.item(0)
    TheList.item(0) = "string"
End SyncLock
No, your second snippet is fundamentally wrong, because you are replacing the very object that you lock on. Another thread will then take a lock on a different object, so you have no thread safety at all. A lock only works if all threads use the exact same object to store the lock state.
Also notable is the kind of object you take the lock on. Your second snippet locks on an interned string literal, which is very bad, since it can easily cause deadlock: any other code anywhere else might make the same mistake and also take a lock on a string literal. If that literal happens to be "string" as well, you'll get a deadlock that is nearly impossible to diagnose.
Your first snippet has a related problem: other code might also take a lock on the TheList object, since it is probably public, producing deadlock for the same reason. The standard practice is to always use a dedicated object to store the lock state, one that isn't used for anything else and only ever appears in the code that accesses the list:
Private ListLock As Object = New Object
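Every reader and writer then has to take that same dedicated lock for it to help. A sketch, using the ListLock field above and assuming TheList is a List(Of String) field in the same class (the method names are made up for illustration):

Private TheList As New List(Of String) From {"initial"}

Private Sub UpdateFirstItem()
    SyncLock ListLock                   ' always the dedicated lock object, never the list or its items
        TheList.Item(0) = "string"
    End SyncLock
End Sub

Private Function ReadFirstItem() As String
    SyncLock ListLock                   ' readers take the same lock; a lock only helps if every access goes through it
        Return TheList.Item(0)
    End SyncLock
End Function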

Is this simple VB.Net class thread safe? If not, how can I improve it?

Option Strict On

Public Class UtilityClass
    Private Shared _MyVar As String

    Public Shared ReadOnly Property MyVar() As String
        Get
            If String.IsNullOrEmpty(_MyVar) Then
                _MyVar = System.Guid.NewGuid.ToString()
            End If
            Return _MyVar
        End Get
    End Property

    Public Shared Sub SaveValue(ByVal newValue As String)
        _MyVar = newValue
    End Sub
End Class
While locking is a good general approach to adding thread safety, in many scenarios involving write-once quasi-immutability, where a field should become immutable as soon as a non-null value is written to it, Threading.Interlocked.CompareExchange may be better. Essentially, that method reads a field and--before anyone else can touch it--writes a new value if and only if the field matches the supplied "compare" value; it returns the value that was read in any case. If two threads simultaneously attempt a CompareExchange, with both threads specifying the field's present value as the "compare" value, one of the operations will update the value and the other will not, and each operation will "know" whether it succeeded.
There are two main usage patterns for CompareExchange. The first is most useful for generating mutable singleton objects, where it's important that everyone see the same instance.
If _thing Is Nothing Then
    Dim NewThing As New Thingie() ' Or construct it somehow
    Threading.Interlocked.CompareExchange(_thing, NewThing, Nothing)
End If
This pattern is probably what you're after. Note that if a thread enters the above code between the time another thread has done so and the time it has performed the CompareExchange, both threads may end up creating a new Thingie. If that occurs, whichever thread reaches the CompareExchange first will have its new instance stored in _thing, and the other thread will abandon its instance. In this scenario, the threads don't care whether they win or lose; _thing will have a new instance in it, and all threads will see the same instance there. Note also that because there's no memory barrier before the first read, it is theoretically possible that a thread which has examined the value of _thing sometime in the past might continue seeing it as Nothing until something causes it to update its cache, but if that happens the only consequence will be the creation of a useless new instance of Thingie which will then get discarded when the Interlocked.CompareExchange finds that _thing has already been written.
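Wrapped up as a property, the first pattern might look like this (a sketch; Thingie and _thing come from the answer above, the rest is assumed):

Private Shared _thing As Thingie

Public Shared ReadOnly Property Thing As Thingie
    Get
        If _thing Is Nothing Then
            Dim NewThing As New Thingie()
            ' Store NewThing only if _thing is still Nothing; a thread that loses
            ' the race simply discards its instance and uses the shared one.
            Threading.Interlocked.CompareExchange(_thing, NewThing, Nothing)
        End If
        Return _thing
    End Get
End Property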
The other main usage pattern is useful for updating references to immutable objects, or--with slight adaptations--updating certain value types like Integer or Long.
Dim NewThing, WasThing As Thingie
Do
    WasThing = _thing
    NewThing = WasThing.WithSomeChange()
Loop While Threading.Interlocked.CompareExchange(_thing, NewThing, WasThing) IsNot WasThing
In this scenario, assuming there is some means by which, given a reference to a Thingie, one may cheaply produce a new instance that differs in some desired way, it's possible to perform any such operation on _thing in a thread-safe manner. For example, given a String, one may easily produce a new String with some characters appended. If one wished to append text to a string in a thread-safe manner (such that if one thread attempts to add Fred and another tries to add Joe, the net result is either FredJoe or JoeFred appended, never something like FrJoeed), the above code would have each thread read _thing, generate a version with its own text appended, and try to update _thing. If some other thread updated _thing in the meantime, the loop abandons the string it just constructed, builds a new one based on the updated _thing, and tries again.
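For instance, the Fred/Joe append scenario could look like this (a sketch; _text and AppendText are hypothetical names, not part of the question):

Private Shared _text As String = ""

Public Shared Sub AppendText(ByVal extra As String)
    Dim wasText, newText As String
    Do
        wasText = _text                  ' snapshot the current reference
        newText = wasText & extra        ' build the replacement from the snapshot
        ' Publish newText only if _text still holds the snapshot; otherwise loop and retry.
    Loop While Threading.Interlocked.CompareExchange(_text, newText, wasText) IsNot wasText
End Sub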
Note that while this approach isn't necessarily faster than the locking approach, it does offer an advantage: if a thread that has acquired a lock gets stuck in an endless loop or is otherwise waylaid, all other threads are blocked from the locked resource forever. By contrast, if the WithSomeChange() method above gets stuck in an endless loop, other users of _thing won't be affected.
With multithreaded code, the relevant question is: can shared state be modified from several threads without synchronisation? If so, the code isn't thread safe.
In your code, that’s the case: there are several places which mutate _MyVar and the code is therefore not thread safe. The best way to make code thread safe is almost always to make it immutable: immutable state is simply thread safe by default. Furthermore, code that doesn’t modify state across threads is simpler and usually more efficient than mutating multi-threaded code.
Unfortunately, it's impossible to tell without context whether (or how) your code could be made immutable. So we need to resort to locks, which are slow, error-prone (see the other answer for how easy it is to get wrong) and give a false sense of security.
The following is my attempt to make the code correct with using locks. It should work (but keep in mind the false sense of security):
Public Class UtilityClass
    Private Shared _MyVar As String
    Private Shared ReadOnly _LockObj As New Object()

    Public Shared ReadOnly Property MyVar() As String
        Get
            SyncLock _LockObj
                If String.IsNullOrEmpty(_MyVar) Then
                    _MyVar = System.Guid.NewGuid.ToString()
                End If
                Return _MyVar
            End SyncLock
        End Get
    End Property

    Public Shared Sub SaveValue(ByVal newValue As String)
        SyncLock _LockObj
            _MyVar = newValue
        End SyncLock
    End Sub
End Class
A few comments:
We cannot lock on _MyVar since we change the reference of _MyVar, thus losing our lock. We need a separate dedicated locking object.
We need to lock each access to the variable, or at the very least every mutating access. Otherwise all the locking is for naught since it can be undone by changing the variable in another place.
Theoretically we do not need to lock if we only read the value – however, that would require double-checked locking which introduces the opportunity for more errors, so I’ve not done it here.
Although we don’t necessarily need to lock read accesses (see previous two points), we might still have to introduce a memory barrier somewhere to prevent reordering of read-write access to this property. I do not know when this becomes relevant because the rules are quite complex, and this is another reason I dislike locks.
All in all, it’s much easier to change the code design so that no more than one thread at a time has write access to any given variable, and to restrict all necessary communication between threads to well-defined communication channels via synchronised data structures.
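For contrast, a minimal sketch of the immutable direction argued for above, under the assumption that the value only ever needs to be generated once and SaveValue isn't actually required:

Public Class UtilityClass
    ' Assigned exactly once by the type initializer and never reassigned,
    ' so reads need no locking at all.
    Private Shared ReadOnly _MyVar As String = System.Guid.NewGuid().ToString()

    Public Shared ReadOnly Property MyVar() As String
        Get
            Return _MyVar
        End Get
    End Property
End Class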

When is a static variable created?

Let's say I have a static variable in a function:
Private Sub SomeFunction()
    Static staticVar As String = _myField.Value
End Sub
When, exactly, is the value from _myField assigned to staticVar? Upon the first call to the function? Instantiation of the enclosing class?
The CLR has no support for this construct so the VB.NET compiler emulates it.
Creation first. The variable is lifted to a private field of the class, with an unspeakable name that ensures no name collisions can occur. It is an instance field if the variable is inside an instance method of the class, so it is created when an object of the class is created with the New operator. It is a Shared field if the method is Shared or part of a Module, so it is created in the loader heap by the jitter.
Assignment next, the much more involved operation. The language rule is that the assignment occurs when code execution lands on the Static declaration for the first time. The term "first time" is a loaded one; there's an enormous amount of code generated by the compiler inside the method to guarantee it. The problems that need to be addressed are threading, recursion and exceptions.
The compiler creates another hidden helper field, of type StaticLocalInitFlag, at the same scope as the hidden variable field to keep track of the variable's initialization state. The first part of the injected code is a call to Monitor.Enter() to deal with threading, the same thing SyncLock does. The StaticLocalInitFlag object serves as the lock; note that it is a class and not just a Boolean.
Next, a Try…Finally block guards against exceptions. Inside the Try block, StaticLocalInitFlag.State is checked. A value of 0 means the variable has not been initialized yet; State is then set to 2 to indicate that the initialization code is running, and the assignment follows. Finding State already at 2 means the initializer has re-entered on the same thread (recursion), in which case an IncompleteInitialization exception is thrown.
The Finally block then sets State to 1 to indicate that the variable is initialized, followed by a call to Monitor.Exit().
Lots of code, 96 bytes of IL for just a simple variable. Use Static only if you don't worry about the cost.
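Written out by hand, the machinery described above looks roughly like this (an approximation of what the compiler generates, not the exact emitted code; the helper types live in Microsoft.VisualBasic.CompilerServices):

Private staticVar As String   ' the lifted field (in reality it gets an unspeakable name)
Private staticVar_Flag As New Microsoft.VisualBasic.CompilerServices.StaticLocalInitFlag()

Private Sub SomeFunction()
    If staticVar_Flag.State <> 1 Then
        Threading.Monitor.Enter(staticVar_Flag)
        Try
            If staticVar_Flag.State = 0 Then
                staticVar_Flag.State = 2          ' initialization in progress
                staticVar = _myField.Value        ' the actual assignment
            ElseIf staticVar_Flag.State = 2 Then
                ' the initializer re-entered this method on the same thread
                Throw New Microsoft.VisualBasic.CompilerServices.IncompleteInitialization()
            End If
        Finally
            staticVar_Flag.State = 1              ' mark as initialized
            Threading.Monitor.Exit(staticVar_Flag)
        End Try
    End If
    ' ... rest of the method uses staticVar ...
End Sub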
A Static variable is initialized when it is assigned: when that line runs, not before and not after. Put a breakpoint on it and you'll see.
Static just means that the value persists between calls.
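An easy way to watch that persistence for yourself (a throwaway example, not from the question):

Private Sub CountCalls()
    Static callCount As Integer = 0      ' the initializer runs only on the first call
    callCount += 1
    Console.WriteLine("CountCalls has run " & callCount & " time(s)")   ' prints 1, 2, 3, ... across calls
End Sub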
Shared class/module members are initialized the first time the class is accessed, before the body of any Shared constructor and before the member call that triggered it, whether that is a constructor or some Shared method/property. I think this may be the origin of your confusion. Static local variables are not initialized in this way (and have no direct equivalent in C#).
Interesting information from Tim Schmelter: it appears the value is maintained internally using a hidden Shared class-level variable that is created like any other Shared class-level variable, but always with the default value.
The effect observed by you, the developer, is unchanged, apart from some instantiation delay when the class is accessed. This delay should be undetectable in practice.
The VB.NET compiler creates a static (shared in VB.NET) class-level variable to maintain the value of "staticVar". So it's initialized like any other static/shared variable, on the first use of a static field of that class (or when you call this method).
http://weblogs.asp.net/psteele/pages/7717.aspx

Object methods and state - which object-oriented design approach is best?

I need to write some instance method, something like this (code in ruby):
def foo_bar(param)
  foo(param)
  if some_condition
    do_bar(param)
  else
    do_baz(param)
  end
end
The foo_bar method is a public API.
But I think the param variable appears here too many times. Maybe it would be better to create a private instance variable and use it in the foo, do_bar and do_baz methods, like here (@param is an instance variable in Ruby; it can be initialized at any time):
def foo_bar(param)
  @param = param
  foo
  if some_condition
    do_bar
  else
    do_baz
  end
end
Which code is better? And why?
Is param replacing part of the state of the object?
If param is not changing the object state then it would be wrong to introduce non-obvious coupling between these methods as a convenience.
If param is altering the state of the object then it may still be bad practice to have a public api altering the state - much better to have a single private method responsible for checking and changing the state.
If param is directly setting the state of the object then I would change the instance variable here, but only after checking that the new state is not inconsistent.
The first version should be preferred for a couple of reasons. First, it makes testing much easier because each method is independent of other state. To test the do_bar method, simply create an instance of its containing class and invoke the method with various parameters. If you chose the second version, you'd have to make sure that the object had all the proper instance variables set before invoking the method. This tightly couples the test code to the object and results in broken test cases or, even worse, test cases that should no longer pass but still do because they haven't been updated to match how the object now works.
The second reason to prefer the first version is that it is a more functional style and facilitates reuse. Say another module or lambda implements do_bar better than the current one. It won't have been coded to assume some parent class with a particular instance variable name; to be reusable, it will expect any values it needs to be passed in as parameters.
The functional approach is the much better approach ... even in object oriented languages.
If you do not need param outside of the foo_bar method, the first version is better. It is more obvious what information is being passed around, and you are keeping it more thread-friendly.
And I also agree with Mladen in the comment above: don't add something to the object state that doesn't belong there.