Can I safely check when a COM object has been released in VBA? - vba

I've been reviewing some code for creating weak-references in VBA by manually dereferencing object pointers without calling IUnknown::AddRef, and I've found a bug that I can't explain. I could come up with a minimal reproducible example using the pure API calls, but I think it's easier just to demonstrate using the WeakReference class from that review. Here's how I can crash Excel:
Dim strongRef As Range
Set strongRef = [A1]
Dim weakRef As New WeakReference
weakRef.Object = strongRef
Debug.Assert strongRef.address = weakRef.Object.address 'fine
Set strongRef = Nothing
Debug.Assert weakRef.Object Is Nothing 'fine if step through, crashes if F5
This is buggy behaviour; the WeakReference class is designed in such a way that once the parent reference is destroyed, the weak reference should return Nothing rather than attempt blindly to dereference the parent ObjPtr which would now be pointing to an invalid object instance. The way it does this is explained in detail in the linked question, but essentially it caches the parent object's VTable pointer, then uses this to check the VTable pointer is still valid before every dereference. Basically the class relies on the fact that when the parent object goes out of scope, its memory is reclaimed and so the VTable pointer is overwritten with something else.
That should stop this kind of bug. However it doesn't, and I'm wondering why...
It was my understanding that
Set strongRef = Nothing
1. Calls IUnknown::Release
2. This sets the ref count to zero, the object goes out of scope
3. The object is responsible for releasing its own instance memory, so it uses the this pointer (first arg to IUnknown::Release) to zero the instance memory (including the VTable pointer) and free it for use by the VBA memory allocator again
4. Finally the value at VarPtr(strongRef) is set to zero to indicate it is a null object reference
However I think the bug is happening because the instance memory is not reset as soon as the reference count hits zero, so perhaps VBA's implementation of IUnknown::Release marks the memory as "dirty" to be cleared up at a later date by an asynchronous garbage collector? I'm just guessing here. The thing is, if I step through the code line by line then it works fine, or if you hold the WeakReference in a child class then it works fine (see the examples in the linked post).
UPDATE
I just tried, with a custom VBA class for strongRef, e.g.
Class1
Option Explicit
Static Property Get address() As Double
Dim value As Double
If value = 0 Then value = [RandBetween(1,1e10)]
address = value
End Property
...then I don't get a crash! So it's definitely something to do with specific implementations of IUnknown::Release and is probably why the author of that code never noticed the bug.

Related

Access an application and instantiate an object

I am trying to create an instance of a specific class called ExtraScreen from a referenced library application called EXTRA. How can I use the SendKeys function from the ExtraScreen class?
So far I tried this:
Dim software As EXTRA.ExtraScreen
software.SendKeys ("a")
The result is Error:
Object variable or With block variable not set.
You have to also Set it to something:
Dim software As EXTRA.ExtraScreen
Set software = CreateObject("EXTRA.ExtraScreen")
or
Dim software As New EXTRA.ExtraScreen
You have declared an object variable with a specific, early-bound type - which means if the code can compile & run, then the project has a reference to the type library.
Dim software As EXTRA.ExtraScreen
Dim statements aren't executable: you can't put a breakpoint on a Dim statement. All they do is allocate a spot of a given size in memory. In this case, it reserves a spot wide enough to hold an object reference - and nothing else.
On execution, the first statement to execute is this:
software.SendKeys ("a")
But the problem is, if you put a breakpoint here and inspect the Locals toolwindow, you'll find that the software object contains Nothing. In other languages this is known as a "null reference" - the object variable is not set: there's a reserved spot for holding an object reference, but the spot is empty.
You use the New keyword to create an instance of a class - i.e. to create an object. And in VBA object reference assignments require the Set keyword:
Set software = New EXTRA.ExtraScreen
Now if you run that line and inspect your locals, you'll find that software isn't Nothing anymore, and you can inspect its state / properties.
Once an object variable holds a proper object reference, you can legally invoke its members:
software.SendKeys "a"
You can never invoke anything on a Nothing object reference: an object variable that is Nothing is, well, nothing: it's not an object, therefore it has no members to invoke. The VBA runtime responds by raising run-time error 91, "Object variable or With block variable not set".
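A minimal sketch of both the failure and a guard against it. It uses the built-in Collection class in place of EXTRA.ExtraScreen (an assumption, so the snippet runs without the EXTRA library), and the procedure name is hypothetical:

```vba
Option Explicit

' Hypothetical demo procedure; paste into a standard module and run.
Public Sub DemoUnsetObjectVariable()
    Dim items As Collection          ' declared, but still holds Nothing

    ' Guard: test the reference before invoking members on it.
    If items Is Nothing Then
        Set items = New Collection   ' Set assigns the object reference
    End If

    items.Add "a"                    ' safe: items now refers to an object
    Debug.Print items.Count

    Set items = Nothing              ' the variable no longer refers to an object
    On Error Resume Next
    items.Add "b"                    ' member call on Nothing raises an error
    Debug.Print Err.Number           ' 91
    On Error GoTo 0
End Sub
```

The On Error Resume Next is there only to show the error number in the Immediate window; in real code the Is Nothing test is the cleaner way to avoid the error.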
The "With block variable" part is referring to the With keyword, which can also hold objects. For example you could do this:
With software
.SendKeys "a"
End With
And you'd get the exact same error if software isn't Set. Consider dropping the local variable altogether, if it only ever needs to live as a local variable inside some specific procedure:
With New EXTRA.ExtraScreen
.SendKeys "a"
End With
In this case the With block is holding the object reference; at End With, the object is gone (avoid jumping in & out of With blocks, specifically for that reason).

Declaring Variables Memory Leaks

I am wondering what would be the most correct way to deal with memory when using VBScript. Should I declare all variables right before I use them, or at the beginning of the program? I understand global vs local, however in my script all variables are local. I know that memory leaks will never be a problem when writing in VBScript 99.9% of the time, but I am also curious as to the 'best' way to clear and release memory within a script. By 'best' I mean the timing of clearing variables/objects (right after you are done using them vs the end of the script), etc.
An example:
Dim fso: Set fso = CreateObject("Scripting.FileSystemObject")
Dim arrList : Set arrList = CreateObject("System.Collections.ArrayList")
Dim objDict: Set objDict = CreateObject("Scripting.Dictionary")
Dim objEmail : Set objEmail = CreateObject("CDO.Message")
Dim someArray(), x, y, z, item
It's best practice to declare all variables, but not for the reason you assume. VBScript is sufficiently good at cleaning up after itself, so memory leaks usually aren't an issue. Most of the time you don't even need to release objects (Set var = Nothing) because they're automatically destroyed when leaving the context.
The reason why you still want to declare your variables is that you want to use Option Explicit in your scripts (which enforces variable declarations), so that you can avoid problems due to mistyped or otherwise uninitialized variables. Without Option Explicit VBScript would automagically create missing variables and initialize them with an empty/zero value. Silly example:
Dim foo : foo = 3
Dim bar : bar = 1
Do
bar = bar + fo 'mistyped variable, initialized as empty/0
Loop Until bar > 10
WScript.Echo bar
Running the above would create an infinite loop. If you add Option Explicit the script will instead immediately terminate with a runtime error:
C:\path\to\your.vbs(5, 3) Microsoft VBScript runtime error: Variable is undefined: 'fo'
The VBScript garbage collector runs at the end of every line to clear implicit variables and at the end of every procedure (End Sub, End Function, and End Property) to clear explicit variables. Objects are similar but have added constraints. It works similarly to VBA's garbage collector. By contrast JScript waits until 30,000 objects have gone out of scope before running and freeing memory.
An implicit variable is an unnamed variable. MsgBox LCase(UCase("String")) has two implicit variables: the result of UCase("String"), which is passed to LCase(implicitVar1), which returns implicitVar2, which is passed to MsgBox. An explicit variable is declared either by Dim or just by using it, as in A=5, which creates an explicit variable called A.
VBScript, on the other hand, has a much simpler stack-based garbage collector. Scavengers are added to a stack when they come into scope, removed when they go out of scope, and any time an object is discarded it is immediately freed.
https://blogs.msdn.microsoft.com/ericlippert/2003/09/17/how-do-the-script-garbage-collectors-work/
VBScript’s garbage collector is completely different. It runs at the end of every statement and procedure, and does not do a search of all memory. Rather, it keeps track of everything allocated in the statement or procedure; if anything has gone out of scope, it frees it immediately
https://blogs.msdn.microsoft.com/ericlippert/2004/12/22/t4-vbscript-and-the-terminator/
Also
https://blogs.msdn.microsoft.com/ericlippert/2004/04/28/when-are-you-required-to-set-objects-to-nothing/
https://blogs.msdn.microsoft.com/ericlippert/2004/03/01/syntax-semantics-micronesian-cults-and-novice-programmers/
The CPU is a stack-based machine (and VBScript a stack-based virtual machine). When the CPU calls a function, the calling program puts the parameters and the return address on the stack, adjusts the stack frame and does a jump. The callee function creates local variables on the stack and also places the return value on it. When it returns, the stack pointer is adjusted back to where it was, which automatically frees all of the above.

Instantiating a variable with Nothing, then assigning a New object instance

Looking through some old VB.Net code, I noticed a strange pattern that is making me scratch my head.
Dim objMyObject As Namespace.Child.ChildType = Nothing
objMyObject = New Namespace.Child.ChildType
(There is no additional code between the dimension and the assignment.)
It seems like the preferred style would be to do both on one line, or else skip the = Nothing. As follows:
Dim objMyObject As Namespace.Child.ChildType = New Namespace.Child.ChildType
OR
Dim objMyObject As Namespace.Child.ChildType
objMyObject = New Namespace.Child.ChildType
OR, as suggested by @helrich
Dim objMyObject As New Namespace.Child.ChildType
Is there any particular value to doing it this way, or is this an instance of the original programmer being used to the VB6 way of doing things?
In VB6, dimensioning and instantiating a variable on one line was considered problematic because it would rerun the instantiation (if necessary) when the variable was accessed - effectively, a variable dimensioned in this way could never be tested for Nothing, because a new instance would be created on demand. However, VB.Net does not preserve this convention.
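The on-demand instantiation described above can be seen in classic VB. This is a small sketch in VBA, which shares the VB6 behaviour; the procedure and variable names are hypothetical:

```vba
Option Explicit

' Hypothetical demo procedure; paste into a standard module and run.
Public Sub DemoAutoInstancing()
    Dim autoItems As New Collection   ' auto-instancing declaration
    Dim plainItems As Collection      ' ordinary declaration

    Set autoItems = Nothing
    ' Referring to autoItems creates a fresh instance on demand,
    ' so this test can never report Nothing.
    Debug.Print autoItems Is Nothing   ' False

    Set plainItems = New Collection
    Set plainItems = Nothing
    Debug.Print plainItems Is Nothing  ' True
End Sub
```

In VB.Net the equivalent As New declaration simply runs the constructor once at the point of declaration, so the variable can later be set to and tested for Nothing as usual.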
No, this is pointless. The CLR already provides a hard guarantee that variables are initialized to Nothing.
It is otherwise completely harmless, the jitter optimizer will completely remove the code for the assignment. So if the original author preferred that style then that's okay. Maybe he was a former C# programmer that didn't understand the definite assignment rules in that language. VB.NET does some checking too but it isn't nearly as strict. Do check if this is a team standard that you are supposed to follow as well, hopefully not.
In the first example, there's no need to separate the declaration and assignment.
But I was wondering here (a hypothesis): Since you should split this way when you want to persist the variable in the stack when it is assigned in a code block (e.g: If statement), maybe once upon a time this block existed and it was removed keeping a constant association to it.
Its association, though, was not merged with its declaration.
About associating Nothing to an empty variable: I personally like this pattern. :)
It tells me (in future maintenance) that the variable was declared with an empty (null) value on purpose. It eliminates the doubt that I, maybe, forgot to write the New keyword before the type.
Ahh, and it will also eliminate a vb.net warning during build.

When should an Excel VBA variable be killed or set to Nothing?

I've been teaching myself Excel VBA over the last two years, and I have the idea that it is sometimes appropriate to dispose of variables at the end of a code segment. For example, I've seen it done in this bit adapted from Ron de Bruin's code for transferring Excel to HTML:
Function SaveContentToHTML (Rng as Range)
Dim FileForHTMLStorage As Object
Dim TextStreamOfHTML As Object
Dim TemporaryFileLocation As String
Dim TemporaryWorkbook As Workbook
...
TemporaryWorkbook.Close savechanges:=False
Kill TemporaryFileLocation
Set TextStreamOfHTML = Nothing
Set FileForHTMLStorage = Nothing
Set TemporaryWorkbook = Nothing
End Function
I've done some searching on this and found very little beyond how to do it, and in one forum post a statement that no local variables need to be cleared, since they cease to exist at End Sub. I'm guessing, based on the code above, that may not be true at End Function, or in other circumstances I haven't encountered.
So my question boils down to this:
Is there somewhere on the web that explains the when and why for variable cleanup, and I just have not found it?
And if not can someone here please explain...
When is variable cleanup for Excel VBA necessary and when it is not?
And more specifically... Are there specific variable uses (public variables?
Function-defined variables?) that remain loaded in memory for longer
than subs do, and therefore could cause trouble if I don't clean
up after myself?
VB6/VBA uses a deterministic approach to destroying objects. Each object stores the number of references to itself. When the number reaches zero, the object is destroyed.
Object variables are guaranteed to be cleaned (set to Nothing) when they go out of scope, this decrements the reference counters in their respective objects. No manual action required.
There are only two cases when you want an explicit cleanup:
1. When you want an object to be destroyed before its variable goes out of scope (e.g., your procedure is going to take a long time to execute, and the object holds a resource, so you want to destroy the object as soon as possible to release the resource).
2. When you have a circular reference between two or more objects.
If objectA stores a reference to objectB, and objectB stores a reference to objectA, the two objects will never get destroyed unless you break the chain by explicitly setting objectA.ReferenceToB = Nothing or objectB.ReferenceToA = Nothing.
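A rough sketch of the circular-reference case, assuming a hypothetical class module named Node with a single Other field. The two parts go into separate modules, as marked in the comments:

```vba
' --- Class module named Node (hypothetical) ---
Option Explicit

Public Other As Node

Private Sub Class_Terminate()
    ' Runs when the last reference to this instance is released.
    Debug.Print "Node terminated"
End Sub

' --- Standard module ---
Option Explicit

Public Sub DemoCircularReference()
    Dim a As Node
    Dim b As Node
    Set a = New Node
    Set b = New Node

    ' Each object now holds a reference to the other.
    Set a.Other = b
    Set b.Other = a

    ' Break the cycle before the locals go out of scope.
    ' Without this line neither Class_Terminate runs at End Sub.
    Set a.Other = Nothing
End Sub
```

With the cycle broken, both instances report termination when the procedure exits; comment out the last assignment and the Immediate window stays empty, because each object keeps the other's reference count above zero.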
The code snippet you show is wrong. No manual cleanup is required. It is even harmful to do a manual cleanup, as it gives you a false sense of more correct code.
If you have a variable at a class level, it will be cleaned/destroyed when the class instance is destructed. You can destroy it earlier if you want (see item 1.).
If you have a variable at a module level, it will be cleaned/destroyed when your program exits (or, in case of VBA, when the VBA project is reset). You can destroy it earlier if you want (see item 1.).
Access level of a variable (public vs. private) does not affect its life time.
VBA uses a garbage collector which is implemented by reference counting.
There can be multiple references to a given object (for example, Set aw = ActiveWorkbook creates a new reference to the active workbook), so the garbage collector only cleans up an object when it is clear that there are no other references. Setting to Nothing is an explicit way of decrementing the reference count. The count is implicitly decremented when you exit scope.
Strictly speaking, in modern Excel versions (2010+) setting to Nothing isn't necessary, but there were issues with older versions of Excel (for which the workaround was to explicitly set)
I have at least one situation where the data is not automatically cleaned up, which would eventually lead to "Out of Memory" errors.
In a UserForm I had:
Public mainPicture As StdPicture
...
Set mainPicture = LoadPicture(PAGE_FILE)
When the UserForm was destroyed (after Unload Me), the memory allocated for the data loaded in mainPicture was not being de-allocated. I had to add an explicit
Set mainPicture = Nothing
in the terminate event.

VB.NET, Is Object Returned by Reference from Function

This should be a fairly common question, but I haven't found a straightforward answer anywhere.
If I instantiate an object within a function in VB.NET and return it, does it return it by reference or by value? That is, should I be worried about performance if I write something like this:
Public Function ret_obj_func() As big_obj
Dim ret_obj As New big_obj(<lots of stuff>)
Return ret_obj
End Function
If I call this function from somewhere else, will it instantiate the object in ret_obj and then create a deep copy to pass back to the caller, or will it just pass back a reference?
There are two dichotomous issues here with similar vocabulary involved: value versus reference types, and passing variables by value versus by reference.
Value v. Reference Types
The first issue is value versus reference types. Value types are passed around through copying - usually. The value types are:
Date
Char
U/Int(16/32/64)
Decimal
Single and Double
Boolean
Structs
Enums
All but the above-listed types are reference types. When an object gets passed around, what is actually being passed is its memory address, which is often thought of as an int on 32 bit platforms and a long on 64 bit platforms.
Passing by Value v. By Reference
The second issue is passing a variable by value versus reference.
A variable is a slot at a certain position in memory that can hold stuff. For value types, it holds the actual value. For reference types, it holds the memory address of the object on the heap (or is Nothing).
By Value
When you pass a variable by value, whatever is at that variable's memory location is what gets copied. For value types, that means the value itself is copied. For reference types, what gets copied is the memory address of the object referred to by the variable.
By Reference
Remember that a variable is just a slot in memory for holding stuff. When you pass a variable by reference, you are passing the address of that slot (as opposed to the data in that slot).
If that variable is a value type, that slot holds the value itself, so the thing being passed is a pointer to the value.
If that variable is a reference type, the slot is holding a pointer to the object's location in memory, so the thing being passed is a pointer to your variable (just like with value types), which itself contains another pointer (not like value types) which leads to the memory location that holds the object referred to by the variable.
This allows a function to modify a variable in another function, like this:
Sub IPassByReference()
Dim myVariable As Boolean = False
IReceiveByReference(myVariable)
Debug.Print(myVariable.ToString()) 'Always prints True
End Sub
Sub IReceiveByReference(ByRef flag As Boolean)
flag = True 'the memory address of myVariable was passed.
End Sub
Let's compare to the situation where you pass by value:
Sub IPassByValue()
Dim myVariable As Boolean = False
IReceiveByValue(myVariable)
Debug.Print(myVariable.ToString()) 'Always prints False
End Sub
Sub IReceiveByValue(ByVal flag As Boolean)
flag = True 'the value of myVariable was passed.
End Sub
In the above example, Boolean is a value type. If it were an object, IReceiveByReference would have the power to point myVariable to an entirely new object, because it received the address of myVariable - not the address of the object to which myVariable points. By contrast, IReceiveByValue was only passed the contents of myVariable, so it cannot change myVariable to point to a new object. It could still change the object by setting its fields and properties and calling its methods, though.
Return By-Reference?
Although functions can pass variables by reference, they cannot return them that way. When a function returns, its local variables do not exist anymore (or are pending cleanup if they are heap-allocated). Functions therefore always return by value; because the local variables don't have valid memory addresses anymore, there aren't any variable references to return.
Putting it all together, when you return an object from a function, the only thing that is copied is the address of the object. When you return a value type from a function, the value itself is copied.
This means reference types behave essentially like value types when returned, wherein the value is the memory address of an object on the heap (or Nothing).
It just passes back a reference (assuming big_obj is a class). I wouldn't use the term "by reference" here, as that has a subtly different meaning when it comes to parameter passing - but assuming big_obj is a class - a reference type - the value of ret_obj is a reference, and that reference will be what's returned.
I don't have any articles on this from a VB perspective, but if you're happy to look at C#, you may find these articles useful:
Reference and value types
Parameter passing in C#
VB.NET does not have the ability to return by reference. Neither does C# for that matter, but it has been proposed. What you actually get back is just a reference to the object. So to precisely define this it returns a reference to the object. It does not return by reference like what you might compare to the ByRef keyword.