VBA Garbage Collector Details - vba

I've found myself having to write some VBA code recently and just wondered if anyone had ever come across any details on how the VBA garbage collector works? The .Net GC is very well-documented indeed but I can't find a single shred of detail on the VBA GC, other than that vague mentions that it's a reference counter. I assume that it's pretty similar to the VB6 GC but can't find any information on that either.
Specifically, I'd be interested in knowing:
What triggers a GC
What algorithm it uses (is collection generational, for example?)
How (if at all) does it handle circular references?
Is there any way of monitoring its operation
This is more out of curiosity than any particular need to know, any insight at all much appreciated!

The following assumes that VBA is still using the same garbage collection mechanism used in VB6 (which it very probably does).
VB6 used a reference-counting GC. The GC is triggered deterministically when the last reference to a given object is set to Nothing. Setting local references to Nothing is unnecessary, this happens as they go out of scope.
Every object implements a COM interface that takes care of the reference count for that object. Each assignment of an object reference updates the reference counters of the involved references (i.e. the counter of old object that was previously referenced gets decremented, and the new object’s counter is incremented). An object is garbage collected when its reference counter reaches 0.
Objects in circular references are thus never collected during the lifetime of a VBA application. What’s more, VBA doesn’t offer a way to break circular references. In VB6, weak references could be implemented via WinAPI functions.

Related

VB.NET Track COM RCW error

I have a really big project that I can not easily strip down.
When the application is being closed, I get the error
"InvalidComObjectException: A COM object that has been disconnected from the RCW can not be used."
Details:
System.Runtime.InteropServices.InvalidComObjectException has occured.
HResult=-2146233049
Message=A COM object that has been disconnected from its RCW can not be used.
Source=mscorlib
StackTrace:
at System.StubHelpers.StubHelpers.StubRegisterRCW(Object pThis)
InnerException:
Unfortunately I can not see what COM object this is about.
Does anybody know how I can find that out? Unfortunately I can't read ASM to analyze the disassembly.
There are some steps you could try and since you haven't posted any code I will try to enumerate some of the most common key factors...
First, there is no order when disposing objects. If a close/dispose/finalize action is invoked, object C might be disposed before object A and if some object is still alive it might try to access an already disposed object.
Second, beware of events. It's very common to get errors about accessing a disposed object originated from an event call.
Third, do not dispose objects inside an event scope nor destructors. Create your own method to free your object(s).
Since you don't know which COM object is the culprit, I suggest you look for the ones that you do close, dispose, disconnect, etc...
You might want to read this blog in order to better understand how RCW's work and also to help you with your problem.
Edit:
After reading your comment, I felt I should add two possible causes:
If after removing Microsoft.VisualBasic runtime methods you've solved your problem, then I suspect that inside one, or more, of those methods you have incremented the count of a reference to one, or more, of your COM(s) and did not release properly.
Be sure to ReleaseComObject checking the reference count until it's 0 and then set it to nothing(VB) or null(C#) and let the Garbage Collector do the rest.
The other option is that one of those methods tried to access an already disposed reference to a COM object, resulting in a RCW error, since is no longer callable.
As a final comment, after releasing the COM object always set it to nothing or null, this will release the reference to the variable and you can always check it's availability anywhere in the code.

Is it necessary to call Marshal.ReleaseComObject in C#4 when doing COM?

I have VS2010 and added a reference to a COM library to my project and VS embedded a primary interop inside the project.
If I reference objects from the COM library and I want to dispose of them quickly without waiting for the GC, is it needed to call ReleaseComObject ?
Marshal.ReleaseComObject provides a way to immediately drop references to this COM object from everywhere it is being consumed within managed code (because it releases the underlying IUnknown from the RCW that is holding it).
As Hans notes, the generally right course is to simply null your object allow the CLR and GC to do the COM object destruction at the appropriate time.
However, in situations where you need immediate action (COM object holds expensive/scarce resources, or perhaps during shutdown where the sequencing is very complex), calling ReleaseComObject can be the right thing to do.
Martyn
There are two types of approaches in terms of cleaning memory.
Reference Counting Algorithms
Garbage Collection
COM components use reference counting algorithm to reclaim memory. Whenever you do not need a reference of a com component you can call it. But Usually my approach is creating a virtual stack and deleting references like the source code below in C#. .NET guarantees as long as RCW is alive COM components is alive. When RCW is garbage collected release method is invoked. It is not necessary to call release in your code. But it does not effect gc cycles.
void DoSth
{
{
RunTimeCallableWrapper area for the code
ReleaseComObject
}
}

Forcing Garbage Collection

Is there a way to force garbage collection in VBA/Excel 2000?
This question refers to the Macro language in Excel.
Not using VB .NET to manipulate Excel. So GC.collect() won't work
You cannot take advantage of garbage collection provided by the .NET Framework when using straight VBA. Perhaps this article by Eric Lippert will be helpful
You can't force GC in VBA, but it's good to set to Nothing the global variables.
The article mentioned by kd7 says it's useless to set to Nothing the local variables before they go out of scope, but doesn't talk about the global variables.
In VBA the global variables defined in a module remain alive through the whole Excel session, i.e. until the document containing the VBA module that defines them closed.
So don't put useless Set O = Nothing when O is local, but do it when it's global.
VBA/Excel does not have garbage collection, like old VB. Instead of GC, it uses reference counting. Memory is freed when you set a pointer to nothing (or when variable goes out of scope). Like in old VB it means that circular references are never freed.

Does assigning to Nothing cause Dispose to be invoked?

I recently saw some VB .NET code as follows:
Dim service = ...
Try
...
service.Close()
Finally
service = Nothing
End Try
Does assigning Nothing to service do anything? If it is a garbage collection issue, I am assuming that when "service" went out of scope the referenced object would be garbage collected and the dispose method called on the object.
It seems to me that assigning this variable Nothing can't really do anything, as there could be another reference to the object around so the reference counts haev to be checked anyways.
It only releases the reference, which may mean that the object is available for garbage collection (there could still be other variables referencing the same object). If the object implements IDisposable, you need to call Dispose explicitly, otherwise you may have a resource leak.
NO!
You're seeing old VB6 code, where assigning Nothing reduced the reference count on COM objects.
In most situations assigning null (Nothing) to a reference makes no difference to garbage collection what so ever.
The garbage collector doesn't care about scope, it only cares about usage. After the point in the code where the object is used the last time, the garbage collector knows that it can collect it because it won't be used any more.
Assigning null to the reference doesn't count as using the object, so the last point of usage is before that code. That means that when you clear the reference the garbage collector may already have collected the object.
(In debug mode though the usage of a variable is expanded to it's scope, so that the debugger can show the value of the variable throughout it's scope.)
Assinging NULL to a reference in .NET does not help to clean the object away. It might help the garbage collector to run a little quicker in some corner cases but that's not important at all. It does not call dispose, either (when dealing with a disposable)
I love to assign NULL anyways to explicitly state that I won't use that other object anymore. So it has much more to do with catching bugs (you'll get a nullreference exception instead of possibly calling into some other object - which might fail or even silently create some side effects.)
So assigning NULL after closing another object (File or whatever) is a "code cleanliness" thing that eases debugging, it's not a help to the garbage collector (except in some really strange corner cases - but when you need to care about that you WILL know more about the garbage collector than you ever wanted to know anyways ...)
As everybody has already said, setting to nothing does not force garbage collection, if you want to force GC then you would be far better to use the using ke word
Using objA As A = New A()
objA.DoSomething()
End Using
You still don't need to set to nothing as the End Using tells the Garbage collection that the object is no longer to be used
It's important to understand in .net (or Java) that a variable, field, or other storage location of class type Foo doesn't hold a Foo. It holds a reference to a Foo. Likewise, a List<Foo> doesn't hold Foos; it holds references to Foos. In many cases, a variable will be known by the programmer to hold the only extant reference to some particular Foo. Unfortunately, the compiler has no general means of knowing whether a storage location holds the only extant reference to an object, or whether it holds one of many.
The main rule about IDisposable is that objects which implements IDisposable should be told they are no longer need sometime between the moment they are in fact no longer needed, and the time that all references to them are abandoned. If an object hasn't been Disposed, and code is about to overwrite the only extant reference to it (either by storing null, or by storing a reference to something else), the object should have its Dispose method called. If there exist other reference to the object, and the holders of those references expect to keep using it, Dispose should not be called. Since the compiler can't tell which situation applies, it doesn't call Dispose but leaves that to the programmer (who hopefully has a better idea of whether or not to call it).

Is object clearing/array deallocation really necessary in VB6/VBA (Pros/Cons?)

A lot of what I have learned about VB I learned from using Static Code Analysis (Particularly Aivosto's Project Analyzer). And one one of things it checks for is whether or not you cleared all objects and arrays. I used to just do this blindly because PA said so. But now that I know a little bit more about the way VB releases resources, it seems to me that these things should be happening automatically. Is this a legacy feature from pre VB6, or is there a reason why you should explicitly set objects back to nothing and use Erase on arrays?
Matt Curland, author of Advanced Visual Basic 6, who knows more about Visual Basic than most of us ever will, thinks it is wasted effort. Consider this quote (p110) about DAO, the COM data access library that primarily targets the Access Database Engine:
another example of poor teardown code.
DAO has Close methods that must be
called in the correct order, and the
objects must be released in the
correct order as well (Recordset
before Database, for example). This
single poor object model behavior has
led to the misconception that VB leaks
memory unless you explicitly set all
the local variables to nothing at the
end of a function. This is a
completely false notion in a
well-designed object model. VB can
clear the variables faster at the End
Sub line than you can from code, and
it checks the variables even if you
explicitly release your references.
Any effort you make is duplicated.
The problem, as I understand it, has to do with the fact that VB6 (and its predecessors) has its roots in COM, and its reference-counting garbage collection system.
Imagine, for instance, that you declare a refernece to an object from a 3rd party library. That object has a COM reference count that is used both to keep it alive and to determine when it should be destroyed. It isn't destroyed when you set it to Nothing, but when the object's reference count reaches zero.
Now, not all COM components were written in Visual Basic. Some were written in C or C++. Structured exception handling didn't exist across all languages. So if an error occurred, the reference count on the object was not guaranteed to be properly reduced, and COM objects were known to hang around longer than they were intended to. This wasn't a problem with Visual Basic, per se. It was a COM problem. (And that, you might note, is why .NET doesn't use reference counting.)
That's why Visual Basic developers became obsessive about releasing object references prior to exiting routines. You simply don't know what a component you're allocating is creating under the hood. But when you release your reference to it, you're at least releasing your reference count to it. It became almost a religious mantra. Declare, use, release. It was the COM way of doing things.
Sure, Visual Basic might be better or faster at dereferencing variables I declared on the stack. But dammit, I want it to be OBVIOUS that those objects were released. A little assurance goes a long way when you're trying to track down a memory leak.
Have you read this Aivosto web page (from the creators of Project Analyzer)?
If you are using static variables,
it's important to reclaim the memory
they occupied when you don't need the
variables any more. With dynamic
variables memory isn't so much of a
problem, because they are destroyed
when the procedure ends.
In other words, you don't need to worry about clearing ordinary, non-static, local variables.
I always do it for good practice, you never know what an exception might do if you fall in one and your objects are not deallocated. You should relase them in finally statements and ensure they are not using any memory otherwise you may run into a memory leak.
I had an issue inside of a simple time off tracker system where the server kept on crashing randomly, it took weeks to determine it was a memory leak of an object that was supposed to self destruct on its own. My code was being thrown into an exception and never cleaned up after itself causing the server (the actual web site not the entire server) to go down.
Yes, set all objects to Nothing and clean up as much as you can. VB6 is notorious for having memory leaks when not cleaning up your stuff. Garbage collection was sub-par in VB6/VBA.