Difference between Long and Object data type in VBA - vba

In VBA, the Long and Object data type are both 4-bytes, which is the size of a memory address. Does this mean that, technically, the Object data type doesn't do anything that a Long couldn't do? If yes, then is it safe to say that the Object data type exists simply to make it easier for the programmer to distinguish between the purpose of the variable?
This question came up as I was considering Win32 API function declarations. They are often times declared as Long, and, unless I am mistaken, their return value is simply a memory address. Seems like defining these functions as Object would have been more appropriate, then.
Am I totally off? Thanks in advance.

Based on VBA/MSDN help:
Long (long integer) variables are stored as signed 32-bit (4-byte)
numbers ranging in value from -2,147,483,648 to 2,147,483,647.
and the other definition:
Object variables are stored as 32-bit (4-byte) addresses that refer to
objects. Using the Set statement, a variable declared as an Object can
have any object reference assigned to it.
From practical point of view they are both different and used in different situation. Which are essential: Long >> refers to numbers and Object >> refers to object.
Look into the following VBA code (for Excel) where I added comments which is allowed and which is not:
Sub test_variables()
Dim A As Object
Dim B As Long
'both below are not allowed, throwing exceptions
'A = 1000
'Set B = ActiveSheet
'both are appropriate
Set A = ActiveSheet
B = 1000
End Sub
Finally, in terms of API it's better to stay with original declaration and not manipulate with that to avoid any risk on unexpected behaviour of API functions.

Related

What VBA variable type to use when reading values from a cell in Excel?

According to this answer one should always use Variant when assigning values in a cell to a variable in the code. Is this correct? I seem to recall reading elsewhere that using Variant indiscriminately is not a good practice.
You can read a cell value into any type you want, VBA will (try to) implicitly convert it to that type for you.
There are dozens of questions on this site involving run-time errors raised from reading cell values into a specific data type - perhaps you've seen this error message before?
Type mismatch
That's the error you get when you try to read a cell containing an error value (e.g. #REF!) into anything other than a Variant.
So if you read a cell value into, say, a Double, everything will work fine as long as you're reading something that VBA can coerce into that data type. The problem is that, well, data is never 100% clean, worksheets do break down, users delete columns and break formulas, lookups fail and the person that wrote the formula didn't bother wrapping it with IFERROR, etc.
That's why you read cell values into a Variant.
That doesn't mean you work with a Variant.
Dim cellValue As Variant
cellValue = someRange.Value
If IsError(cellValue) Then Exit Sub 'bail out before we blow up
Dim workingValue As String
workingValue = CStr(cellValue)
By assigning to another data type, you effectively cast the Variant to that more specific type - here a String. And because you like explicit type conversions, you use VBA's conversion functions to make the conversion explicit - here CStr.
Now, in real code, you probably wouldn't even bother reading it into a Variant - you can use IsError to test the cell value:
If IsError(someRange.Value) Then Exit Sub 'bail out before we blow up
Dim cellValue As String
cellValue = someRange.Value ' or cellValue = CStr(someRange.Value)
The flipside here is that you're accessing the cell twice. Whether or not that's better that reading it into a Variant is for you to decide; performance-wise, it's usually best to avoid accessing ranges as much as possible though.
The value you get from a cell (which is a Range) is a Variant according to the documentation:
Range.Value Property (Excel)
Returns or sets a Variant value that represents the value of the specified range.
Since a Variant can represent different data types, you could loose information if you would assign a cell's value to -- for instance -- a variable of type String.
The mere fact that there is data type information in a Variant already means you lose that type of information. If for instance the original type was numeric and you store it in a String variable, there is no way to know from that string value what the original data type was. You could also lose precision (on Date milliseconds for instance).
Furthermore, a Variant type value cannot always be cast to the data type of your variable, and so you could get a Type mismatch error. In practice this often happens with the Error sub data type.
Only when you know beforehand what the data type is of a certain cell's value, it would be good to define your receiving variable in that data type.
Not strictly answering your question, but thought I'd add this for reference anyway.
With native Excel functions you can usually provide either a range object or a value directly to a function. For example, you can either write =AVERAGE(A1,A2,A3) or =AVERAGE(10,20,30). If you want to do something similar for any user defined functions, you will need to check the type of object passed to your function:
Function test(input As Variant)
Dim var As Variant
If TypeName(input) = "Range" Then
var = input.Value
Else
var = input
End If
You may also want to check for other objects if your function can accept them, but doing this will make your functions behave more like users expect them to.

Why must Set be used to hold objects?

While working on my PowerPoint macro, I noticed the following:
To obtain the current, active slide:
Dim currSlide As Slide
Set currSlide = Application.ActiveWindow.View.Slide
To obtain a newly created Textbox:
Dim textbox As Shape
Set textbox = currSlide.Shapes.AddTextbox(...)
I'm new to VBA, having worked with Java, C++ & C#. Why must Set be used above? Why does using Slide & Shape instead generate errors? In what way is VBA different in how variable declaration work in this respect?
This is taken from Byte Comb - Values and References in VBA
In VBA, the difference between value types and reference types is made explicit by requiring the keyword Set when assigning a reference. In addition, you will often see assigning a reference referred to as “binding”.
Byte Comb goes into great detail about the inter working of the VBA. I highly recommend that any serious about programming using the VBA read through it.
In layman's terms: Set is used so that the compiler knows that the programming wants to have a variable reference an Object type. In this way, the compiler will know to throw an error when you are trying to assign a value to an object.
Another reason is that objects can have default values.
In this example I have a variable named value that is of a Variant type. Variants can be of any type in the VBA. Notice how the compiler knows that to assign the value of the Range when value = Range("A1") and to set a reference to the Range when the keyword Set is used in Set value = Range("A1").

Returning the Default Property

I fill A1 and A2 as follows:
I then run:
Sub WhatIsGoingOn()
Dim r As Range, sh As Worksheet
Set r = Range(Cells(1, 1))
Set sh = Sheets(Cells(2, 1))
End Sub
I expected that in both cases, VBA would use the default property of Cells (the Value) property to Set each variable. However I get a runtime error 13 on the last line of code!
In order to avoid errors, I must use:
Sub WhatIsGoingOn2()
Dim r As Range, sh As Worksheet
Set r = Range(Cells(1, 1))
Set sh = Sheets(Cells(2, 1).Value)
End Sub
What is going on here ??
The difference is in how the input to their default properties is handled by the implementation of the Range and Sheets objects.
The default property of both the Range and the Sheets object takes a parameter of type Variant. You can pass anything to it, so no type coercion will be necessary. In your first example you pass a Range object to both.
How the default properties handle the input is up to themselves. Apparently the property of the Range tries to retrieve the default value of the passed parameter, in your example an address as String. The Sheets object doesn't seem to be so forgiving and raises an error because you neither passed a number nor a String.
Inconsistency is one of the strengths of VBA...
Btw., passing CStr(Cells(2, 1)) would also work, because you explicitly cast to String before passing as a parameter.
Perhaps Leviathan's comment that "Inconsistency is one of the strengths of VBA..." may ring true, but there are some contextual details that his answer neglects (and is technically incorrect on some subtle points). He is correct that for the given code that all of the parameters are variants, but the statement that "no type coercion will be necessary" can be misleading and perhaps just wrong in many cases. Even if many objects and methods are programmed to handle multiple types and default values, this very question reveals that it is a mistake to avoid purposely ensuring (or coercing) the correct data type. Avoiding default properties altogether (by always typing out the full reference) can avoid many headaches.
A significant difference between the lines of code for this particular question is this: Range is a property that takes parameters, while Sheets is also a property but has no parameters. Range and Sheets are NOT objects in this context, even though they are properties which do return Range and Sheets objects, respectively. They are properties of an (automatic global) object defined for the particular module or Excel workbook instance. This detail is not trivial for understanding what the code is actually doing.
The Obect Browser in the VBA window reveals the following metadata for the two properties:
Property Range(Cell1, [Cell2]) As Range
Property Sheets As Sheets
For Range(Cells(1, 1)), the argument Cells(1,1) is passed to the parameter Cell1. Cells itself is a property of the Excel.Global hidden object and it returns a Range object, so that Cells(rowindex, colindex) is calling a hidden default property of the Range class equivalent to Cells._Default(rowindex, colindex). The return type of the property _Default() is not declared, so technically it could return any type in a variant, but inspection shows that it returns a Range object. Apparently passing a Range object to its default property will attempt to take the default value and if it is a valid range value, like a string with a range expression, then it will execute without error.
The default property for the Sheets class is the parameterized hidden _Default(Index) method. Thus, Sheets(Cells(2, 1)) is equivalent to Sheets._Default(Cells(2, 1)). More importantly, it means that Sheets._Default(Cells(2, 1)) is passing a Range object as an index value, but documentation says that it expects an integer or string value. We already mentioned that the index parameter is variant... and when passing an object to a variant, it always passes the actual object and never its default property. So we know that Sheets.Item obtains a Range object in that call. Here is were Levithan was correct in that Sheets.Item can decide what to do with it. It is likely that it could have been smart enough to get the single string value and continue without error. Other collection objects (with a default Item(index) property) in MS Office objects do not seem to exhibit this same "pickiness", so it appears that Sheets._Default() (and perhaps Sheets.Item()) is being rather strict on validating its arguments. But this is just a particular design issue with this method only... not necessarily an overall issue with VBA.
What can be difficult is determining exactly what the source objects of the properties are. Inside the ThisWorkbook module, Me.Sheets reveals that Sheets is a property of the particular Workbook for the module. But Me.Range is not valid in the Workbook module, but right-clicking on the Range property (without the Me qualifier) and choosing "Definition" results in the message "Cannot jump to Range because it is hidden". However, once in the Object Browser, righ-clicking within the browser one can choose "Show Hidden Members" which will then allow navigating to the hidden Global objects and other hidden members.
Why the inconsistency and the hidden properties? In an attempt to make the current instance of Excel and all of its components accessible in a "natural" way, Excel (and all Office applications) implement these various hidden properties to avoid the "complexity" of having to repeatedly discover and type out full references. For instance, the automatic global Application object also has both Range and Sheets properties. Actual documentation for Application.Sheets, for example, says "Using this property without an object qualifier is equivalent to using ActiveWorkbook.Sheets". Even that documentation fails to say is that ActiveWorkbook is in turn a property of the global Excel Application object'.

When are VBA Variables Instantiated

I'm hesitant to ask, but there's no documentation that I can find for VBA.
Relevant (but I don't think a dupe):
C++ When are global variables created?
In Java, should variables be declared at the top of a function, or as they're needed?
C++ Declare variables at top of function or in separate scopes?
and the most likely relevant When are a module's variables in VB.NET instantiated?
I also took a look at C# on programmers.SE.
I think I'm using the word "Instantiate" right, but please correct me if I'm wrong. Instantiating is when a variable is created and allocated the resources it requires? So in VBA I see two ways of doing this.
Everything at the top!
Public Sub ToTheTop()
Dim var1 As Long
Dim var2 As Long
Dim var3 As Long
var1 = 10
var2 = 20
var3 = var1 + var1
Debug.Print var3
End Sub
Or close to use
Public Sub HoldMeCloser()
Dim var1 As Long
var1 = 10
Dim var2 As Long
var2 = 20
Dim var3 As Long
var3 = var1 + var1
Debug.Print var3
End Sub
I like to put them closer to use so that it's easier to remember what they are, whereas others might want to get them all out of the way. That's personal preference.
But, I think I remember reading somewhere that the VBE goes through a sub/function and instantiates all the variables before going on to anything else. This would indicate that there's no right way to do this in VBA because the variable scopes in time don't change. Not the scope as in Private vs Public.
Whereas in other languages it seems that scope can change based on placement and therefor has a best practice.
I've been searching for this documentation for a while now, but whatever words I'm using aren't pointing me in the right direction, or the documentation doesn't exist.
According to the reference documentation,
When a procedure begins running, all variables are initialized. A numeric variable is initialized to zero, a variable-length string is initialized to a zero-length string (""), and a fixed-length string is filled with the character represented by the ASCII character code 0, or Chr(0). Variant variables are initialized to Empty. Each element of a user-defined type variable is initialized as if it were a separate variable.
When you declare an object variable, space is reserved in memory, but its value is set to Nothing until you assign an object reference to it using the Set statement.
The implication is that regardless of where the variable declaration is stated, the space/memory for it is allocation when the procedure is entered.
The variables, constants, and objects, are instantiated that way :
at module level they are instantiated when the application starts, whether they are declared public, private or static
at procedure level (sub/function) they are instantiated when the procedure is executed.
You have to understand that, although it does have a "compiler", vba is NOT a true compiled language. The compiler is a syntax checker that checks for errors in your code to not encounter them at runtime. In MS access the compiler produce something that is called p-code and which is a combination of compiled and interpreted code.
As a rule of thumb:
always use option explicit statement (configure your compiler for this)
always declare your variables at one place, on top of your module or sub/function, and avoid doing it in the middle of your code, for the sake of clarity only. This doesn't affect the performance in any way.
avoid using variant data type
Worth a read doc:
Understanding the Lifetime of Variables (official mSDN), Visual/Access Basic Is Both a Compiler and an Interpreter (official MS) and Declaring variables. You might also find interesting this answer I recently gave about the vba garbage collector

Why should I use the DIM statement in VBA or Excel?

So there is a question on what DIM is, but I can't find why I want to use it.
As far as I can tell, I see no difference between these three sets of code:
'Example 1
myVal = 2
'Example 2
DIM myVal as Integer
myVal = 2
'Example 3
DIM myVal = 2
If I omit DIM the code still runs, and after 2 or 3 nested loops I see no difference in the output when they are omitted. Having come from Python, I like to keep my code clean*.
So why should I need to declare variables with DIM? Apart from stylistic concerns, is there a technical reason to use DIM?
* also I'm lazy and out of the habit of declaring variables.
Any variable used without declaration is of type Variant. While variants can be useful in some circumstances, they should be avoided when not required, because they:
Are slower
Use more memory
Are more error prone, either through miss spelling or through assigning a value of the wrong data type
Using Dim makes the intentions of your code explicit and prevents common mistakes like a typo actually declaring a new variable. If you use Option Explicit On with your code (which I thoroughly recommend) Dim becomes mandatory.
Here's an example of failing to use Dim causing a (potentially bad) problem:
myVar = 100
' later on...
myVal = 10 'accidentally declare new variable instead of assign to myVar
Debug.Print myVar 'prints 100 when you were expecting 10
Whereas this code will save you from that mistake:
Option Explicit
Dim myVar as Integer
myVar = 100
' later on...
myVal = 10 ' error: Option Explicit means you *must* use Dim
More about Dim and Option Explicit here: http://msdn.microsoft.com/en-us/library/y9341s4f.aspx
Moderators, I'm making an effort, assuming you'll treat me with due respect in thefuture.
All local variables are stored on the stack as with all languages (and most parameters to functions). When a sub exits the stack is returned to how it was before the sub executed. So all memory is freed. Strings and objects are stored elsewhere in a object manager or string manager and the stack contains a pointer but vb looks after freeing it. Seting a vbstring (a bstr) to zero length frees all but two bytes. That's why we try to avoid global variables.
In scripting type programs, typeless programming has many advantages. Programs are short and use few variables so memory and speed don't matter - it will be fast enough. As programs get more complex it does matter. VB was designed for typeless programming as well as typed programming. For most excel macros, typeless programming is fine and is more readable. Vbscript only supports typeless programming (and you can paste it into vba/vb6).