Variable declaration placement guidelines in VBScript

Is there any rule for where variable declarations go in VBScript, for example that they should always come at the beginning? Or can I declare a variable at the point where I first use it? Which one is more efficient?

Let's try a simple piece of code, with Option Explicit included so that the VBScript parser requires every variable used in the code to be declared:
Option Explicit
WScript.Echo TypeName( data )
WScript.Echo TypeName( MY_DATA )
Dim data : data = 10
Const MY_DATA = 10
WScript.Echo TypeName( data )
WScript.Echo TypeName( MY_DATA )
When executed it will output
Empty
Integer
Integer
Integer
That is
The first access to data does not generate any error. Variable declarations (the Dim statement) are hoisted, so as long as the variable is declared in the same (or an outer) scope as the code that uses it, there is no problem.
But the first output is Empty. Only the declaration is hoisted, not the value assignment, which is not executed until the line containing it is reached.
That does not apply to constant declarations. A constant's value is substituted into the code wherever it is used, but the real declaration is delayed until the Const line is reached (read here).
As long as the variables/constants can be reached (they are declared in the same or outer scope) it is irrelevant (to the VBScript parser/engine) where you place the declaration.
But, of course, you or others will have to maintain the code. Being able to put the variables anywhere doesn't mean you should do something like the previous code (please, don't). It is a lot easier to read/maintain the code if variable declaration is done before initialization/usage. The exact way of doing it just depends on coding style.

Related

What are the risks of declaring a variable in the middle of the code?

In almost all the VBA code I see, all variables are declared right after the Sub/Function name line.
I have declared variables in the middle of some of my own code (not inside a loop) and saw no problems.
I usually avoid it because most VBA example code declares them right after the first line. I just want to know what the risks are from an expert/experienced VB programmer's point of view.
There are no risks of declaring it in the middle.
The effect of declaring a variable in the middle is that it can only be used after that point and not before (which is scope).
The lifetime of the variable is different: the variable is created (allocated and initialized to its respective flavour of zero) when you enter the procedure, but you may not actually use it until you reach its scope (the point in the procedure where it's declared).
Declaring inside or outside a loop does not make a difference in VB6/VBA, because, unlike VB.NET, they do not have block scope.
So there is no performance difference between the two approaches (because all variables are created when you enter the procedure), but there is a difference in usage (you may not use a created variable before its declaration line). If you think that distinction is helpful in making sure you are not using a variable wrongly, declare your variables only where needed. Otherwise you are free to pick any of the two approaches, just apply it consistently (it's probably not a good idea to declare most of the variables in the beginning and then some in the middle).
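To illustrate the point about loops, here is a minimal sketch (the procedure and variable names are made up for this example). Because VB6/VBA have no block scope, the Dim inside the loop does not produce a fresh variable on each pass, so the value carries over between iterations:

```vba
Option Explicit

Public Sub LoopDimDemo()
    Dim i As Long
    For i = 1 To 3
        ' Dim is a declaration, not an executable statement:
        ' counter is created once, when the procedure is entered
        Dim counter As Long
        counter = counter + 1
        Debug.Print counter   ' prints 1, then 2, then 3
    Next i
End Sub
```

Anyone who assumes counter starts at zero on every pass will be surprised, which is one way a declaration inside a loop can mislead a reader.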
Declare your variables when you actually need them. When you have all declarations lumped at the top of the procedure, refactoring becomes much harder. And when you want to double-check a declaration as you read your code (or, perhaps, someone else's), searching for it at the top may again be quite inconvenient, unless your procedure is short.
I would try to declare variables in a location that conveys useful information to the next programmer, over and above being functionally correct. This normally means: follow the scoping rules of the language.
By declaring all variables at the top you are making them available (in scope) for the entire procedure. That increases the work for a reader in the future, trying to understand how they will be used. Better to have them as local as possible.
I would not declare them in a loop, since that would not actually have any significance in VB6/VBA, but someone else might find it confusing or misleading, or in the worst case it may cause subtle bugs.
Of course remember that this is not the only coding practice that we should be mindful of - if the procedure is so long that the location of the variable declarations is a big problem, that's a really good sign that the procedure should be broken up into smaller discrete logical blocks. The variable declarations would just be a symptom, not the main cause.
IMO there were many bad programming practices back in the 90s and earlier when VBA/VB6 were invented, but the industry has significantly learned & improved since then. So code from that era (or inspired by it) is often not a good example.
Declaring your variables up front, at the top of your sub/function makes it easy for others (and perhaps for you if you come by the code after, say a month) to read and understand what your code needs to calculate, and what placeholders/variables are required for the code to function.
You can of course declare variables anywhere (as long as you remember not to use a variable before you have actually declared it). That can work, and it has no effect whatsoever on the performance of your code (unless your logic includes an early Exit Sub or Exit Function, in which case there may be a difference in performance depending on whether your code actually allocates memory for the variables or not).
It just isn't good practice to declare some variables at the top then do some work, then declare another set of variables mid-code. There are exceptions of course. When the variable you declared mid-code is for a temporary use, or something like that.
Sub CalculateAge()
    Dim BirthYear As Integer
    Dim CurrentYear As Integer
    'Code to fetch current year
    'Code to get BirthYear from user/or document
    'Code to report result
End Sub
Compare that with the following:
Sub CalculateAge2()
    Dim BirthYear As Integer
    'Code to ask the user or fetch the birth year from the document
    Dim CurrentYear As Integer
    'Code to populate currentYear
    'Code to do the calculation and report result
End Sub
In the first example, there is a clear separation between variables and logic. In the second, everything is mixed.
The first example is a lot easier to read and understand, especially if you use a good naming convention.
If you look at how classes are written or defined, you will see properties usually are first declared, then methods/logic below. This is the common practice used to write code.
PS: In other languages, you can declare and assign variables in the same line. In C# or VB.NET you could say something like:
int Age = CurrentYear - BirthYear; //C#
Dim Age As Integer = CurrentYear - BirthYear 'VB.Net
This is great if you use a lot of temporary variables that you don't intend to declare ahead of time, or that would be clearer if declared mid-logic. But that's not possible in VBA. You need one statement to declare a variable, and another to assign a value. You end up with a lot of Dim ___ As ___ statements. You might as well move the declaration part somewhere else to reduce distraction while reading the logic. Again, this works best if you use a good and consistent naming convention. If not, you end up in a worse situation like:
Dim w As Integer
Dim a As Integer
a = 42 'we don't know what this variable is for
'but we know its type from the previous line
Some_Lines_Of_code_And_Logic
' more code
' more code
w = 2 'we don't know what (w) is for, and we have to
'look up its declaration to get a hint
'which might be tedious
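For what it's worth, VBA does let you put several statements on one line with a colon, so a declaration and its first assignment can sit together. This is still two statements, not an initializer, and whether it reads better is a matter of taste. A sketch, with made-up names and a hard-coded birth year standing in for real input:

```vba
Option Explicit

Public Sub CalculateAge3()
    ' Hypothetical input; a real procedure would ask the user or read a document
    Dim birthYear As Long: birthYear = 1990
    Dim currentYear As Long: currentYear = Year(Date)

    ' Declared right where the result is first needed
    Dim age As Long: age = currentYear - birthYear
    Debug.Print age
End Sub
```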

Extracting Variable Names from VBA Scripts

I want to get a list of the variables used in a script, e.g. VariableName13, strDLink, strZone.
A single file contains 150+ events, and each project contains about 700-900 files.
In the VBA environment, I want to parse each file, loop through each event, and extract the names of the variables declared or referenced by the events.
I did find some material on Roslyn and TypeLib, but I can't work out how to use them.
Can someone please share a proper approach to extracting the variables?
Environment: VBA 6, SCADA HMI
Private Sub Rect13_Click()
    Dim lResult As Long
    Dim strDLink As String
    Dim strZone As String
    On Error GoTo ErrorHandler
    lResult = OpenFuncUpdate
    If lResult = SomeValue Then
        'DoThis
    ElseIf lResult = SomeOtherValue Then
        strDLink = "FullPathLink"
        FuncDisassemblePath strDLink, , , , , , , , , , , , strZone
        If Len(strZone) > 0 And (InStr(VariableName13.CurrentValue, "%") = 0) Then
            SubLoadIA strZone & "%" & VariableName13.CurrentValue, Me
        Else
            SubLoadIA VariableName13.CurrentValue, Me
        End If
    End If
    Exit Sub
ErrorHandler:
    SubHandleError
End Sub
Depending on how you define what a "variable" is, you can try to parse VBA code with VBA code and regular expressions.
If all your declarations are consistently made, and consistently declare a single variable, and variables are consistently declared (Option Explicit is in every module), then capturing Dim|Private|Public|Friend|Global {identifier} should be good-enough... but that makes a lot of "ifs"
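Under all of those "ifs", the good-enough capture might be sketched like this. The function name, the demo procedure and the exact pattern are my own assumptions, not a vetted solution; it uses the late-bound VBScript.RegExp object (Windows only) and records just the first identifier of each single-line declaration:

```vba
Option Explicit

' Sketch: collects the first identifier of each simple, single-line declaration.
' It deliberately skips procedure, constant, type and enum headers.
Public Function ExtractDeclaredNames(ByVal sourceCode As String) As Collection
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.MultiLine = True
    re.IgnoreCase = True
    re.Pattern = "^\s*(?:Dim|Private|Public|Friend|Global|Static)\s+" & _
                 "(?!(?:Sub|Function|Property|Const|Declare|Type|Enum|Event)\b)" & _
                 "(?:WithEvents\s+)?([A-Za-z][A-Za-z0-9_]*)"

    Dim result As Collection
    Set result = New Collection

    Dim m As Object
    For Each m In re.Execute(sourceCode)
        result.Add m.SubMatches(0)
    Next m

    Set ExtractDeclaredNames = result
End Function

Public Sub ExtractDemo()
    Dim sample As String
    sample = "Private Sub Rect13_Click()" & vbCrLf & _
             "Dim lResult As Long" & vbCrLf & _
             "Dim strZone As String"

    Dim item As Variant
    For Each item In ExtractDeclaredNames(sample)
        Debug.Print item   ' lResult, then strZone
    Next item
End Sub
```

Note that this says nothing about identifiers that are referenced but never declared (such as VariableName13 in the question).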
Real-life projects have Dim statements that can declare a list of local or private variables. Or there's a ReDim statement somewhere that's actually declaring an array on-the-spot. Or they don't always specify Option Explicit and variables aren't always even declared at all. Or there's a line continuation in the middle of the statement that breaks the regular expression. Or, or, or... so many things can go wrong, parsing VBA code is a rabbit hole.
For example suppose you need to pick up and list undeclared variables. A regular expression can't tell its usage from a procedure or function call, because they're syntactically identical. Regular expressions are missing the context of the grammar - and it's by tokenizing (aka "lexing") the source code and then parsing the tokens using parser rules that we can be 100% certain of what we're looking at.
Fortunately this is a solved problem: there is free, open-source VBIDE tooling available for this that gets you 100% correct results every time, without writing a single line of code or worrying about which legal declarations might be left unaccounted for.
Rubberduck (I manage this OSS project and own its website) will correctly parse any legal VB6/VBA code (and if it doesn't, we're extremely interested in a repro!), and then you can simply click a "copy" button to instantly have every single declaration in the clipboard:
Press Ctrl+V to paste onto a worksheet (or into a Word document, or Notepad!) and then you can easily turn it into a filter-enabled table; in your case you'd want to filter the [Declaration Type] for "Variable":
Above, the exported declarations for a VBA project that has a Sheet1 module with a test procedure that uses (but doesn't declare) a variable named undeclared:
Sub test()
    undeclared = 42
    Debug.Print undeclared
End Sub
Here's the same table for the code you've provided:
Note that SubHandleError and other Sub and Function calls would parse as and resolve to a procedure/function in your project. Here they're being picked up as undeclared variables because I didn't parse anything other than the code you supplied, so these identifiers are undefined.

When are VBA Variables Instantiated

I'm hesitant to ask, but there's no documentation that I can find for VBA.
Relevant (but I don't think a dupe):
C++ When are global variables created?
In Java, should variables be declared at the top of a function, or as they're needed?
C++ Declare variables at top of function or in separate scopes?
and the most likely relevant When are a module's variables in VB.NET instantiated?
I also took a look at C# on programmers.SE.
I think I'm using the word "Instantiate" right, but please correct me if I'm wrong. Instantiating is when a variable is created and allocated the resources it requires? So in VBA I see two ways of doing this.
Everything at the top!
Public Sub ToTheTop()
    Dim var1 As Long
    Dim var2 As Long
    Dim var3 As Long
    var1 = 10
    var2 = 20
    var3 = var1 + var1
    Debug.Print var3
End Sub
Or close to use
Public Sub HoldMeCloser()
    Dim var1 As Long
    var1 = 10
    Dim var2 As Long
    var2 = 20
    Dim var3 As Long
    var3 = var1 + var1
    Debug.Print var3
End Sub
I like to put them closer to use so that it's easier to remember what they are, whereas others might want to get them all out of the way. That's personal preference.
But, I think I remember reading somewhere that the VBE goes through a sub/function and instantiates all the variables before going on to anything else. This would indicate that there's no right way to do this in VBA because the variable scopes in time don't change. Not the scope as in Private vs Public.
Whereas in other languages it seems that scope can change based on placement, and therefore there is a best practice.
I've been searching for this documentation for a while now, but whatever words I'm using aren't pointing me in the right direction, or the documentation doesn't exist.
According to the reference documentation,
When a procedure begins running, all variables are initialized. A numeric variable is initialized to zero, a variable-length string is initialized to a zero-length string (""), and a fixed-length string is filled with the character represented by the ASCII character code 0, or Chr(0). Variant variables are initialized to Empty. Each element of a user-defined type variable is initialized as if it were a separate variable.
When you declare an object variable, space is reserved in memory, but its value is set to Nothing until you assign an object reference to it using the Set statement.
The implication is that regardless of where the variable declaration is stated, the space/memory for it is allocated when the procedure is entered.
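The initial values described in that quote can be checked with a small sketch like the following (the procedure and variable names are made up for illustration). Each line prints True because nothing has been assigned yet:

```vba
Option Explicit

Public Sub ShowInitialValues()
    Dim n As Long
    Dim s As String
    Dim f As String * 3
    Dim v As Variant
    Dim o As Object

    Debug.Print n = 0                      ' numeric variables start at zero
    Debug.Print Len(s) = 0                 ' variable-length strings start as ""
    Debug.Print f = String$(3, Chr$(0))    ' fixed-length strings are filled with Chr(0)
    Debug.Print IsEmpty(v)                 ' Variants start as Empty
    Debug.Print o Is Nothing               ' object variables start as Nothing
End Sub
```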
Variables, constants, and objects are instantiated as follows:
at module level, they are instantiated when the application starts, whether they are declared public, private or static
at procedure level (sub/function), they are instantiated when the procedure is executed.
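A minimal sketch of the visible difference between the two levels, using made-up names: the module-level variable keeps its value between calls, while the procedure-level one is created and initialized again each time the procedure runs.

```vba
Option Explicit

Private callCount As Long       ' module level: survives between calls

Public Sub CountCalls()
    Dim localCount As Long      ' procedure level: starts at 0 on every call

    callCount = callCount + 1
    localCount = localCount + 1

    ' callCount keeps growing across calls; localCount is always 1
    Debug.Print callCount, localCount
End Sub
```

Running CountCalls three times in a row from the Immediate window shows the first number climbing while the second stays at 1.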
You have to understand that, although it does have a "compiler", VBA is NOT a true compiled language. The compiler is a syntax checker that checks your code for errors so that you don't encounter them at runtime. In MS Access the compiler produces something called p-code, which is a combination of compiled and interpreted code.
As a rule of thumb:
always use the Option Explicit statement (configure your compiler for this)
always declare your variables at one place, on top of your module or sub/function, and avoid doing it in the middle of your code, for the sake of clarity only. This doesn't affect the performance in any way.
avoid using variant data type
Worth a read doc:
Understanding the Lifetime of Variables (official MSDN), Visual/Access Basic Is Both a Compiler and an Interpreter (official MS) and Declaring variables. You might also find this answer I recently gave about the VBA garbage collector interesting.

Why should I use the DIM statement in VBA or Excel?

So there is a question on what DIM is, but I can't find why I want to use it.
As far as I can tell, I see no difference between these three sets of code:
'Example 1
myVal = 2
'Example 2
DIM myVal as Integer
myVal = 2
'Example 3
DIM myVal = 2
If I omit DIM the code still runs, and after 2 or 3 nested loops I see no difference in the output when they are omitted. Having come from Python, I like to keep my code clean*.
So why should I need to declare variables with DIM? Apart from stylistic concerns, is there a technical reason to use DIM?
* also I'm lazy and out of the habit of declaring variables.
Any variable used without declaration is of type Variant. While variants can be useful in some circumstances, they should be avoided when not required, because they:
Are slower
Use more memory
Are more error prone, either through misspelling or through assigning a value of the wrong data type
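As a quick illustration of the first sentence, this sketch (with a made-up procedure name) has to live in a module that does not contain Option Explicit. The undeclared variable is a Variant, so it starts out Empty and happily changes subtype:

```vba
' Assumption: this module must NOT contain Option Explicit,
' otherwise the undeclared variable is a compile error.
Public Sub ImplicitVariantDemo()
    Debug.Print TypeName(myVal)   ' Empty
    myVal = 2
    Debug.Print TypeName(myVal)   ' Integer
    myVal = "two"
    Debug.Print TypeName(myVal)   ' String
End Sub
```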
Using Dim makes the intentions of your code explicit and prevents common mistakes like a typo silently creating a new variable. If you put Option Explicit at the top of your module (which I thoroughly recommend), Dim becomes mandatory.
Here's an example of failing to use Dim causing a (potentially bad) problem:
myVar = 100
' later on...
myVal = 10 'accidentally declare new variable instead of assign to myVar
Debug.Print myVar 'prints 100 when you were expecting 10
Whereas this code will save you from that mistake:
Option Explicit
Dim myVar as Integer
myVar = 100
' later on...
myVal = 10 ' error: Option Explicit means you *must* use Dim
More about Dim and Option Explicit here: http://msdn.microsoft.com/en-us/library/y9341s4f.aspx
Moderators, I'm making an effort, assuming you'll treat me with due respect in the future.
All local variables (and most parameters to functions) are stored on the stack, as in most languages. When a sub exits, the stack is returned to how it was before the sub executed, so all that memory is freed. Strings and objects are stored elsewhere, in an object manager or string manager, and the stack contains a pointer, but VB looks after freeing them. Setting a VB string (a BSTR) to zero length frees all but two bytes. That's why we try to avoid global variables.
In scripting-type programs, typeless programming has many advantages. Programs are short and use few variables, so memory and speed don't matter; it will be fast enough. As programs get more complex, it does matter. VB was designed for typeless programming as well as typed programming. For most Excel macros, typeless programming is fine and is more readable. VBScript only supports typeless programming (and you can paste it into VBA/VB6).

VBA assigning new object to variable?

I'm trying to create a new object from a class module in VBA, and I've run into a small difficulty. Two assignment lines look the same, but the results are different.
I got an error message:
After that, I switched to using (1) instead of (2), and the error went away.
But I don't understand: why do they behave differently?
Dim declares a variable; Set assigns an object reference to it (New is what creates the instance).
So it's good practice to always have a Dim before the Set.
If you do not use Dim to declare the specific type of a variable, you may subsequently change the variable to another type. For example, after:
Set aosh = New AOSHRatioQuery
you could mutate the variable to a string:
aosh = "A pint of milk"
As the sendAsyncRequest method expects a AOSHRatioQuery as its 2nd argument & the VBA compiler knows that it cannot guarantee that the aosh variable will actually contain an instance of that type, type safety is violated & the Type Mismatch error is raised to prevent sendAsyncRequest from receiving garbage it cannot interpret.
Explicitly typing with Dim aosh As New AOSHRatioQuery tells the compiler that aosh is guaranteed to always be an AOSHRatioQuery instance or Nothing (attempting to assign a value of another type to it will raise an error), so it can be passed safely.
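A rough sketch of the difference, using the built-in Collection class as a stand-in for AOSHRatioQuery and a hypothetical CountItems procedure in place of sendAsyncRequest (neither substitution comes from the question):

```vba
Option Explicit

' Hypothetical stand-in for a method that expects a specific class
Private Sub CountItems(ByRef items As Collection)
    Debug.Print items.Count
End Sub

Public Sub TypedVersusUntyped()
    ' Untyped: a Variant can hold a string now and an object later
    Dim untyped As Variant
    untyped = "A pint of milk"
    Debug.Print TypeName(untyped)   ' String
    Set untyped = New Collection
    Debug.Print TypeName(untyped)   ' Collection

    ' Typed: can only ever be a Collection or Nothing
    Dim typed As Collection
    Set typed = New Collection
    CountItems typed                ' prints 0

    ' CountItems untyped            ' expected to be rejected: the declared types differ
End Sub
```

The last call is left commented out because the declared type of the argument (Variant) does not match the parameter's declared type, which is the situation the explanation above describes.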
In VBA, you declare variables using the Dim keyword and define their data types with the As keyword. That's just how the syntax works. As a general form:
Dim <variableName> As <dataType>