Best practice when validating class properties in VBA? - vba

I haven't worked much with classes, so I think this is a beginner's question.
I have a class that has the following property:
Private pAvtalsslut As Date
''''''''''''''''''''''
' Avtalsslut property
''''''''''''''''''''''
Public Property Get Avtalsslut() As Date
Avtalsslut = pAvtalsslut
End Property
Public Property Let Avtalsslut(Value As Date)
pAvtalsslut = Value
End Property
When I set property values to objects of this class I use the following validation in my subs:
If IsDate(exportWks.Cells(r, lColumnAvtalsslut)) Then
avtal.Avtalsslut = exportWks.Cells(r, lColumnAvtalsslut)
End If
I do this because otherwise I would get an error when the cell I read from is empty.
When I get property values from objects of this class I use the following validation in my subs:
If avtal.Avtalsslut <> 0 Then
wUnderlag.Cells(row, 3) = avtal.Avtalsslut
End If
I do this because I don't want to write zeros where there are no dates. I want to leave cells blank in that case.
Now to my question. What are some best practices for these kinds of validations? Should I have themin my class or in my subs? If they should be in my class, how should that look?
(PS.Is validation the correct vocabulary for these kinds of checks?)

Generally speaking, it's best to let the class validate the data before getting or setting a property, and then raise an error if the data doesn't meet specifications. This way you don't have to repeat that validation code all over your code base anytime you use your class.
However, in this case, you're trying to avoid a type mismatch error. So, you would need to change the property type to Variant instead of Date and then you're going to surprise and confuse anyone who uses your class expecting to give/receive dates. This violates the principal of least surprise and makes the class useless if you need to use it for anything but reading/writing to a spreadhsheet.
So, neither. Let's go for option C.
Create a second class (say, MyClassReaderWriter) that has the job of acting as an intermediary between your existing class and the spreadsheet. This allows for two things.
You don't have to make any changes to your existing class, and it remains useful in other contexts.
It dries up your code by placing the logic into one specific place.

Related

Do I understand not using getters and setters correctly

After reading this piece by Yegor about not using getters and setters, it sounds like something that makes sense to me.
Please note this question is not about whether doing it is better/worst, only if I am implementing it correctly
I was wondering in the following two examples in VBA, if I understand the concept correctly, and if I am applying it correctly.
The standard way would be:
Private userName As String
Public Property Get Name() As String
Name = userName
End Property
Public Property Let Name(rData As String)
userName = rData
End Property
It looks to me his way would be something like this:
Private userName As String
Public Function returnName() As String
returnName = userName
End Function
Public Function giveNewName(newName As String) As String
userName = newName
End Function
From what I understand from the two examples above is that if I wanted to change the format of userName (lets say return it in all-caps), then I can do this with the second method without changing the name of the method that gives the name through - I can just let returnName point to a userNameCaps property. The rest of my code in my program can still stay the same and point to the method userName.
But if I want to do this with the first example, I can make a new property, but then have to change my code everywhere in the program as well to point to the new property... is that correct?
In other words, in the first example the API gets info from a property, and in the second example the API gets info from a method.
Your 2nd snippet is neither idiomatic nor equivalent. That article you link to, is about Java, a language which has no concept whatsoever of object properties - getFoo/setFoo is a mere convention in Java.
In VBA this:
Private userName As String
Public Property Get Name() As String
Name = userName
End Property
Public Property Let Name(rData As String)
userName = rData
End Property
Is ultimately equivalent to this:
Public UserName As String
Not convinced? Add such a public field to a class module, say, Class1. Then add a new class module and add this:
Implements Class1
The compiler will force you to implement a Property Get and a Property Let member, so that the Class1 interface contract can be fulfilled.
So why bother with properties then? Properties are a tool, to help with encapsulation.
Option Explicit
Private Type TSomething
Foo As Long
End Type
Private this As TSomething
Public Property Get Foo() As Long
Foo = this.Foo
End Property
Public Property Let Foo(ByVal value As Long)
If value <= 0 Then Err.Raise 5
this.Foo = value
End Property
Now if you try to assign Foo with a negative value, you'll get a runtime error: the property is encapsulating an internal state that only the class knows and is able to mutate: calling code doesn't see or know about the encapsulated value - all it knows is that Foo is a read/write property. The validation logic in the "setter" ensures the object is in a consistent state at all times.
If you want to break down a property into methods, then you need a Function for the getter, and assignment would be a Sub not a Function. In fact, Rubberduck would tell you that there's a problem with the return value of giveNewName being never assigned: that's a much worse code smell than "OMG you're using properties!".
Functions return a value. Subs/methods do something - in the case of an object/class, that something might imply mutating internal state.
But by avoiding Property Let just because some Java guy said getters & setters are evil, you're just making your VBA API more cluttered than it needs to be - because VBA understands properties, and Java does not. C# and VB.NET do however, so if anything the principles of these languages would be much more readily applicable to VBA than Java's, at least with regards to properties. See Property vs Method.
FWIW public member names in VB would be PascalCase by convention. camelCase public member names are a Java thing. Notice how everything in the standard libraries starts with a Capital first letter?
It seems to me that you've just given the property accessors new names. They are functionally identical.
I think the idea of not using getters/setters implies that you don't try to externally modify an object's state - because if you do, the object is not much more than a user-defined type, a simple collection of data. Objects/Classes should be defined by their behavior. The data they contain should only be there to enable/support that behavior.
That means you don't tell the object how it has to be or what data you want it to hold. You tell it what you want it to do or what is happening to it. The object itself then decides how to modify its state.
To me it seems your example class is a little too simple to work as an example. It's not clear what the intended purpose is: Currently you'd probably better off just using a variable UserName instead.
Have a look at this answer to a related question - I think it provides a good example.
Regarding your edit:
From what I understand from the two examples above is that if I wanted
to change the format of userName (lets say return it in all-caps),
then I can do this with the second method without changing the name of
the method that gives the name through - I can just let returnName
point to a userNameCaps property. The rest of my code in my program
can still stay the same and point to the method iserName.
But if I want to do this with the first example, I can make a new
property, but then have to change my code everywhere in the program as
well to point to the new property... is that correct?
Actually, what you're describing here, is possible in both approaches. You can have a property
Public Property Get Name() As String
' possibly more code here...
Name = UCase(UserName)
End Property
or an equivalent function
Public Function Name() As String
' possibly more code here...
Name = UCase(UserName)
End Function
As long as you only change the property/function body, no external code needs to be adapted. Keep the property's/function's signature (the first line, including the Public statement, its name, its type and the order and type of its parameters) unchanged and you should not need to change anything outside the class to accommodate.
The Java article is making some sort of philosophic design stance that is not limited to Java: The general advise is to severely limit any details on how a class is implemented to avoid making one's code harder to maintain. Putting such advice into VBA terms isn't irrelevant.
Microsoft popularized the idea of a Property that is in fact a method (or two) which masquerade as a field (i.e. any garden-variety variable). It is a neat-and-tidy way to package up a getter and setter together. Beyond that, really, behind the scenes it's still just a set of functions or subroutines that perform as accessors for your class.
Understand that VBA does not do classes, but it does do interfaces. That's what a "Class Module" is: An interface to an (anonymous) class. When you say Dim o As New MyClassModule, VBA calls some factory function which returns an instance of the class that goes with MyClassModule. From that point, o references the interface (which in turn is wired into the instance). As #Mathieu Guindon has demonstrated, Public UserName As String inside a class module really becomes a Property behind the scenes anyway. Why? Because a Class Module is an interface, and an interface is a set of (pointers to) functions and subroutines.
As for the philosophic design stance, the really big idea here is not to make too many promises. If UserName is a String, it must always remain a String. Furthermore, it must always be available - you cannot remove it from future versions of your class! UserName might not be the best example here (afterall, why wouldn't a String cover all needs? for what reason might UserName become superfluous?). But it does happen that what seemed like a good idea at the time the class was being made turns into a big goof. Imagine a Public TwiddlePuff As Integer (or instead getTwiddlePuff() As Integer and setTwiddlePuff(value As Integer)) only to find out (much later on!) that Integer isn't sufficient anymore, maybe it should have been Long. Or maybe a Double. If you try to change TwiddlePuff now, anything compiled back when it was Integer will likely break. So maybe people making new code will be fine, and maybe it's mostly the folks who still need to use some of the old code who are now stuck with a problem.
And what if TwiddlePuff turned out to be a really big design mistake, that it should not have been there in the first place? Well, removing it brings its own set of headaches. If TwiddlePuff was used at all elsewhere, that means some folks may have a big refactoring job on their hands. And that might not be the worst of it - if your code compiles to native binaries especially, that makes for a really big mess, since an interface is about a set of function pointers layed out and ordered in a very specific way.
Too reiterate, do not make too many promises. Think through on what you will share with others. Properties-getters-setters-accessors are okay, but must be used thoughtfully and sparingly. All of that above is important if what you are making is code that you are going to share with others, and others will take it and use it as part of a larger system of code, and it may be that these others intend to share their larger systems of code with yet even more people who will use that in their even larger systems of code.
That right there is probably why hiding implementation details to the greatest extent possible is regarded as fundamental to object oriented programming.

How can I create an "enum" type property value dynamically

I need to create a property in a class that will give the programmer a list of values to choose from. I have done this in the past using the enums type.
Public Enum FileType
Sales
SalesOldType
End Enum
Public Property ServiceID() As enFileType
Get
Return m_enFileType
End Get
Set(ByVal value As enenFileType)
m_enFileType = value
End Set
End Property
The problem is I would like to populate the list of values on init of the class based on SQL table values. From what I have read it is not possible to create the enums on the fly since they are basically constants.
Is there a way I can accomplish my goal possibly using list or dictionary types?
OR any other method that may work.
I don't know if this will answer your question, but its just my opinion on the matter. I like enums, mostly because they are convenient for me, the programmer. This is just because when I am writing code, using and enum over a constant value gives me not only auto-complete when typing, but also the the compile time error checking that makes sure I can only give valid enum values. However, enums just don't work for run-time defined values, since, like you say, there are compile time defined constants.
Typically, when I use flexible values that are load from an SQL Table like in your example, I'll just use string values. So I would just store Sales and SalesOldType in the table and use them for the value of FileType. I usually use strings and not integers, just because strings are human readable if I'm looking at data tables while debugging something.
Now, you can do a bit of a hybrid, allowing the values to be stored and come from the table, but defining commonly used values as constants in code, sorta like this:
Public Class FileTypeConstants
public const Sales = "Sales"
public const SalesOldType = "SalesOldType"
End Class
That way you can make sure when coding with common values, a small string typo in one spot doesn't cause a bug in your program.
Also, as a side note, I write code for and application that is deployed internally to our company via click-once deployment, so for seldom added values, I will still use an enum because its extremely easy to add a value and push out an update. So the question of using and enum versus database values can be one of how easy it is to get updates to your users. If you seldom update, database values are probably best, if you update often and the updates are not done by users, then enums can still work just as well.
Hope some of that helps you!
If the values aren't going to change after the code is compiled, then it sounds like the best option is to simply auto-generate the code. For instance, you could write a simple application that does something like this:
Public Shared Sub Main()
Dim builder As New StringBuilder()
builder.AppendLine("' Auto-generated code. Don't touch!! Any changes will be automatically overwritten.")
builder.AppendLine("Public Enum FileType")
For Each pair As KeyValuePair(Of String, Integer) In GetFileTypesFromDb()
builder.AppendLine(String.Format(" {0} = {1}", pair.Key, pair.Value))
End For
builder.AppendLine("End Enum")
File.WriteAllText("FileTypes.vb", builder.ToString())
End Sub
Public Function GetFileTypesFromDb() As Dictionary(Of String, Integer)
'...
End Function
Then, you could add that application as a pre-build step in your project so that it automatically runs each time you compile your main application.

Newbie question: how do I create a class to hold data in Visual Basic Studio?

I'm really sorry. This must seem like an incredibly stupid question, but unless I ask I'll never figure it out. I'm trying to write a program to read in a csv file in Visual Basic (tried and gave up on C#) and I asked a friend of mine who is much better at programming than I am. He said I should create a class to hold the data that I read in.
The problem is, I've never created a class before, not in VB, Java, or anything. I know all the terms associated with classes, I understand at a high level how classes work no problem. But I suck at the actual details of making one.
So here's what I did:
Public Class TsvData
Property fullDataSet() As Array
Get
Return ?????
End Get
Set(ByVal value As Array)
End Set
End Property
End Class
I got as far as the question marks and I'm stuck.
The class is going to hold a lot of data, so I made it an array. That could be a bad move. I don't know. All i know is that it can't be a String and it certainly can't be an Integer or a Float.
As for the Getter and Setter, the reason I put the question marks in is because I want to return the whole array there. The class will eventually have other properties which are basically permutations of the same data, but this is the full set that I will use when I want to save it out or something. Now I want to return the whole Array, but typing "Return fullDataSet()" doesn't seem like a good idea. I mean, the name of the property is "fullDataSet()." It will just make some kind of loop. But there is no other data to return.
Should I Dim yet another array inside the property, which already is an array, and return that instead?
Instead of writing your own class, you could get yourself familiar with the pre-defined class System.Data.DataTable and then use that for holding CSV data.
In the last few years that I've been programming, I've never actually used a multi-dimensional array, and I'd advise you not to use them, either. There's usually ways of achieving the same with a better data structure. For example, consider creating a class (let's call it CsvRecord) that holds only one record; that is, only one line from the CSV file. Then use any of the standard collection types from the System.Collections.Generic namespace (e.g. List(Of CsvRecord)) to hold the entire data (ie. all lines) in the CSV file. This effectively reduces the problem to, "How do I read in one line of CSV data?"
If you want to take suggestion #2 even further, do as cHao says and don't simply lay out the information you've read as a CsvRecord; instead, create an object that reflects the actual content. For example, if your CSV file contains product–price information, call your CSV record class ProductInfo or something more fitting.
If, however, you want to go on with your current approach, you will need a backing field for the property, as demonstrated by Philipp's answer. Your property then becomes a "façade" that only delegates to this backing field. This is not absolutely necessary: You could simply make the backing field Public and let the user of your class access it directly, though that is not considered a good practice.
Ideally, you ought to have a class representing the specific data you want to read in. Setting an entire array at once is asking for trouble; some programs that read {C,T}SV files will freak out if all rows don't have the same number of columns, which is exceedingly easy to do if you can set the data to be an array of arbitrary length.
If you're trying to represent arbitrary data, frankly, you'd do just as well to use a List(Of String). If it's meant to be a table, you could instead read in the first line and make it a list as above (let's call it "headers"), and then make each row a Dictionary(Of String, String). (Let's call each row "row", and the collection (a list of these dictionary objects) "rows".) Just read in the line, split it like you did the first, and say something like row(headers(column number)) = value for each column, and then stuff it into 'rows'.
Or, you could use the data classes (System.Data.DataTable and System.Data.DataSet would do wonders here).
Usually you use a private member to store the actual data:
Public Class TsvData
Private _fullDataSet As String()
Public Property FullDataSet() As String()
Get
Return _fullDataSet
End Get
Set(ByVal value As String())
_fullDataSet = value
End Set
End Property
Note that this is an instance of bad design since it couples a concept to a concrete representation and allows the clients of the class to modify the internals without any error checking. Returning a ReadOnlyCollection or some dedicated container would be better.

VB Classes Best Practice - give all properties values?

Sorry if this is a bit random, but is it good practice to give all fields of a class a value when the class is instanciated? I'm just wondering if its better practice to have a constuctor that takes no parameters and gives all the fields default values, or whether fields that have values should be assigned and others left alone until required?
I hope that makes sense,
Becky
I don't know if it makes a performance difference or not, but any fields for which you have explicit default values I personally prefer to assign them in the declarations, as so:
Public Class MyClass
Private pIsDirty As Boolean = False
Private pDated as Date = Now()
End Class
Keep in mind most "simple" types like boolean, integer, etc. auto-default and don't NEED to be initialized, but I show that here as example and sometimes for clarity you want it anyway. Additionally since any classes I write are all for internal use (we don't sell any code objects for public use) I can be assured to the consumer of my classes. So I generally just write a minimal constructor (if a non-default one is needed) that only takes the primary fields, and spin up any additional values with the new With syntax in VB as so:
Dim myObj = New SomeClass() With { .Prop1 = "value", .Prop2 = Now() }
Your class' constructor should accept enough parameters to be in an usable state.
You can get the same functionality you seem to be looking for by using Optional Parameters in your constructor.
That way you can set by name just the properties that you have to, and leave the rest with default values until you need to change them.
Sub Notify(ByVal Company As String, Optional ByVal Office As String = "QJZ")
If Office = "QJZ" Then
Debug.WriteLine("Office not supplied -- notifying Headquarters")
Office = "Headquarters"
End If
' Code to notify headquarters or specified office.
End Sub
Remember that optional parameters must be after all non optional parameters.
Ideal practice is to have an object in a usable state as soon as the constructor returns. This reduces errors whereby a partially 'ready' object is inadvertently used.
It depends. How do you intend to use the class? What is the purpose of the class? (i.e. is it an entity class for database modeling or some other class)
I always make my entity classes with nullable properties that are all null when the class is instanced with a constructor that takes no parameter. Then when I call .Load I know all properties reflect the database.
Adding a constructor that calls .Load and assigns all of the properties known values from the database would be a feasible route also.
If you are not referring to an entity modeling class then it really depends on your usage of the class.
My personal preference is to assign all properties a known value (from constructor parameters) and therefore the class is in a known - neutral state.

Best Practice on local use of Private Field x Property

When inside a class you have a private fiels and expose that field on a public property, which one should I use from inside the class?
Below you is an example on what I am trying to find out.
Should manioulate the Private Field _Counter or the Property Counter?
Public Class Test
Private _Counter As Integer
Public Property Counter() As Integer
Get
Return _Counter
End Get
Set(ByVal value As Integer)
_Counter = value
End Set
End Property
Private Sub Dosomething()
'What is the best practice?
'Direct access to private field or property?
'On SET
_Counter += 1
'OR
Me.Counter += 1
'On Get
Console.WriteLine(_Counter)
Console.WriteLine(Me.Counter)
End Sub
End Class
Thanks in advance for the help.
Edu
IMO you should be using the Property accessor whenever possible. This is because you don't have to worry about any internal logic that might be available when you have an a property.
A good example of where this happens is in the code behind in a Linq DataContext.
check this out...
[Column(Storage="_ReviewType", DbType="TinyInt NOT NULL")]
public byte ReviewType
{
get
{
return this._ReviewType;
}
set
{
if ((this._ReviewType != value))
{
this.OnReviewTypeChanging(value);
this.SendPropertyChanging();
this._ReviewType = value;
this.SendPropertyChanged("ReviewType");
this.OnReviewTypeChanged();
}
}
}
Notice all that logic in the 'setter'?
This is why it's important to start getting into the practice of calling your Properties instead of fields, IMO.
Thank you all for the answers and suggestions.
After considering all the suggestions here plus other researches it is my impression that for this situation on Private Field versus Assessor it is more of a personal choice. So basically the most important is that no matter what you choose be consistent.
That said; my personal rule is leaning towards this:
Access your private fields directly.
If accessing accessors use the keyword ME. to improve readability
Use the accessor only if it implements vital logic logic that also applies to private access. This way you know that if you are using the accessor it is because there is "something else to it"
Avoid using Protected Fields. Derived classes should always use the accessor, never direct access to the field.
Let me know what you think.
SideNote:
After this I think we are missing a new scope for the class level fields. A keyword like “Restricted” where this field could only be accessed from its getter/setter. This way you always access directly the private fields, but if you need to make sure certain field can only be accessed by its accessor that you change the Private to Restricted. (how about "Restricted , RestrictedRead and RestrictedWrite"?)
In my opinion, using a public accessor internally is over-encapsulation: it blurs the code. With such an approach, otherwise simple operations invoke accessors that may contain more complex logic, so it's harder to analyze the code of the operations.
In my programming experience, I've rarely had a situation when it would help much. Instead, I prefer to access fields directly, and only if it's really needed, to abstract the access by creating a private accessor, which can be used by both the public accessor and other functions. The rationale is that if you need to attach some special logic in the public accessor, chances are that the logic may not be the same for internal access.
Note also that most modern IDEs (like Eclipse) allow to see immediately all references to a private field, and to refactor the code to use a function instead of a direct access.
I always use the property accessors, because the I am safe in case I add logic in the getter or setter in the future, knowing for sure that no code bypasses it.
I prefer to use the property whenever possible. This gives you the flexibility in the future to modify what the property returns/sets without having to go through and find all the locations that were using the private variable.
Use the private field because you are not doing something in specific in the setter.
I would also recommend to remove the property-setter, this way you force the state of the counter to be set by the given method DoSomething()
Depending on the situation, it may be preferable to allow the direct modification of a field on a class only privately, and or through some method which associates semantics with the modification. This way it becomes easier to reason about this class and that particular value, since you can be certain that its modified only in a certain way. Moreover, at some point, an action such as incrementing and int may have additional required consequences at which point it makes more sense to expose access to it through methods.
If you are worried about the performance overhead of calling property accessors when they just go directly to the field, don't. Most compilers will inline this sort of thing, giving you effectively the same performance. At least, you're pretty unlikely to need the extra nanoseconds of time you might gain by going directly to the field.
It's better to stick with property accessors because a) you can be very consistent in all of your code which makes it more maintainble and b) you get the benefits pointed out by others here.
Also, I don't usually add the Me. (or this.) keywords, unless there's a scope problem (which I try to avoid by choosing my identifiers carefully). I don't get confused by this because my functions and subs are never so long that I'm not sure whether I am working with a local (stack-based) variable or a member of the class. When they get too long to tell easily, I refactor.
Original poster is EXACTLY correct.
1) Access your private fields directly.
Makes refactoring easier.
2) If accessing accessors use the keyword ME. to improve readability
explicitly listing scope requires less thinking by reader
3) Use the accessor only if it implements vital logic logic that also applies to private access. This way you know that if you are using the accessor it is because there is “something else to it”
this is the only reason to violate rule #1.
4) Avoid using Protected Fields. Derived classes should always use the accessor, never direct access to the field.