What Getters and Setters should and shouldn't do [duplicate] - properties

Possible Duplicate:
Convention question: When do you use a Getter/Setter function rather than using a Property
I've run into a lot of differing opinions on Getters and Setters lately, so I figured I should make it into its own question.
A previous question of mine received an immediate comment (later deleted) that stated setters shouldn't have any side effects, and a SetProperty method would be a better choice.
Indeed, this seems to be Microsoft's opinion as well. However, their properties often raise events, such as Resized when a form's Width or Height property is set. OwenP also states "you shouldn't let a property throw exceptions, properties shouldn't have side effects, order shouldn't matter, and properties should return relatively quickly."
Yet Michael Stum states that exceptions should be thrown while validating data within a setter. If your setter doesn't throw an exception, how could you effectively validate data, as so many of the answers to this question suggest?
What about when you need to raise an event, like nearly all of Microsoft's Controls do? Aren't you then at the mercy of whoever subscribed to your event? If their handler performs a massive amount of work, or throws an error itself, what happens to your setter?
Finally, what about lazy loading within the getter? This too could violate the previous guidelines.
What is acceptable to place in a getter or setter, and what should be kept in only accessor methods?
Edit:
From another article in the MSDN:
The get and set methods are generally no different from other methods. They can perform any program logic, throw exceptions, be overridden, and be declared with any modifiers allowed by the programming language. Note, however, that properties can also be static. If a property is static, there are limitations on what the get and set methods can do. See your programming language reference for details.

My view:
1. If a setter or getter is expected to be expensive, don't make it a property; make it a method.
2. If setting a property triggers events due to changes, this is fine. How else would you allow listeners to be notified of changes? However, you may want to offer a BeginInit/EndInit pair to suppress events until all changes are made. Normally, it is the responsibility of the event handler to return promptly, but if you really can't trust it to do so, then you may wish to signal the event in another thread.
3. If setting a property throws exceptions on invalid values, it's also fine. This is a reasonable way to signal the problem when the value is completely wrong. In other cases, you set a bunch of properties and then call a method that uses them to do something, such as make a connection. This would allow holding off validation and error-handling until the properties are used, so the properties would not need to throw anything.
4. Accessing a property may have side-effects so long as they aren't unexpected and don't matter. This means a JIT instantiation in a getter is fine. Likewise, setting a dirty flag for the instance whenever a change is made is just fine, as is setting a related property, such as a different format for the same value.
5. If it does something as opposed to just accessing a value, it should be a method. Methods are verbs, so creating a connection would be done by the OpenConnection() method, not a Connection property. A Connection property would be used to retrieve the connection in use, or to bind the instance to another connection.
edit - added 5, changed 2 and 3
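As a rough illustration of the points above about validation, change events and a BeginInit/EndInit pair, here is a minimal sketch. It is written in Java rather than C#, so the property becomes a getter/setter pair, and the `Panel` class, its listener list and the method names are all hypothetical, not any real control's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical control whose width setter validates, then notifies listeners.
class Panel {
    private final List<Runnable> resizedListeners = new ArrayList<>();
    private int width;
    private boolean initializing;   // true between beginInit() and endInit()
    private boolean pendingResize;  // a change happened while events were suppressed

    void addResizedListener(Runnable listener) {
        resizedListeners.add(listener);
    }

    int getWidth() {
        return width;
    }

    void setWidth(int newWidth) {
        // Reject a value that is plainly wrong.
        if (newWidth < 0) {
            throw new IllegalArgumentException("width must be >= 0: " + newWidth);
        }
        if (newWidth == width) {
            return; // no change, so no event
        }
        width = newWidth;
        if (initializing) {
            pendingResize = true; // remember the change, notify later
        } else {
            fireResized();
        }
    }

    void beginInit() {
        initializing = true;
    }

    void endInit() {
        initializing = false;
        if (pendingResize) {
            pendingResize = false;
            fireResized(); // one event for the whole batch of changes
        }
    }

    private void fireResized() {
        for (Runnable listener : resizedListeners) {
            listener.run();
        }
    }
}
```

A caller that sets several values between `beginInit()` and `endInit()` gets a single notification at the end. Listeners still run on the caller's thread here, so a slow or throwing handler would still affect the setter.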

I agree with the idea that getters/setters shouldn't have side effects, but I would say that they shouldn't have non-obvious side effects.
As far as throwing exceptions, if you are setting a property to an invalid value (in a very basic sense), then validation exceptions are fine. However, if the setter is running a whole series of complicated business rule validations, or trying to go off and update other objects, or doing any other thing that may cause an exception, then that is bad. But this problem is not really an issue with the exception itself, but rather that the setter is going off and secretly performing a lot of functionality that the caller would not (or should not) expect.
The same with events. If a setter is raising an event saying that "this property changed", then it's OK, because that's an obvious side effect. But if it's firing off some other custom event to cause some hidden chunk of code to execute in another part of a system, it's bad.
This is the same reason that I avoid lazy-loading in getters. Indeed, they can make things easier a lot of the time, but they can make things more confusing some of the time, because there always ends up being convoluted logic around exactly when you want the child objects loaded. It's usually just one more line of code to explicitly load the child objects when you are populating the parent object, and it can avoid a lot of confusion about the object state. But this aspect can get very subjective, and a lot of it depends on the situation.

I've always found the conservative approach to be best, when working in C# anyway. Because properties are syntactically the same as fields, they should work like fields: no exceptions, no validation, no funny business. (Indeed, most of my properties start out as simple fields, and don't become properties until absolutely necessary.) The idea is that if you see something that looks like it's getting or setting a field, then it IS something like getting or setting a field, in terms of functionality (there is no exception thrown), overall efficiency (setting variables doesn't trigger a cascade of delegate calls, for example) and effect on the program's state (setting a variable sets that variable, and doesn't call lots of delegates that could do just about anything).
Sensible things for a property set to do include setting a flag to indicate that there's been a change:
set {
    if (this.value != value) {
        this.changed = true;
        this.value = value;
    }
}
Perhaps actually set a value on another object, e.g.:
set { this.otherObject.value=value; }
Maybe disentangle the input a bit, to simplify the class's internal code:
set {
    this.isValid = (value & Flags.IsValid) != 0;
    this.setting = value & Flags.SettingMask;
}
(Of course, in these latter two cases, the get function might well do the opposite.)
If anything more complicated needs to happen, in particular calling delegates, or performing validation, or throwing exceptions, then my view is that a function is better. (Quite often, my fields turn into properties with get and set, and then end up as a get property and a set function.) Similarly for the getters; if you're returning a reference to something, it's no problem, but if you're creating a whole new large object and filling it in each time the property is read -- not so hot.

Related

Anti-if purposes: How to check nulls?

I recently heard of the anti-if campaign and the efforts of some OOP gurus to write code without ifs, but using polymorphism instead. I just don't get how that should work, I mean, how it should ALWAYS work.
I already use polymorphism (didn't know about anti-if campaign), so, I was curious about "bad" and "dangerous" ifs and I went to see my code (Java/Swift/Objective-C) to see where I use if most, and it looks like these are the cases:
1. Checking for null values. This is the most common situation where I use ifs. If a value could possibly be null, I have to handle it correctly: before using it, I have to check that it's not null. I don't see how polymorphism could compensate for this without ifs.
2. Checking for correct values. I'll give an example here: let's suppose that I have a login/signup application. I want to check that the user actually wrote a password, or that it's longer than 5 characters. How could it possibly be done without ifs/switches? Again, it's not about the type but about the value.
3. (Optional) Checking errors. Optional because it's similar to point 2 about correct values. If I get either a value or an error (passed as params) in a block/closure, how can I handle the error object if I can't check whether it's null or not?
If you know more about this campaign, please answer in the scope of that. I'm asking this to understand their purposes and how effectively what they say could be done.
So, I know not using ifs at all may not be the smartest idea ever, but just asking if and how it could effectively be done in an OOP program.
You'll never completely get rid of ifs, but you can minimize them.
Regarding null value checks, a method that would otherwise return a null value can return a Null Object instead, an object that doesn't represent a real value but implements some of the same behavior as a real value. Its callers can just call methods on the Null Object instead of checking to see if it's null. There is probably still an if inside the method, but there don't need to be any in the callers.
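A minimal sketch of the Null Object idea described above, in Java. The `Customer` interface, the `UnknownCustomer` stand-in and the `CustomerDirectory` lookup are hypothetical names invented for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

interface Customer {
    String displayName();
    double discountRate();
}

class RealCustomer implements Customer {
    private final String name;

    RealCustomer(String name) {
        this.name = name;
    }

    public String displayName() { return name; }
    public double discountRate() { return 0.10; }
}

// Null Object: stands in for "no customer" with harmless defaults.
class UnknownCustomer implements Customer {
    public String displayName() { return "guest"; }
    public double discountRate() { return 0.0; }
}

class CustomerDirectory {
    private static final Customer UNKNOWN = new UnknownCustomer();
    private final Map<String, Customer> byId = new HashMap<>();

    void add(String id, Customer customer) {
        byId.put(id, customer);
    }

    // The only handling of a missing entry lives here; callers never see null.
    Customer find(String id) {
        return byId.getOrDefault(id, UNKNOWN);
    }
}
```

Callers can write `directory.find(id).displayName()` without a null check. The trade-off is that a missing customer is silently treated as a guest, which only makes sense when such a default is actually harmless.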
Regarding correct value checks, the best path is to prevent an object from being instantiated with incorrect attributes. All users of the object can then be confident that they don't have to inspect the object's attributes to use it. Similarly, if an object can have an attribute that is valid or invalid, it can hide that from its users by providing higher-level methods that do the right thing for the current attribute value. Again, there is still an if inside the object, but there don't need to be any in the callers.
Regarding error checks, there are a couple of strategies that are better than returning a possibly null error value that the caller might forget to check. One is raising an exception. Another is to return an object of a type that can hold either a result or an error and provides type-safe, if-free ways to operate on either result when appropriate, like Java's Optional or Haskell's Maybe.
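To make the Optional strategy concrete, here is a small Java sketch. The `Settings` class and the `port` key are assumptions made up for the example. Note that `java.util.Optional` only carries presence or absence, not an error value, so this covers the "might be missing" case rather than full error reporting.

```java
import java.util.Map;
import java.util.Optional;

class Settings {
    private final Map<String, String> values;

    Settings(Map<String, String> values) {
        this.values = values;
    }

    // Absence is encoded in the return type instead of a nullable value.
    Optional<String> get(String key) {
        return Optional.ofNullable(values.get(key));
    }

    // Callers transform and default the value without writing an if.
    int port() {
        return get("port").map(Integer::parseInt).orElse(8080);
    }
}
```

The caller of `get` cannot forget the missing case, because a `String` has to be explicitly extracted from the `Optional` first.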
Note also that case statements are just concatenated ifs (in fact I'd have written the code on the campaign's home page with a switch rather than if/else if), and there are also patterns which replace case with polymorphism, such as the Strategy pattern.
This is a great question and is something that's asked at every OO bootcamp I've been a part of. To begin with, we need to understand why code with a lot of ifs is 'bad' or 'dangerous':
they increase the cyclomatic complexity of the code, making it hard to follow/understand.
they make tests more complicated to write. Ensuring that you test each branch flow in the method under test becomes increasingly more difficult with each conditional and makes test setup cumbersome.
they could be a sign that your code has not been broken into small enough methods
they could be a sign that your methods have not been encapsulated well
However, there is one important thing to remember: ifs cannot (and should not) be eliminated from the code completely. But we can generally abstract them away using techniques like polymorphism, extracting small behaviours, and encapsulating these behaviours into the appropriate classes.
Now that we know some of the reasons why we should avoid ifs, let's tackle your questions:
Checking for null values: The Null Object pattern helps you eliminate null checks from your code (polymorphism FTW). Instead of returning null, you return a Special Case NullObject representation of the expected object. This NullObject has the same interface as your actual object, and you can safely call any of the object's methods without worrying about a null pointer exception being thrown.
Checking for correctness of values: There are a lot of ways to do this. For example, you could create a separate ValidationRule class for each of your validations and then chain calls to them together when you want to validate your object. Notice that the ifs still remain, but they get abstracted away into the individual ValidationRule implementations. Look up the Command pattern and the Chain Of Responsibility pattern for ideas.
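One possible shape for the chained ValidationRule idea, using the password example from the question. This is only a sketch: the `ValidationRule` interface, the two rule classes and `PasswordValidator` are hypothetical, and the conditionals have not disappeared, they have just moved inside the individual rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical rule interface: returns an error message, or empty if the value passes.
interface ValidationRule {
    Optional<String> check(String password);
}

class NotBlankRule implements ValidationRule {
    public Optional<String> check(String password) {
        return password == null || password.trim().isEmpty()
                ? Optional.of("password is required")
                : Optional.empty();
    }
}

class MinLengthRule implements ValidationRule {
    private final int min;

    MinLengthRule(int min) {
        this.min = min;
    }

    public Optional<String> check(String password) {
        return password != null && password.length() >= min
                ? Optional.empty()
                : Optional.of("password must be at least " + min + " characters");
    }
}

class PasswordValidator {
    private final List<ValidationRule> rules;

    PasswordValidator(ValidationRule... rules) {
        this.rules = Arrays.asList(rules);
    }

    // Runs every rule in order and collects the messages of those that fail.
    List<String> validate(String password) {
        List<String> errors = new ArrayList<>();
        for (ValidationRule rule : rules) {
            rule.check(password).ifPresent(errors::add);
        }
        return errors;
    }
}
```

With `new PasswordValidator(new NotBlankRule(), new MinLengthRule(6))`, the calling code only asks whether the returned list is empty; adding a new rule does not touch the caller.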
It's better to use if to check the null instead of raising an exception. Also in common cases checking for null helps us to prevent operations with non-initialized variables.
Use switch together with the SOLID principles. The other things follow from this.

Manipulating Objects in Methods instead of returning new Objects?

Let’s say I have a method that populates a list with some kind of objects. What are the advantages and disadvantages of following method designs?
void populate (ArrayList<String> list, other parameters ...)
ArrayList<String> populate(other parameters ...)
Which one I should prefer?
This looks like a general issue about method design but I couldn't find a satisfying answer on google, probably for not using the right keywords.
The second one seems more functional and thread safe to me. I'd prefer it in most cases. (Like every rule, there are exceptions.)
The owner of the populate method could return an immutable List (why ArrayList?).
It's also thread safe if there is no state modified in the populate method. Only passed in parameters are used, and these can also be immutable.
Other than what @duffymo mentioned, the second one is easier to understand, and thus to use: it is obvious what its input and output are.
Advantages to the in-out parameter:
You don't have to create as many objects. In languages like C or C++, where allocation and deallocation can be expensive, that can be a plus. In Java/C#, not so much -- GC makes allocation cheap and deallocation all but invisible, so creating objects isn't as big a deal. (You still shouldn't create them willy-nilly, but if you need one, the overhead isn't as bad as in some manual-allocation languages.)
You get to specify the type of the list. Potential plus if you need to pass that list to some other code you don't control later.
Disadvantages:
Readability issues.
In almost all languages that support function arguments, the first case is assumed to mean "do something with the entries in this list". Modifying args violates the Principle of Least Astonishment. The second is assumed to mean "give me a list of stuff", which is what you're after.
Every time you say "ArrayList", or even "List", you take away a bit of flexibility. You add some overhead to your API. What if I don't want to create an ArrayList before calling your method? I shouldn't have to, if the method's whole purpose in life is to return me some entries. That's the API's job.
Encapsulation issues:
The method being passed a list to fill can't assume anything about that list (even that it's a list at all; it could be null).
The method passing the list can't guarantee anything about what the method does with it. If it's working correctly, sure, the API docs can say "this method won't destroy existing entries". But considering the chance of bugs, that may not be worth trusting. At least if the method returns its own list, the caller doesn't have to worry about what was in it before. And it doesn't have to worry about a bug from a thousand miles away corrupting data it should never have affected.
Thread safety issues.
The list could be locked by another thread, meaning if we try and lock on it now it could potentially lock up the app.
Or, if not locked, it could still be modified by another thread, in which case we're no less screwed. Unless you're going to write extra code to handle concurrent-modification exceptions everywhere.
Returning a new list means every call to the method can have its own list. No thread can mess with another thread's return value, unless the class is very badly designed.
Side point: Being able to specify the type of the list often leads to dependencies on the type of the list. Notice how you're passing ArrayLists around everywhere. You're painting yourself into corners by saying "This is an ArrayList" when you don't need to, but when you're passing it to a dozen methods, that's a dozen methods you'll have to change. (Not entirely related, but only slightly tangential. You could change the types to List rather than ArrayList and get rid of this. But the more you're passing that list around, the more places you'll need to change.)
Short version: Unless you have a damn good reason, use the first syntax only if you're using the existing contents of the list in your method, i.e. if you're modifying it, or doing something with the existing values. If you intend to return a list of entries, then return a List of entries.
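For comparison, here are the two designs from the question side by side. This is a sketch only: the `NameSource` class, the `count` parameter and the generated item names are placeholders standing in for the question's "other parameters".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NameSource {
    // Design 1: in-out parameter. The caller owns the list; existing entries are kept.
    static void populate(List<String> target, int count) {
        for (int i = 0; i < count; i++) {
            target.add("item-" + i);
        }
    }

    // Design 2: return a fresh, read-only list, typed as List rather than ArrayList.
    static List<String> populate(int count) {
        List<String> result = new ArrayList<>();
        populate(result, count);
        return Collections.unmodifiableList(result);
    }
}
```

With the second form, the signature alone says what goes in and what comes out, and no other code holds a reference to the list while it is being filled.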
The second method is the preferred way for many reasons.
Primarily because the function signature is clearer and shows what its intentions are.
It is actually recommended that you NEVER change the value of a parameter that is passed in to a function unless you explicitly mark it as an "out" parameter.
It will also be easier to use in expressions.
And it will be easier to change in the future, including taking it to a more functional approach (for threading, etc.) if you would like to.

In what cases should public fields be used instead of properties? [duplicate]

Possible Duplicate:
Public Data members vs Getters, Setters
In what cases should public fields be used, instead of properties or getter and setter methods (where there is no support for properties)? Where exactly is their use recommended, and why, or, if it is not, why are they still allowed as a language feature? After all, they break the Object-Oriented principle of encapsulation where getters and setters are allowed and encouraged.
If you have a constant that needs to be public, you might as well make it a public field instead of creating a getter property for it.
Apart from that, I don't see a need, as far as good OOP principles are concerned.
They are there and allowed because sometimes you need the flexibility.
That's hard to tell, but in my opinion public fields are only valid when using structs.
struct Simple
{
    public int Position;
    public bool Exists;
    public double LastValue;
};
But different people have different thoughts about this:
http://kristofverbiest.blogspot.com/2007/02/public-fields-and-properties-are-not.html
http://blogs.msdn.com/b/ericgu/archive/2007/02/01/properties-vs-public-fields-redux.aspx
http://www.markhneedham.com/blog/2009/02/04/c-public-fields-vs-automatic-properties/
If your compiler does not optimize getter and setter invocations, the access to your properties might be more expensive than reading and writing fields (call stack). That might be relevant if you perform many, many invocations.
But, to be honest, I know no language where this is true. At least in both .NET and Java this is optimized well.
From a design point of view I know no case where using fields is recommended...
Cheers
Matthias
Let's first look at the question of why we need accessors (getters/setters). You need them to be able to override the behaviour when assigning a new value or reading a value. You might want to add caching or return a calculated value instead of a stored one.
Your question can now be formed as do I always want this behaviour? I can think of cases where this is not useful at all: structures (what were structs in C). Passing a parameter object or a class wrapping multiple values to be inserted into a Collection are cases where one actually does not need accessors: The object is merely a container for variables.
There is one single reason(*) to use get instead of a public field: lazy evaluation. I.e. the value you want may be stored in a database, or may take long to compute, and you don't want your program to initialize it at startup, but only when needed.
There is one single reason(*) to use set instead of a public field: modification of other fields. I.e. you change the value of other fields when the value of the target field changes.
Forcing to use get and set on every field is in contradiction with the YAGNI principle.
If you want to expose the value of a field from an object, then expose it! It is completely pointless to create an object with four independent fields and mandate that all of them use get/set or property access.
*: Other reasons such as possible data type change are pointless. In fact, wherever you use a = o.get_value() instead of a = o.value, if you change the type returned by get_value() you have to change at every use, just as if you would have changed the type of value.
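The two reasons given in this answer, lazy evaluation in a getter and a setter that updates other fields, might look like this in Java. The `Report` class, its `Supplier`-based loader and the width fields are assumptions made up for the sketch, and the lazy getter is not thread-safe as written.

```java
import java.util.function.Supplier;

class Report {
    private final Supplier<String> loader; // hypothetical expensive source, e.g. a database read
    private String body;                   // stays null until first requested
    private int widthCm;
    private double widthInches;

    Report(Supplier<String> loader) {
        this.loader = loader;
    }

    // Lazy evaluation: the cost is paid on the first read, not at construction.
    String getBody() {
        if (body == null) {
            body = loader.get();
        }
        return body;
    }

    // Setting one field keeps a dependent field in sync.
    void setWidthCm(int widthCm) {
        this.widthCm = widthCm;
        this.widthInches = widthCm / 2.54;
    }

    int getWidthCm() {
        return widthCm;
    }

    double getWidthInches() {
        return widthInches;
    }
}
```

Neither behaviour could be added to a plain public field without changing every caller.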
The main reason is nothing to do with OOP encapsulation (though people often say it is), and everything to do with versioning.
Indeed from the OOP position one could argue that fields are better than "blind" properties, as a lack of encapsulation is clearer than something that pretends to encapsulation and then blows it away. If encapsulation is important, then it should be good to see when it isn't there.
A property called Foo will not be treated the same from the outside as a public field called Foo. In some languages this is explicit (the language doesn't directly support properties, so you've got a getFoo and a setFoo) and in some it is implicit (C# and VB.NET directly support properties, but they are not binary-compatible with fields and code compiled to use a field will break if it's changed to a property, and vice-versa).
If your Foo just does a "blind" get and set of an underlying field, then there is currently no encapsulation advantage to this over exposing the field.
However, if there is a later requirement to take advantage of encapsulation to prevent invalid values (you should always prevent invalid values, but maybe you didn't realise some were invalid when you first wrote the class, or maybe "valid" has changed with a scope change), to wrap memoised evaluation, to trigger other changes in the object, to trigger an on-change event, to prevent expensive needless equivalent sets, and so on, then you can't make that change without breaking running code.
If the class is internal to the component in question, this isn't a concern, and I'd say use fields if fields read sensibly, under the general YAGNI principle. However, YAGNI doesn't play quite so well across component boundaries (if I need my component to work today, I am almost certainly going to need it to work tomorrow, after you've changed the component that mine depends on), so it can make sense to pre-emptively use properties.

Passing object references needlessly through a middleman

I often find myself needing reference to an object that is several objects away, or so it seems. The options I see are passing a reference through a middle-man or just making something available statically. I understand the danger of global scope, but passing a reference through an object that does nothing with it feels ridiculous. I'm okay with a little bit passing around, I suppose. I suspect there's a line to be drawn somewhere.
Does anyone have insight on where to draw this line?
Or a good way to deal with the problem of distributing references amongst dependent objects?
Use the Law of Demeter (with moderation and good taste, not dogmatically). If you're coding a.b.c.d.e, something IS wrong -- you've nailed forevermore the implementation of a to have a b which has a c which... EEP!-) One or at the most two dots is the maximum you should be using. But the alternative is NOT to plump things into globals (and ensure thread-unsafe, buggy, hard-to-maintain code!), it is to have each object "surface" those characteristics it is designed to maintain as part of its interface to clients going forward, instead of just letting poor clients go through such unending chains of nested refs!
This smells of an abstraction that may need some improvement. You seem to be violating the Law of Demeter.
In some cases a global isn't too bad.
Consider, you're probably programming against an operating system's API. That's full of globals, you can probably access a file or the registry, write to the console. Look up a window handle. You can do loads of stuff to access state that is global across the whole computer, or even across the internet... and you don't have to pass a single reference to your class to access it. All this stuff is global if you access the OS's API.
So, when you consider the number of global things that often exist, a global in your own program probably isn't as bad as many people try and make out and scream about.
However, if you want to have very nice OO code that is all unit testable, I suppose you should be writing wrapper classes around any access to globals, whether they come from the OS or are declared yourself, to encapsulate them. This means your class that uses this global state can get references to the wrappers, and they could be replaced with fakes.
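As a small illustration of wrapping a global so it can be faked, here is a Java sketch around the system clock. The `TimeSource` interface and the `Session` class are hypothetical; only `System.currentTimeMillis()` is a real call.

```java
// Hypothetical wrapper around a piece of global state (the system clock).
interface TimeSource {
    long nowMillis();
}

// Production implementation: the only place that touches the global.
class SystemTimeSource implements TimeSource {
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

class Session {
    private final TimeSource time;
    private final long startedAt;
    private final long timeoutMillis;

    // The wrapper is passed in, so a test can hand over a fake clock.
    Session(TimeSource time, long timeoutMillis) {
        this.time = time;
        this.timeoutMillis = timeoutMillis;
        this.startedAt = time.nowMillis();
    }

    boolean isExpired() {
        return time.nowMillis() - startedAt >= timeoutMillis;
    }
}
```

In a test, a lambda such as `() -> fakeNow[0]` can stand in for the clock, so expiry can be checked without waiting for real time to pass.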
Hmm, anyway. I'm not quite sure what advice I'm trying to give here, other than say, structuring code is all a balance! And, how to do it for your particular problem depends on your preferences, preferences of people who will use the code, how you're feeling on the day on the academic to pragmatic scale, how big the code base is, how safety critical the system is and how far off the deadline for completion is.
I believe your question is revealing something about your classes. Maybe the responsibilities could be improved ? Maybe moving some code would solve problems ?
Tell, don't ask.
That's how it was explained to me. There is a natural tendency to call classes to obtain some data. Taken too far, asking too much, typically leads to heavy "getter sequences". But there is another way. I must admit it is not easy to find, but improves gradually in a specific code and in the coder's habits.
Class A wants to perform a calculation, and asks for B's data. Sometimes it is more appropriate for A to tell B to do the job, possibly passing some parameters. This could replace B's "getName()", used by A to check the validity of the name, with an "isValid()" method on B.
"Asking" has been replaced by "telling" (calling a method that executes the computation).
For me, this is the question I ask myself when I find too many getter calls. Gradually, the methods find their place in the correct object, and everything gets a bit simpler: I have fewer getters and fewer calls to them. I have less code, and it carries more meaning and aligns better with the functional requirement.
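The getName()/isValid() example above could be sketched like this in Java. The `Account` and `Registration` classes and the "not blank" rule are invented for the illustration.

```java
class Account {
    private final String name;

    Account(String name) {
        this.name = name;
    }

    // "Ask" style: exposes the data, so every caller re-implements the rule.
    String getName() {
        return name;
    }

    // "Tell" style: the rule lives next to the data it inspects.
    boolean isValid() {
        return name != null && !name.trim().isEmpty();
    }
}

class Registration {
    // The caller no longer pulls the name out to inspect it.
    static String register(Account account) {
        return account.isValid() ? "registered" : "rejected";
    }
}
```

If the definition of a valid name changes later, only `Account` has to change.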
Move the data around
There are other cases where I move some data. For example, if a field moves two objects up, the length of the "getter chain" is reduced by two.
I believe nobody can find the correct model at first.
I first think about it (using hand-written diagrams is quick and a big help), then code it, then think again facing the real thing... Then I code the rest, and any smells I feel in the code, I think again...
Split and merge objects
If a method on A needs data from C, with B as a middleman, I can check whether A and C have something in common. Possibly, A or a part of A could become C (possible splitting of A, merging of A and C) ...
However, there are cases where I keep the getters of course.
But it's less likely a long chain will be created.
A long chain will probably get broken by one of the techniques above.
I have three patterns for this:
Pass the necessary reference to the object's constructor -- the reference can then be stored as a data member of the object, and doesn't need to be passed again; this implies that the object's factory has the necessary reference. For example, when I'm creating a DOM, I pass the element name to the DOM node when I construct the DOM node.
Let things remember their parent, and get references to properties via their parent; this implies that the parent or ancestor has the necessary property. For example, when I'm creating a DOM, there are various things which are stored as properties of the top-level DomDocument ancestor, and its child nodes can access those properties via the reference which each one has to its parent.
Put all the different things which are passed around as references into a single class, and then pass around just that one class instance as the only thing that's passed around. For example, there are many properties required to render a DOM (e.g. the GDI graphics handle, the viewport coordinates, callback events, etc.) ... I put all of these things into a single 'Context' instance which is passed as the only parameter to the methods of the DOM nodes to be rendered, and each method can get whichever properties it needs out of that context parameter.

Data verifications in Getter/Setter or elsewhere?

I'm wondering if it's a good idea to make verifications in getters and setters, or elsewhere in the code.
This might surprise you, but when it comes to optimizations and speeding up the code, I think you should not make verifications in getters and setters, but in the code where you're updating your files or database. Am I wrong?
Well, one of the reasons why classes usually contain private members with public getters/setters is exactly because they can verify data.
If you have a Number that can be between 1 and 100, I would definitely put something in the setter that validates that and then maybe throw an exception that is caught by the calling code. The reason is simple: if you don't do it in the setter, you have to remember that 1 to 100 limitation every time you set it, which leads to duplicated code or, when you forget it, to an invalid state.
As for performance, I'm with Knuth here:
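A minimal Java sketch of the 1 to 100 setter described above. The `Gauge` class and its `level` field are hypothetical names.

```java
class Gauge {
    private int level = 1; // always within 1..100

    int getLevel() {
        return level;
    }

    // The range rule is enforced in exactly one place.
    void setLevel(int level) {
        if (level < 1 || level > 100) {
            throw new IllegalArgumentException(
                    "level must be between 1 and 100, got " + level);
        }
        this.level = level;
    }
}
```

A rejected call leaves the previous value untouched, so the object never holds an out-of-range level.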
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
@Terrapin, re:
"If all you have is a bunch of [simple public set/get] properties ... they might as well be fields"
Properties have other advantages over fields. They're a more explicit contract, they're serialized, they can be debugged later, they're a nice place for extension through inheritance. The clunkier syntax is an accidental complexity -- .NET 3.5 for example overcomes this.
A common (and flawed) practice is to start with public fields, and turn them into properties later, on an 'as needed' basis. This breaks your contract with anyone who consumes your class, so it's best to start with properties.
It depends.
Generally, code should fail fast. If the value can be set at multiple points in the code and you validate only after retrieving the value, the bug appears to be in the code that does the update. If the setters validate the input, you know which code is trying to set invalid values.
From the perspective of having the most maintainable code, I think you should do as much validation as you can in the setter of a property. This way you won't be caching or otherwise dealing with invalid data.
After all, this is what properties are meant for. If all you have is a bunch of properties like...
public string Name
{
    get
    {
        return _name;
    }
    set
    {
        _name = value;
    }
}
... they might as well be fields.
Validation should be captured separately from getters or setters in a validation method. That way if the validation needs to be reused across multiple components, it is available.
When the setter is called, such a validation service should be utilized to sanitize input into the object. That way you know all information stored in an object is valid at all times.
You don't need any kind of validation for the getter, because information on the object is already trusted to be valid.
Don't save your validation until a database update!! It is better to fail fast.
I like to implement IDataErrorInfo and put my validation logic in its Error and this[columnName] properties. That way if you want to check programmatically whether there's an error you can simply test either of those properties in code, or you can hand the validation off to the data binding in Web Forms, Windows Forms or WPF.
WPF's "ValidatesOnDataError" Binding property makes this particularly easy.
I try to never let my objects enter an invalid state, so setters definitely would have validation as well as any methods that change state. This way, I never have to worry that the object I'm dealing with is invalid. If you keep your methods as validation boundaries, then you never have to worry about validation frameworks and IsValid() method calls sprinkled all over the place.
You might wanna check out Domain Driven Design, by Eric Evans. DDD has this notion of a Specification:
"... explicit predicate-like VALUE OBJECTS for specialized purposes. A SPECIFICATION is a predicate that determines if an object does or does not satisfy some criteria."
I think failing fast is one thing, the other is where to keep the logic for validation. The domain is the right place to keep the logic and I think a Specification Object or a validate method on your Domain objects would be a good place.
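A very small Java sketch of what a Specification object could look like. This is not code from the DDD book: the `Specification` interface, the `Order` class and the named specifications are assumptions chosen for the example.

```java
// Hypothetical domain object.
class Order {
    final double total;
    final boolean paid;

    Order(double total, boolean paid) {
        this.total = total;
        this.paid = paid;
    }
}

// A specification is a named predicate over a domain object.
interface Specification<T> {
    boolean isSatisfiedBy(T candidate);

    // Combines two specifications into one that requires both.
    default Specification<T> and(Specification<T> other) {
        return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
    }
}

class OrderSpecs {
    static final Specification<Order> PAID = order -> order.paid;
    static final Specification<Order> NON_EMPTY = order -> order.total > 0;
    static final Specification<Order> SHIPPABLE = PAID.and(NON_EMPTY);
}
```

The validation rule then has a name in the domain (`SHIPPABLE`) and can be reused by a setter, a service or a test alike.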