OOP confusion in classes

I come from a C# background and have been programming for quite some time, but only recently did I start thinking about how I program. Apparently, my OOP is very bad.
I have a few questions that maybe someone can help me with. They are basic, but I want to confirm.
1- In C#, we can declare class properties like
private int _test;
and their setters and getters like
public int Test {get; set;}
Now, let's say I have to use this property inside the class. Which one should I use: the private one or the public one? Or are they both the same?
2- Let's say that I have to implement a class that does XML parsing. There can be different things that we use as input for the class, like a file path. Should I make this a class property, or should I just pass it as an argument to a public method of the class? Which approach is better? Consider the following.
I can create a class property and use like this
public string FilePath {get; set;}
public int Parse()
{
var document = XDocument.Load(this.FilePath);
.........//Remaining code
}
Or
I can pass the filepath as a parameter
public int Parse(string filePath)
On what basis should I decide whether to make something a property or pass it as an argument?
I know how to solve these problems, but I want to know the correct approach. If you can recommend some video lectures or books, that would be nice too.

Fields vs Properties
Seems like you've got a few terms confused.
private int _test;
This is an instance field (also called member).
This field will allow direct access to the value from inside the class.
Note that I said "inside the class". Because it is private, it is not accessible from outside the class. This is important to preserve encapsulation, a cornerstone of OOP. Encapsulation basically tells us that instance members can't be accessed directly outside the class.
For this reason we make the member private and provide methods that "set" and "get" the variable (at least, this is the way in Java). These methods are exposed to the outside world and force whoever is using your class to go through your methods instead of accessing your variable directly.
It should be noted that you also want to use your methods/properties when you're inside the current class. Each time you don't, you risk bypassing validation rules. Play it safe and always use the methods instead of the backing field.
The net result is that you can force your logic to be applied to changes (set) or retrieval (get). The best example is validation: by forcing people to use your method, your validation logic will be applied before (possibly) setting a field to a new value.
public int Test {get; set;}
This is an automatically implemented property. A property is, roughly speaking, an easier way of using get/set methods.
Behind the scenes, your code translates to
private int _somevariableyoudontknow;
public void setTest(int t){
this._somevariableyoudontknow = t;
}
public int getTest(){
return this._somevariableyoudontknow;
}
So it is really very much like getters and setters. What's so nice about properties is that you can define in one line what would otherwise take 7 lines, while still keeping all the possibilities of explicit getters and setters.
Where is my validation logic, you ask?
In order to add validation logic, you have to create a custom implemented property.
The syntax looks like this:
private int _iChoseThisName;
public int Test {
get {
return _iChoseThisName;
}
set {
if (value <= 5) { throw new ArgumentException("Value must be over 5!"); }
_iChoseThisName = value;
}
}
Basically all we did was provide an implementation for your get and set. Notice the value keyword!
Properties can be used as such:
var result = SomeClass.Test; // returns the value from the 'Test' property
SomeClass.Test = 10; // sets the value of the 'Test' property
Last small note: just because you have an automatically implemented property named Test does not mean the backing variable is named test or _test. The compiler generates a variable name for the backing field that is guaranteed not to clash with any name you declare yourself.
XML Parsing
If you want your second question answered in detail, you're going to have to show what your current architecture looks like.
It shouldn't be necessary though: it makes most sense to pass it as a parameter with your constructor. You should just create a new XmlParser (random name) object for each file you want to parse. Once you're parsing, you don't want to change the file location.
If you do want this: create a method that does the parsing and let it take the filename as a parameter, that way you still keep it in one call.
You don't want to create a property for the simple reason that you might forget to both set the property and call the parse method.
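To make the two shapes concrete, here is a minimal sketch. It is written in Java rather than C# purely for illustration, and every name in it (FileBoundXmlParser, ReusableXmlParser, XmlSketch, countRootChildNodes) is made up for this example; the "remaining code" from the question is replaced by a stand-in that just counts the root element's child nodes.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Shape 1: the path is fixed at construction, so parse() can never run without it.
final class FileBoundXmlParser {
    private final String filePath;

    FileBoundXmlParser(String filePath) {
        if (filePath == null || filePath.isEmpty()) {
            throw new IllegalArgumentException("filePath is required");
        }
        this.filePath = filePath;
    }

    String getFilePath() {
        return filePath; // read-only: there is deliberately no setter
    }

    int parse() {
        return XmlSketch.countRootChildNodes(filePath);
    }
}

// Shape 2: one parser object, and the path travels with each call.
final class ReusableXmlParser {
    int parse(String filePath) {
        return XmlSketch.countRootChildNodes(filePath);
    }
}

// Stand-in for the real parsing work (an assumption of this sketch):
// counts the root element's child nodes, whitespace text nodes included.
final class XmlSketch {
    static int countRootChildNodes(String filePath) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new File(filePath));
            return document.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            // Wrap checked parser/IO exceptions so callers get one failure type.
            throw new IllegalStateException("Could not parse " + filePath, e);
        }
    }
}
```

In both shapes there is no way to call parse without having supplied a path, which is exactly the failure mode the settable property leaves open.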

There are really two questions wrapped in your first question.
1) Should I use getters and setters (accessors and mutators) to access a member variable?
The answer depends on whether the implementation of the variable is likely to change. In some cases, the interface type (the type returned by the getter, and set by the setter) needs to be kept consistent but the underlying mechanism for storing the data may change. For instance, the type of the property may be a String but in fact the data is stored in a portion of a much larger String and the getter extracts that portion of the String and returns it to the user.
2) What visibility should I give a property?
Visibility is entirely dependent on use. If the property needs to be accessible to other classes or to classes that inherit from the base class then the property needs to be public or protected.
I never expose implementation to external concerns. That is to say, I always put a getter and setter on public and protected data, because it helps me keep the interface the same even if the underlying implementation changes. Another common reason is that I want a chance to intercept an outside user's attempt to modify a property, maybe to prevent it, but more likely to keep the object's state good or safe. This is especially important for cached values that may be exposed as properties. Think of a property that sums the contents of an array of values. You don't want to recalculate the value every time it is referenced, so you need to be certain that the setter for the elements in the array tells the object that the sum needs to be recalculated. This way you keep the calculation to a minimum.
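The cached-sum idea can be sketched roughly like this. It is a minimal Java illustration under my own assumptions: the class name SummedValues, the stale flag, and the recalculation counter (which exists only so the caching is observable) are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical class: caches the sum of its values and lets the setter
// mark the cache as stale instead of recalculating on every read.
final class SummedValues {
    private final int[] values;
    private long cachedSum;
    private boolean sumIsStale = true;
    private int recalculations = 0; // demo-only counter

    SummedValues(int... initial) {
        // Copy the input so outside code cannot change elements behind our back.
        this.values = Arrays.copyOf(initial, initial.length);
    }

    void setValue(int index, int newValue) {
        values[index] = newValue;
        sumIsStale = true; // the setter is the one place that knows the sum changed
    }

    long getSum() {
        if (sumIsStale) {
            long total = 0;
            for (int v : values) {
                total += v;
            }
            cachedSum = total;
            sumIsStale = false;
            recalculations++;
        }
        return cachedSum;
    }

    int getRecalculations() {
        return recalculations;
    }
}
```

If the array were a public field, nothing would tell the object that an element changed, and the cached sum could silently go out of date.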
I think the second question is: When do I make a value that I could pass in to a constructor public?
It depends on what the value is used for. I generally think that there are two distinct types of variables passed in to constructors: those that assist in the creation of the object (your XML file path is a good example of this) and those that are passed in because the object is going to be responsible for their management. An example of the latter is a collection, which you can often initialize with an array.
I follow these guidelines.
If the value passed in can be changed without damaging the state of the object then it can be made into a property and publicly visible.
If changing the value passed in will damage the state of the object or redefine its identity, then it should be left to the constructor to initialize the state, and the value should not be accessible again through property methods.
A lot of these terms are confusing because of the many different paradigms and languages in OO Design. The best place to learn about good practices in OO Design is to start with a good book on Patterns. While the so-called Gang of Four Book http://en.wikipedia.org/wiki/Design_Patterns was the standard for many years, there have since been many better books written.
Here are a couple resources on Design Patterns:
http://sourcemaking.com/design_patterns
http://www.oodesign.com/
And a couple on C# specific.
http://msdn.microsoft.com/en-us/magazine/cc301852.aspx
http://www.codeproject.com/Articles/572738/Building-an-application-using-design-patterns-and

I can possibly answer your first question. You asked "I have to use this property inside the class." That sounds to me like you need to use your private variable. The public method which you provided I believe will only do two things: Allow a client to set one of your private variables, or to allow a client to "see" (get) the private variable. But if you want to "use this property inside the class", the private variable is the one that should be your focus while working with the data within the class. Happy holidays :)

The following is my personal opinion based on my personal experience in various programming languages. I do not think that best practices are necessarily static for all projects.
When to use getters, when to use private instance variables directly
It depends.
You probably know this, but let's talk about why we usually want getters and setters instead of public instance variables: they allow us to acquire the full power of OOP.
While an instance variable is just some dumb piece of memory (the amount of dumbness surely depends on the language you're working in), a getter is not bound to a specific memory location. The getter allows children in the OOP hierarchy to override the behaviour of the "instance variable" without being bound to it. Thus, if you have an interface with various implementations, some may use an instance variable, while others may use IO to fetch data from the network, calculate it from other values, etc.
Thus, getters do not necessarily return the instance variable (in some languages this is more complicated, such as C++ with the virtual keyword, but I'll try to be language-independent here).
Why is that related to the inner class behaviour? If you have a class with a non-final getter, the getter and the inner variable may return different values. Thus, if you need to be sure it is the inner value, use it directly. If you, however, rely on the "real" value, always use the getter.
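A small Java sketch of that difference, with hypothetical class names (Sensor, CalibratedSensor) invented for this example: the subclass overrides the getter, so code inside the base class sees different values depending on whether it goes through the getter or reads the field.

```java
// Hypothetical base class with an overridable (non-final) getter.
class Sensor {
    private int reading = 20;

    int getReading() {
        return reading;
    }

    // Goes through the getter, so a subclass override is respected.
    String describeViaGetter() {
        return "reading=" + getReading();
    }

    // Reads the field, so it always reports this class's own stored value.
    String describeViaField() {
        return "reading=" + reading;
    }
}

// Hypothetical subclass that changes what the "real" value is.
class CalibratedSensor extends Sensor {
    @Override
    int getReading() {
        return super.getReading() + 2; // apply a fixed calibration offset
    }
}
```

On a CalibratedSensor, describeViaGetter() reports the calibrated value while describeViaField() reports the raw one; which of the two is correct depends on what the method is meant to describe.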
If the getter is final or the language enforces the getter to be equal (and this case is way more common than the first case), I personally prefer accessing the private field directly; this makes code easy to read (imho) and does not yield any performance penalty (does not apply to all languages).
When to use parameters, when to use instance variables/properties
Use parameters wherever possible.
Never use instance variables or properties as parameters. A method should be as self-contained as possible. In the example you stated, the parameterized version is way better imo.
Instance variables (with getters or not) are properties of the instance. As they are part of the instance, they should be logically bound to it.
Have a look at your example. If you hear the word XMLParser, what do you think about it? Do you think that a parser can only parse a single file it is bound to? Or do you think that a parser can parse any file? I tend towards the latter (also, using an instance variable would kill thread-safety).
Another example: You wish to create an XMLArchiver, taking multiple xml documents into a single archive. When implementing, you'd have the filename as a parameter of the constructor maybe opening an outputstream towards the file and storing a reference to it as an instance variable. Then, you'd call archiver.add(stuff-to-add) multiple times. As you see, the file (thus, the filename) is naturally bound to the XMLArchiver instance, not to the method adding files to it.

Related

Is there a good reason to use a public property / field?

One of the important parts of object-oriented programming is encapsulation, but public properties / fields tend to break this encapsulation. Under what circumstances does a public property or field actually make sense?
Note: I only use the term 'property' or 'field' because terminology varies between languages. In general, I mean a variable that belongs to an object that can be accessed and set from outside the object.
Yes, there are sometimes good reasons. Information hiding is usually desirable. But there are some occasional exceptions.
For example, public fields are reasonable and useful for:
A C++ pimpl - a struct/class holding the private implementation of another class. Its fields may be declared public syntactically, but are typically accessible only within one source file, by the class holding the pimpl.
Constant fields. For example, Joshua Bloch writes in Effective Java: "Classes are permitted to expose constants via public static final fields."
Structs used for communication between C and C++.
Types which represent only data, whose representation is unlikely to change. For example, javax.vecmath.Point3d, which represents an {x,y,z} coordinate.
Short answer: never.
Actually, if you use an object for simply storing data, but the object itself does no logic, and you never mean to derive from this object, then it is OK to have public fields. Sometimes I do things like this in C++:
struct A {
int a;
float b;
string c;
A():a(0),b(0.0) {}
A(int a_, float b_, string c_):a(a_),b(b_),c(c_) {}
};
But other than having initializing constructors, it is nothing more than a C struct. If your class does anything more than this, then you should never use public (or even protected) fields.
As for properties, it depends on what language you use. For example, in Delphi, the main purpose of properties is to provide public interfaces to fields, and can provide getters/setters to them, while still working syntactically like a variable.
Is there a good reason to use a public property / field?
No.
Public members are always dangerous. You may not need any control now, but once you expose them, you lose any possibility of having control later. If you have getters/setters right away, you have room for adding control later.
Ps:
Depending on the language you use, properties and fields may mean different things.
C# properties are actually a way to achieve encapsulation without being very verbose.
There is a bad reason: by directly accessing the datum you avoid pushing a method call onto the stack, for what that's worth.
In many languages this is also achievable by inlining the accessor method/s.
If the purpose of the object is to hold data in its fields, then yes. It would also make sense to have methods on the object which are (a) purely functional (in that they do not change the state of the object, or anything else); or (b) which manipulate the state of the object, and the point is that they manipulate the state in a particular way.
The kind of things that you should avoid are (c) methods that do things to other objects based on the state of the object (and certainly if there are assumptions about what is a "valid" state).

Should encapsulated objects be public or private?

I'm a little unclear as to how far to take the idea of making all members within a class private and providing public methods to handle mutations. Primitive types are not the issue; it's encapsulated objects that I am unclear about. The benefit of making object members private is the ability to hide methods that do not apply to the context of the class being built. The downside is that you have to provide public methods to pass parameters to the underlying object (more methods, more work). On the other hand, if you want to have all methods and properties of the underlying object exposed, couldn't you just make the object public? What are the dangers in having objects exposed this way?
For example, I would find it useful to have everything from a vector, or ArrayList, exposed. The only downside I can think of is that public members could potentially be assigned a value of the wrong type via implicit casting (or something to that effect). Would a volatile designation reduce the potential for problems?
Just a side note: I understand that true encapsulation implies that members are private.
What are the dangers in having objects exposed this way?
Changing the type of those objects would require changing the interface to the class. With private objects + public getters/setters, you'd only have to modify the code in the getters and setters, assuming you want to keep the things being returned the same.
Note that this is why properties are useful in languages such as Python, which technically doesn't have private class members, only obscured ones at most.
The problem with making instance variables public is that you can never change your mind later, and make them private, without breaking existing code that relies on directly public access to those instance vars. Some examples:
You decide to later make your class thread-safe by synchronizing all access to instance vars, or maybe by using a ThreadLocal to create a new copy of the value for each thread. Can't do it if any thread can directly access the variables.
Using your example of a vector or array list - at some point, you realize that there is a security flaw in your code because those classes are mutable, so somebody else can replace the contents of the list. If this were only available via an accessor method, you could easily solve the problem by making an immutable copy of the list upon request, but you can't do that with a public variable.
You realize later that one of your instance vars is redundant and can be derived based on other variables. Once again, easy if you're using accessors, impossible with public variables.
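The list example above can be sketched as follows. This is a minimal Java illustration; the Roster class and its method names are hypothetical. The accessor hands out a read-only snapshot, which is only possible because callers never had direct access to the field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class that keeps its mutable list private.
final class Roster {
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        names.add(name);
    }

    // Returns a read-only copy: callers cannot modify it, and later
    // changes to the roster do not show up in a copy already handed out.
    List<String> getNames() {
        return Collections.unmodifiableList(new ArrayList<>(names));
    }
}
```

Calling add on the returned list throws UnsupportedOperationException, so the internal list can only change through Roster's own methods.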
I think that it boils down to a practical point - if you know that you're the only one who will be using this code, and it pains you to write accessors (every IDE will do it for you automatically), and you don't mind changing your own code later if you decide to break the API, then go for it. But if other people will be using your class, or if you would like to make it easier to refactor later for your own use, stick with accessors.
Object oriented design is just a guideline. Think about it from the perspective of the person who will be using your class. Balance OOD with making it intuitive and easy to use.
You could run into issues depending on the language you are using and how it treats return statements or assignment operators. In some cases it may give you a reference, or values in other cases.
For example, say you have a PrimeCalculator class that figures out prime numbers, then you have another class that does something with those prime numbers.
public PrimeCalculator calculatorObject = new PrimeCalculator();
Vector<Integer> primeNumbers = calculatorObject.PrimeNumbersVector;
/* do something complicated here */
primeNumbers.clear(); // free up some memory
When you use this stuff later, possibly in another class, you don't want the overhead of calculating the numbers again so you use the same calculatorObject.
Vector<Integer> primes = calculatorObject.PrimeNumbersVector;
int tenthPrime = primes.elementAt(9);
It may not exactly be clear at this point whether primes and primeNumbers reference the same Vector. If they do, trying to get the tenth prime from primes would throw an error.
You can do it this way if you're careful and understand what exactly is happening in your situation, but you have a smaller margin of error using functions to return a value rather than assigning the variable directly.
Well you can check the post :
first this
then this
This should solve your confusion. It solved mine! Thanks to Nicol Bolas.
Also read the comments below the accepted answer (also notice the link given in the second last comment by me ( in the first post) )
Also visit the wikipedia post

Explain to me what is a setter and getter

What are setters and getters? Why do I need them? What is a good example of them in use in an effective way? What is the point of a setter and getter?
Update:
Can I get some coding examples please?
A getter is a method that gets the value of a property. A setter is a method that sets the value of a property. There is some contention about their efficacy, but the points are generally:
for completeness of encapsulation
to maintain a consistent interface in case internal details change
More useful is when you need to add some logic around getting or setting, like validating a value before you write it.
A getter/setter is used to hide a private field from the public (you can avoid direct access to a field).
The setter allows you to check a provided value before you store it in your internal field. The getter allows you, for instance, to apply a different format. You can also restrict write access (e.g. to derived classes) by limiting the visibility of the setter.
A useful application of a getter is some kind of lazy loading: the backing field (the private field that is hidden by the getter) is initialized to null. When you ask the getter to return the value, it checks for null and loads the value with a more time-consuming method. This happens only on the first call; later the getter returns the already loaded value.
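A minimal Java sketch of such a lazy-loading getter. The names are hypothetical (LazyReport, the loader supplier, the load counter that exists only to make the behaviour visible), and the sketch assumes single-threaded use and a loader that never returns null.

```java
import java.util.function.Supplier;

// Hypothetical class whose getter loads its value on first use.
final class LazyReport {
    private final Supplier<String> loader; // stands in for the slow operation
    private String content;                // backing field, null until first requested
    private int loadCount = 0;             // demo-only counter

    LazyReport(Supplier<String> loader) {
        this.loader = loader;
    }

    String getContent() {
        if (content == null) {
            // First call: pay the loading cost once and remember the result.
            content = loader.get();
            loadCount++;
        }
        return content;
    }

    int getLoadCount() {
        return loadCount;
    }
}
```

Callers just call getContent() and never need to know whether the value was already loaded; with a public field that decision would leak into every caller.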
Getters & setters separate interface (getter/setter functions) from implementation (how the data is actually stored).
Getters and Setters allow you to control how data members of an object can be accessed or changed.
In contrast, if you expose your data members directly to the user of the object, the user can change them at will, and the object wouldn't even know that they had been changed.
Don't want people to read a data member? Make the data member private, and don't write a getter that gives the value back. Don't want people to modify a data member? Make the data member private, and don't write a setter for it. Want to control the range of allowed values? Put that in the setter.
One question that might come out of this is whether using a method instead of direct field access decreases performance.
The answer is: not really. Compilers optimize code so that if your method only does return field;, where field is the field in your class that you hide behind the setter/getter, it will actually access the field directly. Thus in most cases you get the same performance, while keeping the option of changing later what the set/get methods do.
Effective Java by Joshua Bloch is a great book with tips on how to write good code, and it explains why as well. Why to use setters/getters is one of the tips.
Note: You might notice that in some books/documentation fields that present a setter/getter instead of being directly accessible are called 'properties' instead of fields. E.g. in C#, you can even specify that a field is a property and you don't need to define set/get anymore (nice feature I think).
Public accessors (getter and setter) sometimes make sense.
(I'm annoyed that I have not only to document the member variable of a class but also the 2 mostly meaningless accessor methods. )
It usually doesn't help with encapsulation except in cases mentioned by Jason S.
A Java example for a char that is loaded from a database but should be represented as a boolean value:
char boolFromDb;
public boolean getBoolFromDb() {
return boolFromDb == 'T';
}
public void setBoolFromDb(boolean newValue) {
boolFromDb = newValue ? 'T' : 'F';
}

Is it better for class data to be passed internally or accessed directly?

Example:
// access fields directly
private void doThis()
{
doSomeWork(this.data);
}
// receive data as an argument
private void doThis(data)
{
doSomeWork(data);
}
The first option is coupled to the value in this.data while the second option avoids this coupling. I feel like the second option is always better. It promotes loose coupling WITHIN the class. Accessing global class data willy-nilly throughout just seems like a bad idea. Obviously this class data needs to be accessed directly at some point. However, if accesses to this global class data can be eliminated by parameter passing, it seems that this is always preferable.
The second example has the advantage of working with any data of the proper type, whereas the first is bound to working with just the class data. Even if you don't NEED the additional flexibility, it seems nice to leave it as an option.
I just don't see any advantage in accessing member data directly from private methods as in the first example. What's the best practice here? I've consulted Code Complete, but was not able to find anything on this particular issue.
If the data is part of the object's state, private/protected is just fine. Option 1 - good.
I've noticed some developers like to create private/protected vars just to pass parameters between methods in a class, so that they don't have to pass them in the method call. Those vars don't really store the model/state of the object. ...Then, option 1 - NOT good.
Why option 1 is not good in this case...
Expose only as much as you need (var scoping). So, pass the data in. Do not create a private/protected var just to pass data between 2 methods.
A private method that figures out everything it needs from its own arguments is very easy to understand. Keep it this way, unless it's unavoidable.
Private/protected vars make it harder to refactor, as your method is not self-contained: it depends on external vars that might be used elsewhere.
my 2 cents! :-)
Class-wide data is not a problem IMHO. Classes are used to couple state, behaviour and identity, so such coupling is not a problem. The argument suggests that you can call that method with data from other objects, even of other classes, and I think that deserves more consideration than coupling inside the class.
They are both instance methods, therefore #1 makes more sense unless you have a situation involving threads (but depending on the language and scenario, even then you can simply lock the data or mark the method as synchronized - my Java knowledge is rusty).
The second technique is more reminiscent of procedural programming.

Parameter vs. Member variables

I've recently been working with someone else's code and I realized that this individual has a very different philosophy regarding private variables and method parameters than I do. I generally feel that private variables should only be used in a case when:
The variable needs to be stored for recall later.
The data stored in the variable is used globally in the class.
When the variable needs to be globally manipulated (something decidedly different from the need to read the variable by every class method).
When it will make programming substantially easier. (Admittedly vague, but one has to be in many circumstances to avoid painting oneself into a corner).
(I admit, that many of the above are slightly repetitive, but they each seem different enough to merit such treatment... )
It just seems that this is the most efficient means of preventing changing a variable by accident. It also seems like following these standards will allow for the eventual manipulation of external references (if the class is eventually modified), thus leaving you with further options in the future. Is this simply a style issue (like one true bracket or Hungarian naming conventions), or do I have justification in this belief? Is there actually a best practice in this case?
edit
I think this needs to be corrected. I used "globally" above where I actually meant, "globally by instance methods" not "globally accessible by anything, anywhere".
edit2
An example was asked for:
class foo
{
private $_my_private_variable;
public function __construct()
{
}
public function useFoo( $variable )
{
// This is the line I am wondering about,
// there does not seem to be a need for storing it.
$this->_my_private_variable = $variable;
$this->_doSomething();
}
private function _doSomething()
{
/*
do something with $this->_my_private_variable.
*/
// This is the only place _my_private_variable is used.
echo $this->_my_private_variable;
}
}
This is the way I would have done it:
class foo
{
public function __construct()
{
}
public function useFoo( $variable )
{
$this->_doSomething( $variable );
}
private function _doSomething( $passed_variable )
{
/*
do something with the parameter.
*/
echo $passed_variable;
}
}
In general, class members should represent state of the class object.
They are not temporary locations for method parameters (that's what method parameters are for).
I claim that it isn't a style issue but rather a readability/maintainability issue. One variable should have one use, and one use only. “Recycling” variables for different purposes just because they happen to require the same type doesn't make any sense.
From your description it sounds as if the other person's code you worked on does exactly this, since all other uses are basically covered by your list. Put simply, it uses private member variables to act as temporaries depending on situation. Am I right to assume this? If so, the code is horrible.
The smaller the lexical scope and lifetime of any given variable, the less possibility of erroneous use and the better for resource disposal.
Having a member variable implies that it will be holding state that needs to be held between method calls. If the value doesn't need to live between calls it has no reason to exist outside of the scope of a single call, and thus (if it exists at all) should be a variable within the method itself.
Style is always a hard one, once you develop one you can get stuck in a bit of a rut and it can be difficult to see why what you do may not be the best way.
You should only create variables when and where they are needed, and dispose of them when you are done. If the class doesn't need a class level variable to function, then it just doesn't need one. Creating variables where you don't need them is very bad practice.
Class members should be any of the following:
A dependency of a class
A variable that represents the state of the class
A method of the class
I think the answer is straightforward if you are familiar with C++ destructors. Every member variable needs some way to be destructed along with the object, while function parameters do not. That's why member variables are usually the state or the dependencies of an object: things whose lifecycle is tied to the object's.
I'm not sure there is a stated best-practice for using globally scoped variables versus always passing as method parameters. (By "private variables", I'm assuming you mean globally scoped variables.)
Using a globally scoped variable is the only way to implement properties in .NET (even automatic properties ultimately use a globally scoped variable, just not one you have to declare yourself).
There is a line of argument for always using method parameters because it makes it completely clear where the value is coming from. I don't think it really helps prevent the method from making changes to the underlying value and it can, in my opinion, make things more difficult to read at times.
I would disagree with implementing it for global access or to make programming easier. Exposing these globally without filtering of any kind makes it more difficult to determine access in the future.
Since object properties are meant to hold state, as stated by the others, my policy is to have all of them private by default unless I have a good reason to expose them.
It's much easier to make them public later on, if you have to, simply by writing a getter method for example (which I also don't have to think about right at the beginning of writing a class). But reeling in a public property later on may require a huge amount of code to be rewritten.
I like to keep it flexible while not having to think about this more than needed.