How to avoid duplicating a property in a nested class structure? - oop

For example, I have the following class structure:
Animal
---Cat
------property: CatCollar (boolean)
---Dog
------BigDog
------TinyDog
---------property: CatCollar (boolean)
I have the same property CatCollar in class Cat and in class TinyDog, but this property should not be in the class BigDog. My reviewer tells me that this is a bad structure, as it leads to duplication. I cannot change the structure of classes, but I can only change this property (location and other manipulation). Maybe there are some OOP tools that allow you to do this? Can I somehow avoid duplicating a property? If so, how?

In object-orientation it doesn't matter what "data" (i.e. private state) each object has. It only matters what behavior they provide, even in an inheritance tree.
So, if by "property" you mean a public state, or a private state accessible through a public getter, you have already left object-orientation to a certain degree. Discussing what is right from the perspective of OOP is then moot.
If you mean a private state with some potentially shared behavior, then you might need delegation. That is, both Cat and TinyDog implement some interface describing the desired behavior, and the behavior itself is implemented in a class that both delegate to (i.e. contain). Unfortunately this is not natively supported in Java (in Kotlin, for example, it is), so it needs some boilerplate.
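As a rough illustration of that delegation idea, here is a minimal Java sketch. The names HasCatCollar and CollarHolder are made up for this example and do not come from the question; the class hierarchy itself is left untouched, and only the collar state moves into one shared helper.

```java
// Sketch only: HasCatCollar and CollarHolder are hypothetical names.
interface HasCatCollar {
    boolean hasCatCollar();
    void setCatCollar(boolean value);
}

// The single place where the collar state actually lives.
final class CollarHolder implements HasCatCollar {
    private boolean catCollar;

    @Override public boolean hasCatCollar() { return catCollar; }
    @Override public void setCatCollar(boolean value) { catCollar = value; }
}

// The hierarchy from the question, unchanged.
class Animal { }
class Dog extends Animal { }
class BigDog extends Dog { }   // no collar state, no collar methods

class Cat extends Animal implements HasCatCollar {
    private final HasCatCollar collar = new CollarHolder();

    // Forwarding boilerplate: this is the part Java makes you write by hand.
    @Override public boolean hasCatCollar() { return collar.hasCatCollar(); }
    @Override public void setCatCollar(boolean value) { collar.setCatCollar(value); }
}

class TinyDog extends Dog implements HasCatCollar {
    private final HasCatCollar collar = new CollarHolder();

    @Override public boolean hasCatCollar() { return collar.hasCatCollar(); }
    @Override public void setCatCollar(boolean value) { collar.setCatCollar(value); }
}

class CollarDemo {
    public static void main(String[] args) {
        TinyDog tiny = new TinyDog();
        tiny.setCatCollar(true);
        System.out.println(tiny.hasCatCollar());
    }
}
```

The forwarding methods are still repeated in Cat and TinyDog, but the state and any logic attached to it exist only once, in CollarHolder.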

Related

What's the best practice for super class property definition in objective-c

Here is something I have been scratching my head over and could not make clear.
In Objective-C there is no concept of "protected" when defining a property in the superclass. (Some people may argue that they could achieve similar behaviour by using #ifdef, but this is not something I am considering here.) So that means a property is either public or private.
If a subclass wants to access a private property defined by its superclass, the property has to be declared public in the superclass, which doesn't make much sense to me, since the rule of thumb in OOP is to keep as much private as possible.
For example,
suppose a class called Vehicle has a private property "Wheel *wheel" and a public method "-(void) run;".
Why can't its subclass easily access "Wheel *wheel" without Vehicle revealing that property in its public declaration, since all that outsiders need to know is that "it can run"?

Scala and encapsulation?

Since I started to study OOP encapsulation was always something that raised questions to me. Getters and setters in 99% of the cases seemed like a big lie: what does it matter to have setter if it changes the reference of the private field, and getter that returns reference to mutable object? Of course there are many things that make life easier with getters and setters pattern (like Hibernate that creates proxies on entities). In Scala there is some kind of solution: don't lie to yourself, if your field is val you have nothing to fear of and just make it public.
Still this doesn't solve the question of methods, should I ever declare a method private in Scala? Why would I declare a method private in Java? Mostly if it's a helper method and I don't want to pollute my class namespace, and if the method changes our internal state. The second issue doesn't apply (mostly & hopefully) to Scala, and the first one could be simply solved with appropriate traits. So when would I want to declare a method private in Scala? What is the convention for encapsulation in Scala? I would highly appreciate if you help me to order my thoughts on subject.
Getters and setters (or accessor/mutator methods) are used to encapsulate data, which is commonly considered one of the tenets of OOP.
They exist so that the underlying implementation of an object can change without compromising client code, as long as the interface contract remains unchanged.
This is a principle aiming to simplify maintenance and evolution of the codebase.
Even Scala has encapsulation, but it supports the Uniform Access Principle: instead of explicit get/set methods (a JavaBean convention), it automatically creates accessor/mutator methods that mimic the attribute name (e.g. for a public val name attribute a corresponding def name public accessor is generated, and for a var name you also get the def name_= mutator method).
For example if you define
class Encapsulation(hidden: Any, val readable: Any, var settable: Any)
the compiled .class is as follows
C:\devel\scala_code\stackoverflow>javap -cp . Encapsulation
Compiled from "encapsulation.scala"
public class Encapsulation {
public java.lang.Object readable();
public java.lang.Object settable();
public void settable_$eq(java.lang.Object);
public Encapsulation(java.lang.Object, java.lang.Object, java.lang.Object);
}
Scala is simply designed to avoid boilerplate by removing the necessity to define such methods.
Encapsulation (or information hiding) was not invented to support Hibernate or other frameworks. In fact in Hibernate you should be able to annotate the attribute field directly, all the while effectively breaking encapsulation.
As for the usefulness of private methods, it's once again a good design principle that leads to DRY code (if you have more than one method sharing a piece of logic), to better focusing the responsibility of each method, and to enable different composition of the same pieces.
This should be a general guideline for every method you define, and only a part of the encapsulated logic would come out at the public interface layer, leaving you with the rest being implemented as private (or even local) methods.
In Scala (as in Java), private constructors also allow you to restrict the way an object is instantiated through the use of factory methods.
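To make the factory-method point concrete, here is a small Java sketch. The Temperature class and its method names are invented for illustration; the only thing it is meant to show is that a private constructor forces every instance through a factory that can validate its input.

```java
// Sketch only: Temperature is a hypothetical example class.
final class Temperature {
    private final double kelvin;

    // Private: callers cannot bypass the checks in the factories below.
    private Temperature(double kelvin) {
        this.kelvin = kelvin;
    }

    // Factory method that guards the invariant "never below absolute zero".
    static Temperature ofKelvin(double kelvin) {
        if (kelvin < 0) {
            throw new IllegalArgumentException("below absolute zero: " + kelvin);
        }
        return new Temperature(kelvin);
    }

    // A second factory that reuses the first one's validation.
    static Temperature ofCelsius(double celsius) {
        return ofKelvin(celsius + 273.15);
    }

    double kelvin() {
        return kelvin;
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        System.out.println(Temperature.ofCelsius(20.0).kelvin());
    }
}
```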
Encapsulation is not only a matter of getter/setter methods or public/private access modifiers. That's a common misconception among Java developers who have had to spend too much time with Hibernate (or similar libraries based on the JavaBean specification).
In object-oriented programming, encapsulation not only refers to information hiding but it also refers to bundling both the data and the methods (operating on that data) together in the same object.
To achieve good encapsulation, there must be a clear distinction between the methods you wish to expose to the public (the so-called public interface) and the internal state of an object, which must comply with its data invariants.
In Scala there are many ways to achieve object-oriented encapsulation. For example, one of my preferred ones is:
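trait AnInterface {
  def aMethod(): AType
}
object AnInterface {
  // The declared return type is the trait, so the private class never leaks out.
  def apply(): AnInterface = new AnHiddenImplementation()
  private class AnHiddenImplementation extends AnInterface {
    var aVariable: AType = _
    def aMethod(): AType = {
      // operate on the internal aVariable
      aVariable
    }
  }
}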
First, define the trait (the public interface) so that it is immediately clear what clients will see. Then write its companion object to provide a factory method which instantiates a default concrete implementation. That implementation can be completely hidden from clients if it is defined as private inside the companion object.
As you can see, the Scala code is much more concise than any Java solution.

Reasons to use private instead of protected for fields and methods

This is a rather basic OO question, but one that's been bugging me for some time.
I tend to avoid using the 'private' visibility modifier for my fields and methods in favor of protected.
This is because, generally, I don't see any use in hiding the implementation between base class and child class, except when I want to set specific guidelines for the extension of my classes (i.e. in frameworks). For the majority of cases I think trying to limit how my class will be extended either by me or by other users is not beneficial.
But, for the majority of people, the private modifier is usually the default choice when defining a non-public field/method.
So, can you list use cases for private? Is there a major reason for always using private? Or do you also think it's overused?
There is some consensus that one should prefer composition over inheritance in OOP. There are several reasons for this (google if you're interested), but the main part is that:
inheritance is seldom the best tool and is not as flexible as other solutions
the protected members/fields form an interface towards your subclasses
interfaces (and assumptions about their future use) are tricky to get right and document properly
Therefore, if you choose to make your class inheritable, you should do so consciously and with all the pros and cons in mind.
Hence, it's better not to make the class inheritable and instead make sure it's as flexible as possible (and no more) by using other means.
This is mostly obvious in larger frameworks where your class's usage is beyond your control. For your own little app, you won't notice this as much, but it (inheritance-by-default) will bite you in the behind sooner or later if you're not careful.
Alternatives
Composition means that you'd expose customizability through explicit (fully abstract) interfaces (virtual or template-based).
So, instead of having a Vehicle base class with a virtual drive() function (along with everything else, such as an integer for price, etc.), you'd have a Vehicle class taking a Motor interface object, and that Motor interface only exposes the drive() function. Now you can add and re-use any sort of motor anywhere (more or less. :).
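The Vehicle/Motor arrangement described above might look roughly like this in Java. Only Vehicle, Motor, drive() and the price field come from the answer; ElectricMotor and the returned strings are placeholders added for the sketch.

```java
// The narrow interface that Vehicle depends on.
interface Motor {
    String drive();
}

// Hypothetical implementation, added only for the example.
final class ElectricMotor implements Motor {
    @Override public String drive() { return "humming along"; }
}

// Vehicle is final: it is customized by what is passed in, not by subclassing.
final class Vehicle {
    private final Motor motor;
    private final int price;

    Vehicle(Motor motor, int price) {
        this.motor = motor;
        this.price = price;
    }

    String drive() { return motor.drive(); }   // behavior comes from the motor
    int price() { return price; }
}

class CompositionDemo {
    public static void main(String[] args) {
        Vehicle electric = new Vehicle(new ElectricMotor(), 30000);
        // Motor has a single method, so a lambda works as a throwaway motor.
        Vehicle cart = new Vehicle(() -> "rattling downhill", 50);
        System.out.println(electric.drive());
        System.out.println(cart.drive());
    }
}
```

Because Vehicle only knows about the Motor interface, a new kind of motor can be added without touching Vehicle or introducing a subclass.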
There are two situations where it matters whether a member is protected or private:
If a derived class could benefit from using a member, making the member `protected` would allow it to do so, while making it `private` would deny it that benefit.
If a future version of the base class could benefit by not having the member behave as it does in the present version, making the member `private` would allow that future version to change the behavior (or eliminate the member entirely), while making it `protected` would require all future versions of the class to keep the same behavior, thus denying them the benefit that could be reaped from changing it.
If one can imagine a realistic scenario where a derived class might benefit from being able to access the member, and cannot imagine a scenario where the base class might benefit from changing its behavior, then the member should be protected [assuming, of course, that it shouldn't be public]. If one cannot imagine a scenario where a derived class would get much benefit from accessing the member directly, but one can imagine scenarios where a future version of the base class might benefit by changing it, then it should be private. Those cases are pretty clear and straightforward.
If there isn't any plausible scenario where the base class would benefit from changing the member, I would suggest that one should lean toward making it protected. Some would say the "YAGNI" (You Ain't Gonna Need It) principle favors private, but I disagree. If you're expecting others to inherit the class, making a member private doesn't assume "YAGNI", but rather "HAGNI" (He Ain't Gonna Need It). Unless "you" are going to need to change the behavior of the item in a future version of the class, "you" ain't gonna need it to be private. By contrast, in many cases you'll have no way of predicting what consumers of your class might need. That doesn't mean one should make members protected without first trying to identify ways one might benefit from changing them, since YAGNI isn't really applicable to either decision. YAGNI applies in cases where it will be possible to deal with a future need if and when it is encountered, so there's no need to deal with it now. A decision to make a member of a class which is given to other programmers private or protected implies a decision as to which type of potential future need will be provided for, and will make it difficult to provide for the other.
Sometimes both scenarios will be plausible, in which case it may be helpful to offer two classes: one which exposes the members in question, and a class derived from that which does not (there's no standard idiomatic way for a derived class to hide members inherited from its parent, though declaring new members which have the same names but no compilable functionality and are marked with an Obsolete attribute would have that effect). As an example of the trade-offs involved, consider List<T>. If the type exposed the backing array as a protected member, it would be possible to define a derived type CompareExchangeableList<T> where T : class which included a member T CompareExchangeItem(int index, T newValue, T oldValue) which would return Interlocked.CompareExchange(ref _backingArray[index], newValue, oldValue); such a type could be used by any code which expected a List<T>, but code which knew the instance was a CompareExchangeableList<T> could use CompareExchangeItem on it. Unfortunately, because List<T> does not expose the backing array to derived classes, it is impossible to define a type which allows CompareExchange on list items but which would still be usable by code expecting a List<T>.
Still, that's not to imply that exposing the backing array would have been completely without cost; even though all extant implementations of List<T> use a single backing array, Microsoft might implement future versions to use multiple arrays when a list would otherwise grow beyond 85,000 bytes, so as to avoid the inefficiencies associated with the Large Object Heap. If the backing array were exposed as a protected member, it would be impossible to implement such a change without breaking any code that relied upon that member.
Actually, the ideal thing might have been to balance those interests by providing a protected member which, given a list-item index, will return an array segment which contains the indicated item. If there's only one array, the method would always return a reference to that array, with an offset of zero, a starting subscript of zero, and a length equal to the list length. If a future version of List<T> split the array into multiple pieces, the method could allow derived classes to efficiently access segments of the array in ways that would not be possible without such access [e.g. using Array.Copy] but List<T> could change the way it manages its backing store without breaking properly-written derived classes. Improperly-written derived classes could get broken if the base implementation changes, but that's the fault of the derived class, not the base.
I simply prefer private to protected in the default case, because I follow the principle of hiding as much as possible, and that means setting the visibility as low as possible.
I am reaching here. However, I think that the use of protected member variables should be a conscious choice, made not only because you plan to inherit, but also because there is a solid reason why derived classes shouldn't use the property setters/getters defined on the base class.
In OOP, we "encapsulate" the member fields so that we can exercise control over how the properties they represent are accessed and changed. When we define a getter/setter on our base for a member variable, we are essentially saying that THIS is how I want this variable to be referenced/used.
While there are design-driven exceptions in which one might need to alter the behavior created in the base class getter/setter methods, it seems to me that this would be a decision made after careful consideration of alternatives.
For example, when I find myself needing to access a member field from a derived class directly, instead of through the getter/setter, I start thinking maybe that particular property should be defined as abstract, or even moved to the derived class. This depends upon how broad the hierarchy is, and any number of additional considerations. But to me, stepping around the public property defined on the base class begins to smell.
Of course, in many cases, it "doesn't matter" because we are not implementing anything within the getter/setter beyond access to the variable. But again, if this is the case, the derived class can just as easily access it through the getter/setter. This also protects against hard-to-find bugs later, if employed consistently. If the behavior of the getter/setter for a member field on the base class is changed in some way, and a derived class references the protected field directly, there is the potential for trouble.
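The kind of trouble meant here can be sketched in Java as follows. All class names are hypothetical; the scenario assumes the base class setter has gained a validation rule that a subclass writing the protected field directly never sees.

```java
// Sketch only: Account and its subclasses are hypothetical.
class Account {
    protected int balance;   // exposed to subclasses, so the setter can be bypassed

    int getBalance() {
        return balance;
    }

    // Imagine this check was added in a later version of the base class.
    void setBalance(int newBalance) {
        if (newBalance < 0) {
            throw new IllegalArgumentException("balance may not go negative");
        }
        balance = newBalance;
    }
}

// Goes through the accessor pair, so it picks up the new rule automatically.
class CarefulAccount extends Account {
    void withdraw(int amount) {
        setBalance(getBalance() - amount);
    }
}

// Touches the protected field directly, so the new rule is silently skipped.
class CarelessAccount extends Account {
    void withdraw(int amount) {
        balance -= amount;
    }
}

class AccessorDemo {
    public static void main(String[] args) {
        CarelessAccount careless = new CarelessAccount();
        careless.withdraw(10);
        System.out.println(careless.getBalance());   // the invariant is already broken here
    }
}
```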
You are on the right track. You make something private because your implementation is dependent on it not being changed either by a user or by a descendant.
I default to private and then make a conscious decision about whether and how much of the inner workings I'm going to expose. You seem to work on the basis that it will be exposed anyway, so get on with it. As long as we both remember to cross all the eyes and dot all the tees, we are good.
Another way to look at it is this.
If you make it private, someone might not be able to do what they want with your implementation.
If you don't make it private, someone may be able to do something you really don't want them to do with your implementation.
I've been programming OOP since C++ in 1993 and Java in 1995. Time and again I've seen a need to augment or revise a class, typically adding extra functionality tightly integrated with the class. The OOP way to do so is to subclass the base class and make the changes in the subclass. For example a base class field originally referred to only elsewhere in the base class is needed for some other action, or some other activity must change a value of the field (or one of the field's contained members). If that field is private in the base class then the subclass cannot access it, cannot extend the functionality. If the field is protected it can do so.
Subclasses have a special relationship to the base class that other classes elsewhere in the class hierarchy don't have: they inherit the base class members. The purpose of inheritance is to access base class members; private thwarts inheritance. How is the base class developer supposed to know that no subclasses will ever need to access a member? In some cases that can be clear, but private should be the exception rather than the rule. Developers subclassing the base class have the base class source code, so their alternative is to revise the base class directly (perhaps just changing private status to protected before subclassing). That's not clean, good practice, but that's what private makes you do.
I am a beginner at OOP but have been around since the first articles in ACM and IEEE. From what I remember, this style of development was more for modelling something. In the real world, things including processes and operations would have "private, protected, and public" elements. So to be true to the object .....
Outside of modelling something, programming is more about solving a problem. The issue of "private, protected, and public" elements is only a concern when it relates to making a reliable solution. As a problem solver, I would not make the mistake of getting caught up in how others are using MY solution to solve their own problems. Now keep in mind that a main reason for the issue of ...., was to allow a place for data checking (i.e., verifying the data is in a valid range and structure before using it in your object).
With that in mind, if your code solves the problem it was designed for, you have done your job. If others need your solution to solve the same or a similar problem, well, do you really need to control how they do it? I would say, "only if you are getting some benefit from it or you know the weaknesses in your design, so you need to protect some things."
In my view, if you are using DI (Dependency Injection) in your project and you use it to inject some interfaces into your class (through the constructor), then those fields should be protected, because these kinds of classes are usually more like services than data keepers.
But if you want to use attributes to save some data in your class, then privates would be better.

A use for multiple inheritance?

Can anyone think of any situation to use multiple inheritance? Every case I can think of can be solved by the method operator
AnotherClass() { return this->something.anotherClass; }
Most uses of full scale Multiple inheritance are for mixins. As an example:
class DraggableWindow : Window, Draggable { }
class SkinnableWindow : Window, Skinnable { }
class DraggableSkinnableWindow : Window, Draggable, Skinnable { }
etc...
In most cases, it's best to use multiple inheritance to do strictly interface inheritance.
class DraggableWindow : Window, IDraggable { }
Then you implement the IDraggable interface in your DraggableWindow class. It's WAY too hard to write good mixin classes.
The benefit of the MI approach (even if you are only using Interface MI) is that you can then treat all kinds of different Windows as Window objects, but have the flexibility to create things that would not be possible (or more difficult) with single inheritance.
For example, in many class frameworks you see something like this:
class Control { }
class Window : Control { }
class Textbox : Control { }
Now, suppose you wanted a Textbox with Window characteristics? Like being draggable, having a title bar, etc... You could do something like this:
class WindowedTextbox : Control, IWindow, ITextbox { }
In the single inheritance model, you can't easily inherit from both Window and Textbox without having some problems with duplicate Control objects and other kinds of problems. You can also treat a WindowedTextbox as a Window, a Textbox, or a Control.
Also, to address your .anotherClass() idiom, .anotherClass() returns a different object, while multiple inheritance allows the same object to be used for different purposes.
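The "same object, different purposes" point can be shown with interface-only multiple inheritance, here sketched in Java. The Window and Textbox interfaces and their methods are simplified stand-ins for the IWindow/ITextbox idea above, not a real framework API.

```java
// Hypothetical, simplified interfaces standing in for IWindow / ITextbox.
interface Window {
    String title();
}

interface Textbox {
    String text();
}

class Control { }

// One class, one object, usable through three different types.
class WindowedTextbox extends Control implements Window, Textbox {
    private final String title;
    private String text = "";

    WindowedTextbox(String title) {
        this.title = title;
    }

    @Override public String title() { return title; }
    @Override public String text() { return text; }

    void type(String more) { text += more; }
}

class SameObjectDemo {
    public static void main(String[] args) {
        WindowedTextbox both = new WindowedTextbox("Notes");
        Window asWindow = both;     // no conversion and no second object
        Textbox asTextbox = both;
        both.type("hello");
        // Both references point at the same instance, so both see the change.
        System.out.println((Object) asWindow == (Object) asTextbox);
        System.out.println(asTextbox.text());
    }
}
```

With an .anotherClass()-style accessor, asWindow and asTextbox would instead be two distinct objects that have to be kept in sync by hand.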
I find multiple inheritance particularly useful when using mixin classes.
As stated in Wikipedia:
In object-oriented programming languages, a mixin is a class that provides a certain functionality to be inherited by a subclass, but is not meant to stand alone.
An example of how our product uses mixin classes is for configuration save and restore purposes. There is an abstract mixin class which defines a set of pure virtual methods. Any class which is saveable inherits from the save/restore mixin class which automatically gives them the appropriate save/restore functionality.
But they may also inherit from other classes as part of their normal class structure, so it is quite common for these classes to use multiple inheritance in this respect.
An example of multiple inheritance:
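class Animal
{
public:
    virtual ~Animal() = default;
    virtual void KeepCool() const = 0;
};
class Vertebrate
{
public:
    virtual ~Vertebrate() = default;
    virtual void BendSpine() { }
};
class Dog : public Animal, public Vertebrate
{
public:
    // const must match the base declaration, otherwise this would not override it
    void KeepCool() const override { Pant(); }
private:
    void Pant() const { /* cool down by panting */ }
};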
What is most important when doing any form of public inheritance (single or multiple) is to respect the "is a" relationship. A class should only inherit from one or more classes if it "is" one of those objects. If it simply "contains" one of those objects, aggregation or composition should be used instead.
The example above is well structured because a dog is an animal, and also a vertebrate.
Most people use multiple-inheritance in the context of applying multiple interfaces to a class. This is the approach Java and C#, among others, enforce.
C++ allows you to apply multiple base classes fairly freely, in an is-a relationship between types. So, you can treat a derived object like any of its base classes.
Another use, as LeopardSkinPillBoxHat points out, is in mix-ins. An excellent example of this is the Loki library, from Andrei Alexandrescu's book Modern C++ Design. He uses what he terms policy classes that specify the behavior or the requirements of a given class through inheritance.
Yet another use is one that simplifies a modular approach that allows API-independence through the use of sister-class delegation in the oft-dreaded diamond hierarchy.
The uses for MI are many. The potential for abuse is even greater.
Java has interfaces. C++ has not.
Therefore, multiple inheritance can be used to emulate the interface feature.
If you're a C# and Java programmer, every time you use a class that extends a base class but also implements a few interfaces, you are sort of admitting multiple inheritance can be useful in some situations.
I think it would be most useful for boilerplate code. For example, the IDisposable pattern is exactly the same for all classes in .NET. So why re-type that code over and over again?
Another example is ICollection. The vast majority of the interface methods are implemented exactly the same. There are only a couple of methods that are actually unique to your class.
Unfortunately multiple inheritance is very easy to abuse. People will quickly start doing silly things like having their LabelPrinter class inherit from their TcpIpConnector class instead of merely containing it.
One case I worked on recently involved network enabled label printers. We need to print labels, so we have a class LabelPrinter. This class has virtual calls for printing several different labels. I also have a generic class for TCP/IP connected things, which can connect, send and receive.
So, when I needed to implement a printer, it inherited from both the LabelPrinter class and the TcpIpConnector class.
I think fmsf's example is a bad idea. A car is not a tire or an engine. You should be using composition for that.
MI (of implementation or interface) can be used to add functionality. These are often called mixin classes. Imagine you have a GUI. There is a view class that handles drawing and a Drag&Drop class that handles dragging. If you have an object that does both you would have a class like
class DropTarget{
public void Drop(DropItem & itemBeingDropped);
...
}
class View{
public void Draw();
...
}
/* View you can drop items on */
class DropView:View,DropTarget{
}
It is true that composition of an interface (Java or C# like) plus forwarding to a helper can emulate many of the common uses of multiple inheritance (notably mixins). However this is done at the cost of that forwarding code being repeated (and violating DRY).
MI does open a number of difficult areas, and more recently some language designers have taken decisions that the potential pitfalls of MI outweigh the benefits.
Similarly one can argue against generics (heterogeneous containers do work, loops can be replaced with (tail) recursion) and almost any other feature of programming languages. Just because it is possible to work without a feature does not mean that that feature is valueless or cannot help to effectively express solutions.
A rich diversity of languages, and language families makes it easier for us as developers to pick good tools that solve the business problem at hand. My toolbox contains many items I rarely use, but on those occasions I do not want to treat everything as a nail.
An example of how our product uses mixin classes is for configuration save and restore purposes. There is an abstract mixin class which defines a set of pure virtual methods. Any class which is saveable inherits from the save/restore mixin class which automatically gives them the appropriate save/restore functionality.
This example doesn't really illustrate the usefulness of multiple inheritance. What is being defined here is an INTERFACE. Multiple inheritance allows you to inherit behavior as well, which is the point of mixins.
An example; because of a need to preserve backwards compatibility I have to implement my own serialization methods.
So every object gets a Read and Store method like this.
Public Sub Store(ByVal File As IBinaryWriter)
Public Sub Read(ByVal File As IBinaryReader)
I also want to be able to assign and clone object as well. So I would like this on every object.
Public Sub Assign(ByVal tObject As <Class_Name>)
Public Function Clone() As <Class_Name>
Now in VB6 I have this code repeated over and over again.
Public Sub Assign(ByVal tObject As ObjectClass)
Me.State = tObject.State
End Sub
Public Function Clone() As ObjectClass
Dim O As ObjectClass
Set O = New ObjectClass
O.State = Me.State
Set Clone = O
End Function
Public Property Get State() As Variant
StateManager.Clear
Me.Store StateManager
State = StateManager.Data
End Property
Public Property Let State(ByVal RHS As Variant)
StateManager.Data = RHS
Me.Read StateManager
End Property
Note that StateManager is a stream that reads and stores byte arrays.
This code is repeated dozens of times.
Now in .NET I am able to get around this by using a combination of generics and inheritance. My objects under the .NET version get Assign, Clone, and State when they inherit from MyAppBaseObject. But I don't like the fact that every object inherits from MyAppBaseObject.
I would rather just mix in the Assign and Clone interface AND BEHAVIOR. Better yet, mix in the Read and Store interface separately, and then be able to mix in Assign and Clone. It would be cleaner code in my opinion.
But the times where I reuse behavior are DWARFED by the times I use an interface. This is because the goal of most object hierarchies is NOT reusing behavior but precisely defining the relationships between different objects, which is what interfaces are designed for. So while it would be nice if C# (or VB.NET) had some ability to do this, it isn't a show stopper in my opinion.
The whole reason that this is even an issue is that C++ fumbled the ball at first when it came to the interface vs inheritance issue. When OOP debuted everybody thought that behavior reuse was the priority. But this proved to be a chimera and only useful for specific circumstances, like making a UI framework.
Later the idea of mixins (and other related concepts in aspect-oriented programming) was developed. Multiple inheritance was found useful in creating mix-ins. But C# was developed just before this was widely recognized. Likely an alternative syntax will be developed to do this.
I suspect that in C++, MI is best used as part of a framework (the mix-in classes previously discussed). The only thing I know for sure is that every time I've tried to use it in my apps, I've ended up regretting the choice, and often tearing it out and replacing it with generated code.
MI is one more of those 'use it if you REALLY need it, but make sure you REALLY need it' tools.
The following example is mostly something I see often in C++: sometimes it may be necessary due to utility classes that you need but because of their design cannot be used through composition (at least not efficiently or without making the code even messier than falling back on mult. inheritance). A good example is you have an abstract base class A and a derived class B, and B also needs to be a kind of serializable class, so it has to derive from, let's say, another abstract class called Serializable. It's possible to avoid MI, but if Serializable only contains a few virtual methods and needs deep access to the private members of B, then it may be worth muddying the inheritance tree just to avoid making friend declarations and giving away access to B's internals to some helper composition class.
I had to use it today, actually...
Here was my situation: I had a domain model represented in memory where an A contained zero or more Bs (represented in an array), each B had zero or more Cs, and each C zero or more Ds. I couldn't change the fact that they were arrays (the arrays came from code generated automatically by the build process). Each instance needed to keep track of which index in the parent array it belonged to. It also needed to keep track of the instance of its parent (too much detail as to why). I wrote something like this (there was more to it, and this is not syntactically correct, it's just an example):
class Parent
{
add(Child c)
{
children.add(c);
c.index = children.Count-1;
c.parent = this;
}
Collection<Child> children
}
class Child
{
Parent p;
int index;
}
Then, for the domain types, I did this:
class A : Parent
class B : Parent, Child
class C : Parent, Child
class D : Child
The actual implementation was in C# with interfaces and generics, so I couldn't use multiple inheritance the way I would have if the language supported it (some copy and paste had to be done). So I thought I'd search SO to see what people think of multiple inheritance, and I got your question ;)
I couldn't use your solution of the .anotherClass, because of the implementation of add for Parent (it references this, and I wanted this to not be some other class).
It got worse because the generated code had A subclass something else that was neither a parent nor a child... more copy and paste.
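For comparison, here is roughly how the Parent/Child mix-ins could look in a language that does support multiple inheritance. This is only a C++ sketch: the non-owning pointer storage and the "one parent per child" assertion are assumptions added for illustration, not part of the original C# code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Parent;

// Mix-in for anything stored in a parent's array: it remembers its
// owner and its own position in the owner's array.
class Child {
public:
    Parent* parent = nullptr;
    std::size_t index = 0;
};

// Mix-in for anything that owns an array of children.
class Parent {
public:
    void add(Child& c) {
        assert(c.parent == nullptr);  // assumption: a child has one parent
        children.push_back(&c);
        c.index = children.size() - 1;
        c.parent = this;  // "this" is the domain object itself, not a helper
    }

    std::vector<Child*> children;  // non-owning, to keep the sketch short
};

// Domain types as listed above (bodies omitted).
class A : public Parent {};
class B : public Parent, public Child {};
class C : public Parent, public Child {};
class D : public Child {};
```

With this layout, adding a B to an A and a C to that B leaves each child pointing at the real domain object that owns it, with no copied code in B or C.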

Should protected attributes always be banned?

I seldom use inheritance, but when I do, I never use protected attributes because I think it breaks the encapsulation of the inherited classes.
Do you use protected attributes? What do you use them for?
In this interview on design by Bill Venners, Joshua Bloch, the author of Effective Java, says:
Trusting Subclasses
Bill Venners: Should I trust subclasses more intimately than non-subclasses? For example, do I make it easier for a subclass implementation to break me than I would for a non-subclass? In particular, how do you feel about protected data?
Josh Bloch: To write something that is both subclassable and robust against a malicious subclass is actually a pretty tough thing to do, assuming you give the subclass access to your internal data structures. If the subclass does not have access to anything that an ordinary user doesn't, then it's harder for the subclass to do damage. But unless you make all your methods final, the subclass can still break your contracts by just doing the wrong things in response to method invocation. That's precisely why the security critical classes like String are final. Otherwise someone could write a subclass that makes Strings appear mutable, which would be sufficient to break security. So you must trust your subclasses. If you don't trust them, then you can't allow them, because subclasses can so easily cause a class to violate its contracts.
As far as protected data in general, it's a necessary evil. It should be kept to a minimum. Most protected data and protected methods amount to committing to an implementation detail. A protected field is an implementation detail that you are making visible to subclasses. Even a protected method is a piece of internal structure that you are making visible to subclasses.
The reason you make it visible is that it's often necessary in order to allow subclasses to do their job, or to do it efficiently. But once you've done it, you're committed to it. It is now something that you are not allowed to change, even if you later find a more efficient implementation that no longer involves the use of a particular field or method.
So all other things being equal, you shouldn't have any protected members at all. But that said, if you have too few, then your class may not be usable as a super class, or at least not as an efficient super class. Often you find out after the fact. My philosophy is to have as few protected members as possible when you first write the class. Then try to subclass it. You may find out that without a particular protected method, all subclasses will have to do some bad thing.
As an example, if you look at AbstractList, you'll find that there is a protected method to delete a range of the list in one shot (removeRange). Why is that in there? Because the normal idiom to remove a range, based on the public API, is to call subList to get a sub-List, and then call clear on that sub-List. Without this particular protected method, however, the only thing that clear could do is repeatedly remove individual elements. Think about it. If you have an array representation, what will it do? It will repeatedly collapse the array, doing order N work N times. So it will take a quadratic amount of work, instead of the linear amount of work that it should. By providing this protected method, we allow any implementation that can efficiently delete an entire range to do so. And any reasonable List implementation can delete a range more efficiently all at once.
That we would need this protected method is something you would have to be way smarter than me to know up front. Basically, I implemented the thing. Then, as we started to subclass it, we realized that range delete was quadratic. We couldn't afford that, so I put in the protected method. I think that's the best approach with protected methods. Put in as few as possible, and then add more as needed. Protected methods represent commitments to designs that you may want to change. You can always add protected methods, but you can't take them out.
Bill Venners: And protected data?
Josh Bloch: The same thing, but even more. Protected data is even more dangerous in terms of messing up your data invariants. If you give someone else access to some internal data, they have free reign over it.
Short version: it breaks encapsulation but it's a necessary evil that should be kept to a minimum.
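The removeRange point from the interview can be illustrated with a small sketch. The interview is about Java's AbstractList; the code below is only a C++ analogue with invented names (IntListBase, ArrayIntList, clearRange), not a port of the Java API. clearRange stands in for the subList-then-clear idiom.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical base class, loosely modeled on the idea behind
// AbstractList. It is not a port of the Java API.
class IntListBase {
public:
    virtual ~IntListBase() = default;
    virtual std::size_t size() const = 0;
    virtual int at(std::size_t i) const = 0;
    virtual void removeAt(std::size_t i) = 0;

    // Stand-in for subList(from, to).clear() in the interview.
    void clearRange(std::size_t from, std::size_t to) {
        assert(from <= to && to <= size());
        removeRange(from, to);
    }

protected:
    // Default hook: correct for every subclass, but it removes one
    // element at a time, which is quadratic for array storage.
    virtual void removeRange(std::size_t from, std::size_t to) {
        for (std::size_t i = from; i < to; ++i) {
            removeAt(from);  // later elements shift down into "from"
        }
    }
};

class ArrayIntList : public IntListBase {
public:
    explicit ArrayIntList(std::vector<int> values)
        : data_(std::move(values)) {}

    std::size_t size() const override { return data_.size(); }
    int at(std::size_t i) const override { return data_.at(i); }
    void removeAt(std::size_t i) override { data_.erase(data_.begin() + i); }

protected:
    // Array storage can drop the whole range with a single shift.
    void removeRange(std::size_t from, std::size_t to) override {
        data_.erase(data_.begin() + from, data_.begin() + to);
    }

private:
    std::vector<int> data_;
};
```

The hook is protected because ordinary callers never need it, yet a subclass that cannot override it is stuck with the slow default. That is the commitment Bloch describes: once subclasses rely on removeRange, it cannot be removed.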
C#:
I use protected for abstract or virtual methods that I want derived classes to override. I also make a method protected if it may be called by derived classes, but I don't want it called outside the class hierarchy.
You may need them for a static (or 'global') attribute that you want your subclasses, or classes from the same package (if this is about Java), to benefit from.
Those static final attributes representing some kind of 'constant value' seldom have a getter function, so a protected static final attribute might make sense in that case.
Scott Meyers says don't use protected attributes in Effective C++ (3rd ed.):
Item 22: Declare data members private.
The reason is the same one you give: it breaks encapsulation. The consequence is that otherwise local changes to the layout of the class might break dependent types and result in changes in many other places.
I don't use protected attributes in Java because there protected also grants access to every class in the same package. But in C++, I'll use them in abstract classes, allowing the inheriting class to inherit them directly.
There are never any good reasons to have protected attributes. A base class must be able to depend on its state, which means restricting access to data through accessor methods. You can't give anyone access to your private data, even children.
I recently worked on a project where the "protected" member was a very good idea. The class hierarchy was something like:
[+] Base
|
+--[+] BaseMap
| |
| +--[+] Map
| |
| +--[+] HashMap
|
+--[+] // something else ?
The Base implemented a std::list but nothing else. The direct access to the list was forbidden to the user, but as the Base class was incomplete, it relied anyway on derived classes to implement the indirection to the list.
The indirection could come in at least two flavors: std::map and stdext::hash_map. Both maps behave the same way, except that hash_map needs the Key to be hashable (in VC2003, castable to size_t).
So BaseMap implemented a TMap as a templated type that was a map-like container.
Map and HashMap were two derived classes of BaseMap, one specializing BaseMap on std::map, and the other on stdext::hash_map.
So:
Base was not usable as such (no public accessors!) and only provided common features and code
BaseMap needed easy read/write access to the std::list
Map and HashMap needed easy read/write access to the TMap defined in BaseMap.
For me, the only solution was to use protected for the std::list and the TMap member variables. There was no way I would make those "private", because I would have ended up exposing all or almost all of their features through read/write accessors anyway.
In the end, I guess that if you end up dividing your class into multiple objects, each derivation adding needed features to its parent class, and only the most derived class being really usable, then protected is the way to go. The fact that the "protected member" was a class, and so was almost impossible to "break", helped.
But otherwise, protected should be avoided as much as possible (i.e.: Use private by default, and public when you must expose the method).
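A compressed sketch of the hierarchy described above might look like this. It is only an illustration under assumptions: the member names (m_items, m_index), the insert/find/smallestKey/reserve methods, and the ListPos alias are invented, and std::unordered_map is swapped in for the non-standard stdext::hash_map.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <map>
#include <unordered_map>

template <typename Value>
using ListPos = typename std::list<Value>::iterator;

// Base owns the storage but has no public accessors; it is only
// usable through a derived class.
template <typename Value>
class Base {
protected:
    Base() = default;
    ~Base() = default;         // not meant to be deleted through Base*
    std::list<Value> m_items;  // protected: derived classes index into it
};

// BaseMap adds the indirection: a map-like container from key to the
// element's position in the list.
template <typename Key, typename Value, typename TMap>
class BaseMap : public Base<Value> {
public:
    void insert(const Key& key, const Value& value) {
        auto found = m_index.find(key);
        if (found != m_index.end()) {
            *found->second = value;  // overwrite the existing element
            return;
        }
        this->m_items.push_back(value);
        m_index.emplace(key, std::prev(this->m_items.end()));
    }

    const Value* find(const Key& key) const {
        auto found = m_index.find(key);
        return found == m_index.end() ? nullptr : &*found->second;
    }

    std::size_t size() const { return this->m_items.size(); }

protected:
    BaseMap() = default;
    ~BaseMap() = default;
    TMap m_index;  // protected: Map and HashMap read and write it
};

template <typename Key, typename Value>
class Map : public BaseMap<Key, Value, std::map<Key, ListPos<Value>>> {
public:
    // Relies on the ordering of std::map, so it only exists here.
    const Key& smallestKey() const {
        assert(!this->m_index.empty());
        return this->m_index.begin()->first;
    }
};

template <typename Key, typename Value>
class HashMap
    : public BaseMap<Key, Value, std::unordered_map<Key, ListPos<Value>>> {
public:
    // Relies on the hash table interface, so it only exists here.
    void reserve(std::size_t n) { this->m_index.reserve(n); }
};
```

Only Map and HashMap can be instantiated by users; the list and the index stay out of reach of outside code while each layer works on them directly.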
The protected keyword is a conceptual error and language design botch, and several modern languages, such as Nim and Ceylon (see http://ceylon-lang.org/documentation/faq/language-design/#no_protected_modifier), that have been carefully designed rather than just copying common mistakes, don't have such a keyword.
It's not protected members that break encapsulation, it's exposing members that shouldn't be exposed that breaks encapsulation ... it doesn't matter whether they are protected or public. The problem with protected is that it is wrongheaded and misleading ... declaring members protected (rather than private) doesn't protect them, it does the opposite, exactly as public does. A protected member, being accessible outside the class, is exposed to the world and so its semantics must be maintained forever, just as is the case for public. The whole idea of "protected" is nonsense ... encapsulation is not security, and the keyword just furthers the confusion between the two. You can help a little by avoiding all uses of protected in your own classes -- if something is an internal part of the implementation, isn't part of the class's semantics, and may change in the future, then make it private or internal to your package, module, assembly, etc. If it is an unchangeable part of the class semantics, then make it public, and then you won't annoy users of your class who can see that there's a useful member in the documentation but can't use it, unless they are creating their own instances and can get at it by subclassing.
In general, no, you really don't want to use protected data members. This is doubly true if you're writing an API. Once someone inherits from your class, you can never really do maintenance without somehow breaking them in a weird and sometimes wild way.
I use them. In short, it's a good way, if you want to have some attributes shared. Granted, you could write set/get functions for them, but if there is no validation, then what's the point? It's also faster.
Consider this: you have a class which is your base class. It has quite a few attributes you want to use in the child objects. You could write a get/set function for each, or you can just set them.
My typical example is a file/stream handler. You want to access the handler (i.e. file descriptor), but you want to hide it from other classes. It's way easier than writing a set/get function for it.
I think protected attributes are a bad idea. I use CheckStyle to enforce that rule with my Java development teams.
In general, yes. A protected method is usually better.
In use, there is a level of simplicity given by using a protected final variable for an object that is shared by all the children of a class. I'd always advise against using it with primitives or collections since the contracts are impossible to define for those types.
Lately I've come to separate stuff you do with primitives and raw collections from stuff you do with well-formed classes. Primitives and collections should ALWAYS be private.
Also, I've started occasionally exposing public member variables when they are declared final and are well-formed classes that are not too flexible (again, not primitives or collections).
This isn't some stupid shortcut, I thought it out pretty seriously and decided there is absolutely no difference between a public final variable exposing an object and a getter.
It depends on what you want. If you want a fast class, then make the data protected and use protected and public methods, because I think you should assume that users who derive from your class know it quite well, or at least have read your manual for the function they are going to override.
If your users mess with your class it is not your problem. Every malicious user can add the following lines when overriding one of your virtuals:
(C#)
static Random rnd=new Random();
//...
if (rnd.Next()%1000==0) throw new Exception("My base class sucks! HAHAHAHA! xD");
//...
You can't seal every class to prevent this.
Of course, if you want a constraint on some of your fields, then use accessor functions or properties and make that field private, because there is no other solution...
But I personally don't like to stick to OOP principles at all costs, especially writing properties whose only purpose is to make data members private.
(C#):
private int _foo;  // type chosen only for illustration
public int Foo
{
    get { return _foo; }
    set { _foo = value; }
}
This was my personal opinion.
But do what your boss requires (if he wants private fields, then do that).
I use protected variables/attributes within base classes that I know I don't plan on changing into methods. That way, subclasses have full access to their inherited variables, and don't have the (artificially created) overhead of going through getters/setters to access them. An example is a class using an underlying I/O stream; there is little reason not to allow subclasses direct access to the underlying stream.
This is fine for member variables that are used in direct simple ways within the base class and all subclasses. But for a variable that has a more complicated use (e.g., accessing it causes side effects in other members within the class), a directly accessible variable is not appropriate. In this case, it can be made private and public/protected getters/setters can be provided instead. An example is an internal buffering mechanism provided by the base class, where accessing the buffers directly from a subclass would compromise the integrity of the algorithms used by the base class to manage them.
It's a design judgment call, based on how simple the member variable is and how likely it is to stay that simple in future versions.
Encapsulation is great, but it can be taken too far. I've seen classes whose own private methods accessed their member variables using only getter/setter methods. This is overkill, since if a class can't trust its own private methods with its own private data, who can it trust?
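The two cases above can be put side by side in a short sketch. Everything here is hypothetical (BufferedWriter, LineWriter, the capacity rule): the stream is a simple member exposed as protected, while the buffer carries an invariant and is reachable only through a protected method.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>  // std::ostringstream, handy for trying this out
#include <string>

class BufferedWriter {
public:
    explicit BufferedWriter(std::ostream& out, std::size_t capacity = 16)
        : out_(out), capacity_(capacity) {
        assert(capacity_ > 0);
    }
    virtual ~BufferedWriter() = default;

    // Writes any buffered text to the stream and empties the buffer.
    void flush() {
        out_ << buffer_;
        buffer_.clear();
    }

protected:
    // Simple member with no invariant attached: exposed directly.
    std::ostream& out_;

    // The buffer is reachable only through this method, so the rule
    // "never larger than capacity_" cannot be bypassed by a subclass.
    void append(const std::string& text) {
        for (char ch : text) {
            if (buffer_.size() == capacity_) {
                flush();
            }
            buffer_.push_back(ch);
        }
    }

private:
    std::string buffer_;
    std::size_t capacity_;
};

class LineWriter : public BufferedWriter {
public:
    using BufferedWriter::BufferedWriter;

    void writeLine(const std::string& line) {
        append(line);
        append("\n");
    }

    // Bypasses buffering; flushes first so the output stays in order.
    void writeUrgent(const std::string& text) {
        flush();
        out_ << text;
    }
};
```

A subclass can use out_ however it likes without endangering the base class, but it cannot leave the buffer in a state that flush() does not expect.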