Related
I have seen this mentioned a few times and I am not clear on what it means. When and why would you do this?
I know what interfaces do, but the fact I am not clear on this makes me think I am missing out on using them correctly.
Is it just so if you were to do:
IInterface classRef = new ObjectWhatever()
You could use any class that implements IInterface? When would you need to do that? The only thing I can think of is if you have a method and you are unsure of what object will be passed except for it implementing IInterface. I cannot think how often you would need to do that.
Also, how could you write a method that takes in an object that implements an interface? Is that possible?
There are some wonderful answers on here to this questions that get into all sorts of great detail about interfaces and loosely coupling code, inversion of control and so on. There are some fairly heady discussions, so I'd like to take the opportunity to break things down a bit for understanding why an interface is useful.
When I first started getting exposed to interfaces, I too was confused about their relevance. I didn't understand why you needed them. If we're using a language like Java or C#, we already have inheritance and I viewed interfaces as a weaker form of inheritance and thought, "why bother?" In a sense I was right, you can think of interfaces as sort of a weak form of inheritance, but beyond that I finally understood their use as a language construct by thinking of them as a means of classifying common traits or behaviors that were exhibited by potentially many non-related classes of objects.
For example -- say you have a SIM game and have the following classes:
class HouseFly inherits Insect {
void FlyAroundYourHead(){}
void LandOnThings(){}
}
class Telemarketer inherits Person {
void CallDuringDinner(){}
void ContinueTalkingWhenYouSayNo(){}
}
Clearly, these two objects have nothing in common in terms of direct inheritance. But, you could say they are both annoying.
Let's say our game needs to have some sort of random thing that annoys the game player when they eat dinner. This could be a HouseFly or a Telemarketer or both -- but how do you allow for both with a single function? And how do you ask each different type of object to "do their annoying thing" in the same way?
The key to realize is that both a Telemarketer and HouseFly share a common loosely interpreted behavior even though they are nothing alike in terms of modeling them. So, let's make an interface that both can implement:
interface IPest {
void BeAnnoying();
}
class HouseFly inherits Insect implements IPest {
void FlyAroundYourHead(){}
void LandOnThings(){}
void BeAnnoying() {
FlyAroundYourHead();
LandOnThings();
}
}
class Telemarketer inherits Person implements IPest {
void CallDuringDinner(){}
void ContinueTalkingWhenYouSayNo(){}
void BeAnnoying() {
CallDuringDinner();
ContinueTalkingWhenYouSayNo();
}
}
We now have two classes that can each be annoying in their own way. And they do not need to derive from the same base class and share common inherent characteristics -- they simply need to satisfy the contract of IPest -- that contract is simple. You just have to BeAnnoying. In this regard, we can model the following:
class DiningRoom {
DiningRoom(Person[] diningPeople, IPest[] pests) { ... }
void ServeDinner() {
when diningPeople are eating,
foreach pest in pests
pest.BeAnnoying();
}
}
Here we have a dining room that accepts a number of diners and a number of pests -- note the use of the interface. This means that in our little world, a member of the pests array could actually be a Telemarketer object or a HouseFly object.
The ServeDinner method is called when dinner is served and our people in the dining room are supposed to eat. In our little game, that's when our pests do their work -- each pest is instructed to be annoying by way of the IPest interface. In this way, we can easily have both Telemarketers and HouseFlys be annoying in each of their own ways -- we care only that we have something in the DiningRoom object that is a pest, we don't really care what it is and they could have nothing in common with other.
This very contrived pseudo-code example (that dragged on a lot longer than I anticipated) is simply meant to illustrate the kind of thing that finally turned the light on for me in terms of when we might use an interface. I apologize in advance for the silliness of the example, but hope that it helps in your understanding. And, to be sure, the other posted answers you've received here really cover the gamut of the use of interfaces today in design patterns and development methodologies.
The specific example I used to give to students is that they should write
List myList = new ArrayList(); // programming to the List interface
instead of
ArrayList myList = new ArrayList(); // this is bad
These look exactly the same in a short program, but if you go on to use myList 100 times in your program you can start to see a difference. The first declaration ensures that you only call methods on myList that are defined by the List interface (so no ArrayList specific methods). If you've programmed to the interface this way, later on you can decide that you really need
List myList = new TreeList();
and you only have to change your code in that one spot. You already know that the rest of your code doesn't do anything that will be broken by changing the implementation because you programmed to the interface.
The benefits are even more obvious (I think) when you're talking about method parameters and return values. Take this for example:
public ArrayList doSomething(HashMap map);
That method declaration ties you to two concrete implementations (ArrayList and HashMap). As soon as that method is called from other code, any changes to those types probably mean you're going to have to change the calling code as well. It would be better to program to the interfaces.
public List doSomething(Map map);
Now it doesn't matter what kind of List you return, or what kind of Map is passed in as a parameter. Changes that you make inside the doSomething method won't force you to change the calling code.
Programming to an interface is saying, "I need this functionality and I don't care where it comes from."
Consider (in Java), the List interface versus the ArrayList and LinkedList concrete classes. If all I care about is that I have a data structure containing multiple data items that I should access via iteration, I'd pick a List (and that's 99% of the time). If I know that I need constant-time insert/delete from either end of the list, I might pick the LinkedList concrete implementation (or more likely, use the Queue interface). If I know I need random access by index, I'd pick the ArrayList concrete class.
Programming to an interface has absolutely nothing to do with abstract interfaces like we see in Java or .NET. It isn't even an OOP concept.
What it means is don't go messing around with the internals of an object or data structure. Use the Abstract Program Interface, or API, to interact with your data. In Java or C# that means using public properties and methods instead of raw field access. For C that means using functions instead of raw pointers.
EDIT: And with databases it means using views and stored procedures instead of direct table access.
Using interfaces is a key factor in making your code easily testable in addition to removing unnecessary couplings between your classes. By creating an interface that defines the operations on your class, you allow classes that want to use that functionality the ability to use it without depending on your implementing class directly. If later on you decide to change and use a different implementation, you need only change the part of the code where the implementation is instantiated. The rest of the code need not change because it depends on the interface, not the implementing class.
This is very useful in creating unit tests. In the class under test you have it depend on the interface and inject an instance of the interface into the class (or a factory that allows it to build instances of the interface as needed) via the constructor or a property settor. The class uses the provided (or created) interface in its methods. When you go to write your tests, you can mock or fake the interface and provide an interface that responds with data configured in your unit test. You can do this because your class under test deals only with the interface, not your concrete implementation. Any class implementing the interface, including your mock or fake class, will do.
EDIT: Below is a link to an article where Erich Gamma discusses his quote, "Program to an interface, not an implementation."
http://www.artima.com/lejava/articles/designprinciples.html
You should look into Inversion of Control:
Martin Fowler: Inversion of Control Containers and the Dependency Injection pattern
Wikipedia: Inversion of Control
In such a scenario, you wouldn't write this:
IInterface classRef = new ObjectWhatever();
You would write something like this:
IInterface classRef = container.Resolve<IInterface>();
This would go into a rule-based setup in the container object, and construct the actual object for you, which could be ObjectWhatever. The important thing is that you could replace this rule with something that used another type of object altogether, and your code would still work.
If we leave IoC off the table, you can write code that knows that it can talk to an object that does something specific, but not which type of object or how it does it.
This would come in handy when passing parameters.
As for your parenthesized question "Also, how could you write a method that takes in an object that implements an Interface? Is that possible?", in C# you would simply use the interface type for the parameter type, like this:
public void DoSomethingToAnObject(IInterface whatever) { ... }
This plugs right into the "talk to an object that does something specific." The method defined above knows what to expect from the object, that it implements everything in IInterface, but it doesn't care which type of object it is, only that it adheres to the contract, which is what an interface is.
For instance, you're probably familiar with calculators and have probably used quite a few in your days, but most of the time they're all different. You, on the other hand, knows how a standard calculator should work, so you're able to use them all, even if you can't use the specific features that each calculator has that none of the other has.
This is the beauty of interfaces. You can write a piece of code, that knows that it will get objects passed to it that it can expect certain behavior from. It doesn't care one hoot what kind of object it is, only that it supports the behavior needed.
Let me give you a concrete example.
We have a custom-built translation system for windows forms. This system loops through controls on a form and translate text in each. The system knows how to handle basic controls, like the-type-of-control-that-has-a-Text-property, and similar basic stuff, but for anything basic, it falls short.
Now, since controls inherit from pre-defined classes that we have no control over, we could do one of three things:
Build support for our translation system to detect specifically which type of control it is working with, and translate the correct bits (maintenance nightmare)
Build support into base classes (impossible, since all the controls inherit from different pre-defined classes)
Add interface support
So we did nr. 3. All our controls implement ILocalizable, which is an interface that gives us one method, the ability to translate "itself" into a container of translation text/rules. As such, the form doesn't need to know which kind of control it has found, only that it implements the specific interface, and knows that there is a method where it can call to localize the control.
Code to the Interface Not the Implementation has NOTHING to do with Java, nor its Interface construct.
This concept was brought to prominence in the Patterns / Gang of Four books but was most probably around well before that. The concept certainly existed well before Java ever existed.
The Java Interface construct was created to aid in this idea (among other things), and people have become too focused on the construct as the centre of the meaning rather than the original intent. However, it is the reason we have public and private methods and attributes in Java, C++, C#, etc.
It means just interact with an object or system's public interface. Don't worry or even anticipate how it does what it does internally. Don't worry about how it is implemented. In object-oriented code, it is why we have public vs. private methods/attributes. We are intended to use the public methods because the private methods are there only for use internally, within the class. They make up the implementation of the class and can be changed as required without changing the public interface. Assume that regarding functionality, a method on a class will perform the same operation with the same expected result every time you call it with the same parameters. It allows the author to change how the class works, its implementation, without breaking how people interact with it.
And you can program to the interface, not the implementation without ever using an Interface construct. You can program to the interface not the implementation in C++, which does not have an Interface construct. You can integrate two massive enterprise systems much more robustly as long as they interact through public interfaces (contracts) rather than calling methods on objects internal to the systems. The interfaces are expected to always react the same expected way given the same input parameters; if implemented to the interface and not the implementation. The concept works in many places.
Shake the thought that Java Interfaces have anything what-so-ever to do with the concept of 'Program to the Interface, Not the Implementation'. They can help apply the concept, but they are not the concept.
It sounds like you understand how interfaces work but are unsure of when to use them and what advantages they offer. Here are a few examples of when an interface would make sense:
// if I want to add search capabilities to my application and support multiple search
// engines such as Google, Yahoo, Live, etc.
interface ISearchProvider
{
string Search(string keywords);
}
then I could create GoogleSearchProvider, YahooSearchProvider, LiveSearchProvider, etc.
// if I want to support multiple downloads using different protocols
// HTTP, HTTPS, FTP, FTPS, etc.
interface IUrlDownload
{
void Download(string url)
}
// how about an image loader for different kinds of images JPG, GIF, PNG, etc.
interface IImageLoader
{
Bitmap LoadImage(string filename)
}
then create JpegImageLoader, GifImageLoader, PngImageLoader, etc.
Most add-ins and plugin systems work off interfaces.
Another popular use is for the Repository pattern. Say I want to load a list of zip codes from different sources
interface IZipCodeRepository
{
IList<ZipCode> GetZipCodes(string state);
}
then I could create an XMLZipCodeRepository, SQLZipCodeRepository, CSVZipCodeRepository, etc. For my web applications, I often create XML repositories early on so I can get something up and running before the SQL Database is ready. Once the database is ready I write an SQLRepository to replace the XML version. The rest of my code remains unchanged since it runs solely off of interfaces.
Methods can accept interfaces such as:
PrintZipCodes(IZipCodeRepository zipCodeRepository, string state)
{
foreach (ZipCode zipCode in zipCodeRepository.GetZipCodes(state))
{
Console.WriteLine(zipCode.ToString());
}
}
It makes your code a lot more extensible and easier to maintain when you have sets of similar classes. I am a junior programmer, so I am no expert, but I just finished a project that required something similar.
I work on client side software that talks to a server running a medical device. We are developing a new version of this device that has some new components that the customer must configure at times. There are two types of new components, and they are different, but they are also very similar. Basically, I had to create two config forms, two lists classes, two of everything.
I decided that it would be best to create an abstract base class for each control type that would hold almost all of the real logic, and then derived types to take care of the differences between the two components. However, the base classes would not have been able to perform operations on these components if I had to worry about types all of the time (well, they could have, but there would have been an "if" statement or switch in every method).
I defined a simple interface for these components and all of the base classes talk to this interface. Now when I change something, it pretty much 'just works' everywhere and I have no code duplication.
A lot of explanation out there, but to make it even more simpler. Take for instance a List. One can implement a list with as:
An internal array
A linked list
Other implementations
By building to an interface, say a List. You only code as to definition of List or what List means in reality.
You could use any type of implementation internally say an array implementation. But suppose you wish to change the implementation for some reason say a bug or performance. Then you just have to change the declaration List<String> ls = new ArrayList<String>() to List<String> ls = new LinkedList<String>().
Nowhere else in code, will you have to change anything else; Because everything else was built on the definition of List.
If you program in Java, JDBC is a good example. JDBC defines a set of interfaces but says nothing about the implementation. Your applications can be written against this set of interfaces. In theory, you pick some JDBC driver and your application would just work. If you discover there's a faster or "better" or cheaper JDBC driver or for whatever reason, you can again in theory re-configure your property file, and without having to make any change in your application, your application would still work.
I am a late comer to this question, but I want to mention here that the line "Program to an interface, not an implementation" had some good discussion in the GoF (Gang of Four) Design Patterns book.
It stated, on p. 18:
Program to an interface, not an implementation
Don't declare variables to be instances of particular concrete classes. Instead, commit only to an interface defined by an abstract class. You will find this to be a common theme of the design patterns in this book.
and above that, it began with:
There are two benefits to manipulating objects solely in terms of the interface defined by abstract classes:
Clients remain unaware of the specific types of objects they use, as long as the objects adhere to the interface that clients expect.
Clients remain unaware of the classes that implement these objects. Clients only know about the abstract class(es) defining the interface.
So in other words, don't write it your classes so that it has a quack() method for ducks, and then a bark() method for dogs, because they are too specific for a particular implementation of a class (or subclass). Instead, write the method using names that are general enough to be used in the base class, such as giveSound() or move(), so that they can be used for ducks, dogs, or even cars, and then the client of your classes can just say .giveSound() rather than thinking about whether to use quack() or bark() or even determine the type before issuing the correct message to be sent to the object.
Programming to Interfaces is awesome, it promotes loose coupling. As #lassevk mentioned, Inversion of Control is a great use of this.
In addition, look into SOLID principals. here is a video series
It goes through a hard coded (strongly coupled example) then looks at interfaces, finally progressing to a IoC/DI tool (NInject)
To add to the existing posts, sometimes coding to interfaces helps on large projects when developers work on separate components simultaneously. All you need is to define interfaces upfront and write code to them while other developers write code to the interface you are implementing.
It can be advantageous to program to interfaces, even when we are not depending on abstractions.
Programming to interfaces forces us to use a contextually appropriate subset of an object. That helps because it:
prevents us from doing contextually inappropriate things, and
lets us safely change the implementation in the future.
For example, consider a Person class that implements the Friend and the Employee interface.
class Person implements AbstractEmployee, AbstractFriend {
}
In the context of the person's birthday, we program to the Friend interface, to prevent treating the person like an Employee.
function party() {
const friend: Friend = new Person("Kathryn");
friend.HaveFun();
}
In the context of the person's work, we program to the Employee interface, to prevent blurring workplace boundaries.
function workplace() {
const employee: Employee = new Person("Kathryn");
employee.DoWork();
}
Great. We have behaved appropriately in different contexts, and our software is working well.
Far into the future, if our business changes to work with dogs, we can change the software fairly easily. First, we create a Dog class that implements both Friend and Employee. Then, we safely change new Person() to new Dog(). Even if both functions have thousands of lines of code, that simple edit will work because we know the following are true:
Function party uses only the Friend subset of Person.
Function workplace uses only the Employee subset of Person.
Class Dog implements both the Friend and Employee interfaces.
On the other hand, if either party or workplace were to have programmed against Person, there would be a risk of both having Person-specific code. Changing from Person to Dog would require us to comb through the code to extirpate any Person-specific code that Dog does not support.
The moral: programming to interfaces helps our code to behave appropriately and to be ready for change. It also prepares our code to depend on abstractions, which brings even more advantages.
If I'm writing a new class Swimmer to add the functionality swim() and need to use an object of class say Dog, and this Dog class implements interface Animal which declares swim().
At the top of the hierarchy (Animal), it's very abstract while at the bottom (Dog) it's very concrete. The way I think about "programming to interfaces" is that, as I write Swimmer class, I want to write my code against the interface that's as far up that hierarchy which in this case is an Animal object. An interface is free from implementation details and thus makes your code loosely-coupled.
The implementation details can be changed with time, however, it would not affect the remaining code since all you are interacting with is with the interface and not the implementation. You don't care what the implementation is like... all you know is that there will be a class that would implement the interface.
It is also good for Unit Testing, you can inject your own classes (that meet the requirements of the interface) into a class that depends on it
Short story: A postman is asked to go home after home and receive the covers contains (letters, documents, cheques, gift cards, application, love letter) with the address written on it to deliver.
Suppose there is no cover and ask the postman to go home after home and receive all the things and deliver to other people, the postman can get confused.
So better wrap it with cover (in our story it is the interface) then he will do his job fine.
Now the postman's job is to receive and deliver the covers only (he wouldn't bothered what is inside in the cover).
Create a type of interface not actual type, but implement it with actual type.
To create to interface means your components get Fit into the rest of code easily
I give you an example.
you have the AirPlane interface as below.
interface Airplane{
parkPlane();
servicePlane();
}
Suppose you have methods in your Controller class of Planes like
parkPlane(Airplane plane)
and
servicePlane(Airplane plane)
implemented in your program. It will not BREAK your code.
I mean, it need not to change as long as it accepts arguments as AirPlane.
Because it will accept any Airplane despite actual type, flyer, highflyr, fighter, etc.
Also, in a collection:
List<Airplane> plane; // Will take all your planes.
The following example will clear your understanding.
You have a fighter plane that implements it, so
public class Fighter implements Airplane {
public void parkPlane(){
// Specific implementations for fighter plane to park
}
public void servicePlane(){
// Specific implementatoins for fighter plane to service.
}
}
The same thing for HighFlyer and other clasess:
public class HighFlyer implements Airplane {
public void parkPlane(){
// Specific implementations for HighFlyer plane to park
}
public void servicePlane(){
// specific implementatoins for HighFlyer plane to service.
}
}
Now think your controller classes using AirPlane several times,
Suppose your Controller class is ControlPlane like below,
public Class ControlPlane{
AirPlane plane;
// so much method with AirPlane reference are used here...
}
Here magic comes as you may make your new AirPlane type instances as many as you want and you are not changing the code of ControlPlane class.
You can add an instance...
JumboJetPlane // implementing AirPlane interface.
AirBus // implementing AirPlane interface.
You may remove instances of previously created types too.
So, just to get this right, the advantage of a interface is that I can separate the calling of a method from any particular class. Instead creating a instance of the interface, where the implementation is given from whichever class I choose that implements that interface. Thus allowing me to have many classes, which have similar but slightly different functionality and in some cases (the cases related to the intention of the interface) not care which object it is.
For example, I could have a movement interface. A method which makes something 'move' and any object (Person, Car, Cat) that implements the movement interface could be passed in and told to move. Without the method every knowing the type of class it is.
Imagine you have a product called 'Zebra' that can be extended by plugins. It finds the plugins by searching for DLLs in some directory. It loads all those DLLs and uses reflection to find any classes that implement IZebraPlugin, and then calls the methods of that interface to communicate with the plugins.
This makes it completely independent of any specific plugin class - it doesn't care what the classes are. It only cares that they fulfill the interface specification.
Interfaces are a way of defining points of extensibility like this. Code that talks to an interface is more loosely coupled - in fact it is not coupled at all to any other specific code. It can inter-operate with plugins written years later by people who have never met the original developer.
You could instead use a base class with virtual functions - all plugins would be derived from the base class. But this is much more limiting because a class can only have one base class, whereas it can implement any number of interfaces.
C++ explanation.
Think of an interface as your classes public methods.
You then could create a template that 'depends' on these public methods in order to carry out it's own function (it makes function calls defined in the classes public interface). Lets say this template is a container, like a Vector class, and the interface it depends on is a search algorithm.
Any algorithm class that defines the functions/interface Vector makes calls to will satisfy the 'contract' (as someone explained in the original reply). The algorithms don't even need to be of the same base class; the only requirement is that the functions/methods that the Vector depends on (interface) is defined in your algorithm.
The point of all of this is that you could supply any different search algorithm/class just as long as it supplied the interface that Vector depends on (bubble search, sequential search, quick search).
You might also want to design other containers (lists, queues) that would harness the same search algorithm as Vector by having them fulfill the interface/contract that your search algorithms depends on.
This saves time (OOP principle 'code reuse') as you are able to write an algorithm once instead of again and again and again specific to every new object you create without over-complicating the issue with an overgrown inheritance tree.
As for 'missing out' on how things operate; big-time (at least in C++), as this is how most of the Standard TEMPLATE Library's framework operates.
Of course when using inheritance and abstract classes the methodology of programming to an interface changes; but the principle is the same, your public functions/methods are your classes interface.
This is a huge topic and one of the the cornerstone principles of Design Patterns.
In Java these concrete classes all implement the CharSequence interface:
CharBuffer, String, StringBuffer, StringBuilder
These concrete classes do not have a common parent class other than Object, so there is nothing that relates them, other than the fact they each have something to do with arrays of characters, representing such, or manipulating such. For instance, the characters of String cannot be changed once a String object is instantiated, whereas the characters of StringBuffer or StringBuilder can be edited.
Yet each one of these classes is capable of suitably implementing the CharSequence interface methods:
char charAt(int index)
int length()
CharSequence subSequence(int start, int end)
String toString()
In some cases, Java class library classes that used to accept String have been revised to now accept the CharSequence interface. So if you have an instance of StringBuilder, instead of extracting a String object (which means instantiating a new object instance), it can instead just pass the StringBuilder itself as it implements the CharSequence interface.
The Appendable interface that some classes implement has much the same kind of benefit for any situation where characters can be appended to an instance of the underlying concrete class object instance. All of these concrete classes implement the Appendable interface:
BufferedWriter, CharArrayWriter, CharBuffer, FileWriter, FilterWriter, LogStream, OutputStreamWriter, PipedWriter, PrintStream, PrintWriter, StringBuffer, StringBuilder, StringWriter, Writer
Previous answers focus on programming to an abstraction for the sake of extensibility and loose coupling. While these are very important points,
readability is equally important. Readability allows others (and your future self) to understand the code with minimal effort. This is why readability leverages abstractions.
An abstraction is, by definition, simpler than its implementation. An abstraction omits detail in order to convey the essence or purpose of a thing, but nothing more.
Because abstractions are simpler, I can fit a lot more of them in my head at one time, compared to implementations.
As a programmer (in any language) I walk around with a general idea of a List in my head at all times. In particular, a List allows random access, duplicate elements, and maintains order. When I see a declaration like this: List myList = new ArrayList() I think, cool, this is a List that's being used in the (basic) way that I understand; and I don't have to think any more about it.
On the other hand, I do not carry around the specific implementation details of ArrayList in my head. So when I see, ArrayList myList = new ArrayList(). I think, uh-oh, this ArrayList must be used in a way that isn't covered by the List interface. Now I have to track down all the usages of this ArrayList to understand why, because otherwise I won't be able to fully understand this code. It gets even more confusing when I discover that 100% of the usages of this ArrayList do conform to the List interface. Then I'm left wondering... was there some code relying on ArrayList implementation details that got deleted? Was the programmer who instantiated it just incompetent? Is this application locked into that specific implementation in some way at runtime? A way that I don't understand?
I'm now confused and uncertain about this application, and all we're talking about is a simple List. What if this was a complex business object ignoring its interface? Then my knowledge of the business domain is insufficient to understand the purpose of the code.
So even when I need a List strictly within a private method (nothing that would break other applications if it changed, and I could easily find/replace every usage in my IDE) it still benefits readability to program to an abstraction. Because abstractions are simpler than implementation details. You could say that programming to abstractions is one way of adhering to the KISS principle.
An interface is like a contract, where you want your implementation class to implement methods written in the contract (interface). Since Java does not provide multiple inheritance, "programming to interface" is a good way to achieve multiple inheritance.
If you have a class A that is already extending some other class B, but you want that class A to also follow certain guidelines or implement a certain contract, then you can do so by the "programming to interface" strategy.
Q: - ... "Could you use any class that implements an interface?"
A: - Yes.
Q: - ... "When would you need to do that?"
A: - Each time you need a class(es) that implements interface(s).
Note: We couldn't instantiate an interface not implemented by a class - True.
Why?
Because the interface has only method prototypes, not definitions (just functions names, not their logic)
AnIntf anInst = new Aclass();
// we could do this only if Aclass implements AnIntf.
// anInst will have Aclass reference.
Note: Now we could understand what happened if Bclass and Cclass implemented same Dintf.
Dintf bInst = new Bclass();
// now we could call all Dintf functions implemented (defined) in Bclass.
Dintf cInst = new Cclass();
// now we could call all Dintf functions implemented (defined) in Cclass.
What we have: Same interface prototypes (functions names in interface), and call different implementations.
Bibliography:
Prototypes - wikipedia
program to an interface is a term from the GOF book. i would not directly say it has to do with java interface but rather real interfaces. to achieve clean layer separation, you need to create some separation between systems for example: Let's say you had a concrete database you want to use, you would never "program to the database" , instead you would "program to the storage interface". Likewise you would never "program to a Web Service" but rather you would program to a "client interface". this is so you can easily swap things out.
i find these rules help me:
1. we use a java interface when we have multiple types of an object. if i just have single object, i dont see the point. if there are at least two concrete implementations of some idea, then i would use a java interface.
2. if as i stated above, you want to bring decoupling from an external system (storage system) to your own system (local DB) then also use a interface.
notice how there are two ways to consider when to use them.
Coding to an interface is a philosophy, rather than specific language constructs or design patterns - it instructs you what is the correct order of steps to follow in order to create better software systems (e.g. more resilient, more testable, more scalable, more extendible, and other nice traits).
What it actually means is:
===
Before jumping to implementations and coding (the HOW) - think of the WHAT:
What black boxes should make up your system,
What is each box' responsibility,
What are the ways each "client" (that is, one of those other boxes, 3rd party "boxes", or even humans) should communicate with it (the API of each box).
After you figure the above, go ahead and implement those boxes (the HOW).
Thinking first of what a box' is and what its API, leads the developer to distil the box' responsibility, and to mark for himself and future developers the difference between what is its exposed details ("API") and it's hidden details ("implementation details"), which is a very important differentiation to have.
One immediate and easily noticeable gain is the team can then change and improve implementations without affecting the general architecture. It also makes the system MUCH more testable (it goes well with the TDD approach).
===
Beyond the traits I've mentioned above, you also save A LOT OF TIME going this direction.
Micro Services and DDD, when done right, are great examples of "Coding to an interface", however the concept wins in every pattern from monoliths to "serverless", from BE to FE, from OOP to functional, etc....
I strongly recommend this approach for Software Engineering (and I basically believe it makes total sense in other fields as well).
Program to an interface allows to change implementation of contract defined by interface seamlessly. It allows loose coupling between contract and specific implementations.
IInterface classRef = new ObjectWhatever()
You could use any class that implements IInterface? When would you need to do that?
Have a look at this SE question for good example.
Why should the interface for a Java class be preferred?
does using an Interface hit performance?
if so how much?
Yes. It will have slight performance overhead in sub-seconds. But if your application has requirement to change the implementation of interface dynamically, don't worry about performance impact.
how can you avoid it without having to maintain two bits of code?
Don't try to avoid multiple implementations of interface if your application need them. In absence of tight coupling of interface with one specific implementation, you may have to deploy the patch to change one implementation to other implementation.
One good use case: Implementation of Strategy pattern:
Real World Example of the Strategy Pattern
"Program to interface" means don't provide hard code right the way, meaning your code should be extended without breaking the previous functionality. Just extensions, not editing the previous code.
Also I see a lot of good and explanatory answers here, so I want to give my point of view here, including some extra information what I noticed when using this method.
Unit testing
For the last two years, I have written a hobby project and I did not write unit tests for it. After writing about 50K lines I found out it would be really necessary to write unit tests.
I did not use interfaces (or very sparingly) ... and when I made my first unit test, I found out it was complicated. Why?
Because I had to make a lot of class instances, used for input as class variables and/or parameters. So the tests look more like integration tests (having to make a complete 'framework' of classes since all was tied together).
Fear of interfaces
So I decided to use interfaces. My fear was that I had to implement all functionality everywhere (in all used classes) multiple times. In some way this is true, however, by using inheritance it can be reduced a lot.
Combination of interfaces and inheritance
I found out the combination is very good to be used. I give a very simple example.
public interface IPricable
{
int Price { get; }
}
public interface ICar : IPricable
public abstract class Article
{
public int Price { get { return ... } }
}
public class Car : Article, ICar
{
// Price does not need to be defined here
}
This way copying code is not necessary, while still having the benefit of using a car as interface (ICar).
I have some questions about the affects of using concrete classes and interfaces.
Say some chunk of code (call it chunkCode) uses concrete class A. Would I have to re-compile chunkCode if:
I add some new public methods to A? If so, isn't that a bit stange? After all I still provide the interface chunkCode relies on. (Or do I have to re-compile because chunkCode may never know otherwise that this is true and I haven't omitted some API)
I add some new private methods to A?
I add a new public field to A?
I add a new private field to A?
Factory Design Pattern:
The main code doesn't care what the concrete type of the object is. It relies only on the API. But what would you do if there are few methods which are relevant to only one concrete type? This type implements the interface but adds some more public methods? Would you use some if (A is type1) statements (or the like) the main code?
Thanks for any clarification
1) Compiling is not an activity in OO. It is a detail of specific OO implementations. If you want an answer for a specific implementation (e.g. Java), then you need to clarify.
In general, some would say that adding to an interface is not considered a breaking change, wheras others say you cannot change an interface once it is published, and you have to create a new interface.
Edit: You specified C#, so check out this question regarding breaking changes in .Net. I don't want to do that answer a disservice, so I won't try to replicate it here.
2) People often hack their designs to do this, but it is a sign that you have a poor design.
Good alternatives:
Create a method in your interface that allows you to invoke the custom behavior, but not be required to know what that behavior is.
Create an additional interface (and a new factory) that supports the new methods. The new interface does not have to inherit the old interface, but it can if it makes sense (if an is-a relationship can be expressed between the interfaces).
If your language supports it, use the Abstract Factory pattern, and take advantage of Covariant Return Types in the concrete factory. If you need a specific derived type, accept a concrete factory instead of an abstract one.
Bad alternatives (anti-patterns):
Adding a method to the interface that does nothing in other derived classed.
Throwing an exception in a method that doesn't make sense for your derived class.
Adding query methods to the interface that tell the user if they can call a certain method.
Unless the method name is generic enough that the user wouldn't expect it to do anything (e.g. DoExtraProcessing), then adding a method that is no-op in most derived classes breaks the contract defined by that interface.
E.g.: Someone invoking bird.Fly() would expect it to actually do something. We know that chickens can't fly. So either a Chicken isn't a Bird, or Birds don't Fly.
Adding query methods is a poor work-around for this. E.g. Adding a boolean CanFly() method or property in your interface. So is throwing an exception. Neither of them get around the fact that the type simply isn't substitutable. Check out the Liskov Substitution Principle (LSP).
For your first question the answer is NO for all your points. If it would be that way then backward compatibility would not make any sense. You have to recompile chunkCode only if you brake the API, that is remove some functionality that chunkCode is using, changing calling conventions, modifying number of parameters, these sort of things == breaking changes.
For the second I usually, but only if I really have to, use dynamic_cast in those situations.
Note my answer is valid in the context of C++;I just saw the question is language agnostic(kind of tired at this hour; I'll remove the answer if it offenses anybody).
Question 1: Depends on what language you are talking about. Its always safer to recompile both languages though. Mostly because chuckCode does not know what actually exists inside A. Recompiling refreshes its memory. But it should work in Java without recompiling.
Question 2: No. The entire point of writing a Factory is to get rid of if(A is type1). These if statements are terrible from maintenance perspective.
Factory is designed to build objects of similar type. If you are having a situation where you are using this statement then that object is either not a similar type to rest of the classes. If you are sure it is of similar type and have similar interfaces. I would write an extra function in all the concrete base classes and implement it only on this one.
Ideally All these concrete classes should have a common abstract base class or a Interface to define what the API is. Nothing other than what is designed in this Interface should be expected to be called anywhere in the code unless you are writing functions that takes this specific class.
I've always been taught that if you are doing something to an object, that should be an external thing, so one would Save(Class) rather than having the object save itself: Class.Save().
I've noticed that in the .Net libraries, it is common to have a class modify itself as with String.Format() or sort itself as with List.Sort().
My question is, in strict OOP is it appropriate to have a class which performs functions on itself when called to do so, or should such functions be external and called on an object of the class' type?
Great question. I have just recently reflected on a very similar issue and was eventually going to ask much the same thing here on SO.
In OOP textbooks, you sometimes see examples such as Dog.Bark(), or Person.SayHello(). I have come to the conclusion that those are bad examples. When you call those methods, you make a dog bark, or a person say hello. However, in the real world, you couldn't do this; a dog decides himself when it's going to bark. A person decides itself when it will say hello to someone. Therefore, these methods would more appropriately be modelled as events (where supported by the programming language).
You would e.g. have a function Attack(Dog), PlayWith(Dog), or Greet(Person) which would trigger the appropriate events.
Attack(dog) // triggers the Dog.Bark event
Greet(johnDoe) // triggers the Person.SaysHello event
As soon as you have more than one parameter, it won't be so easy deciding how to best write the code. Let's say I want to store a new item, say an integer, into a collection. There's many ways to formulate this; for example:
StoreInto(1, collection) // the "classic" procedural approach
1.StoreInto(collection) // possible in .NET with extension methods
Store(1).Into(collection) // possible by using state-keeping temporary objects
According to the thinking laid out above, the last variant would be the preferred one, because it doesn't force an object (the 1) to do something to itself. However, if you follow that programming style, it will soon become clear that this fluent interface-like code is quite verbose, and while it's easy to read, it can be tiring to write or even hard to remember the exact syntax.
P.S.: Concerning global functions: In the case of .NET (which you mentioned in your question), you don't have much choice, since the .NET languages do not provide for global functions. I think these would be technically possible with the CLI, but the languages disallow that feature. F# has global functions, but they can only be used from C# or VB.NET when they are packed into a module. I believe Java also doesn't have global functions.
I have come across scenarios where this lack is a pity (e.g. with fluent interface implementations). But generally, we're probably better off without global functions, as some developers might always fall back into old habits, and leave a procedural codebase for an OOP developer to maintain. Yikes.
Btw., in VB.NET, however, you can mimick global functions by using modules. Example:
Globals.vb:
Module Globals
Public Sub Save(ByVal obj As SomeClass)
...
End Sub
End Module
Demo.vb:
Imports Globals
...
Dim obj As SomeClass = ...
Save(obj)
I guess the answer is "It Depends"... for Persistence of an object I would side with having that behavior defined within a separate repository object. So with your Save() example I might have this:
repository.Save(class)
However with an Airplane object you may want the class to know how to fly with a method like so:
airplane.Fly()
This is one of the examples I've seen from Fowler about an aenemic data model. I don't think in this case you would want to have a separate service like this:
new airplaneService().Fly(airplane)
With static methods and extension methods it makes a ton of sense like in your List.Sort() example. So it depends on your usage pattens. You wouldn't want to have to new up an instance of a ListSorter class just to be able to sort a list like this:
new listSorter().Sort(list)
In strict OOP (Smalltalk or Ruby), all methods belong to an instance object or a class object. In "real" OOP (like C++ or C#), you will have static methods that essentially stand completely on their own.
Going back to strict OOP, I'm more familiar with Ruby, and Ruby has several "pairs" of methods that either return a modified copy or return the object in place -- a method ending with a ! indicates that the message modifies its receiver. For instance:
>> s = 'hello'
=> "hello"
>> s.reverse
=> "olleh"
>> s
=> "hello"
>> s.reverse!
=> "olleh"
>> s
=> "olleh"
The key is to find some middle ground between pure OOP and pure procedural that works for what you need to do. A Class should do only one thing (and do it well). Most of the time, that won't include saving itself to disk, but that doesn't mean Class shouldn't know how to serialize itself to a stream, for instance.
I'm not sure what distinction you seem to be drawing when you say "doing something to an object". In many if not most cases, the class itself is the best place to define its operations, as under "strict OOP" it is the only code that has access to internal state on which those operations depend (information hiding, encapsulation, ...).
That said, if you have an operation which applies to several otherwise unrelated types, then it might make sense for each type to expose an interface which lets the operation do most of the work in a more or less standard way. To tie it in to your example, several classes might implement an interface ISaveable which exposes a Save method on each. Individual Save methods take advantage of their access to internal class state, but given a collection of ISaveable instances, some external code could define an operation for saving them to a custom store of some kind without having to know the messy details.
It depends on what information is needed to do the work. If the work is unrelated to the class (mostly equivalently, can be made to work on virtually any class with a common interface), for example, std::sort, then make it a free function. If it must know the internals, make it a member function.
Edit: Another important consideration is performance. In-place sorting, for example, can be miles faster than returning a new, sorted, copy. This is why quicksort is faster than merge sort in the vast majority of cases, even though merge sort is theoretically faster, which is because quicksort can be performed in-place, whereas I've never heard of an in-place merge-sort. Just because it's technically possible to perform an operation within the class's public interface, doesn't mean that you actually should.
A friend of mine goes back and forth on what "interface" means in programming.
What is the best description of an "interface"?
To me, an interface is a blueprint of a class. Is this the best definition?
An interface is one of the more overloaded and confusing terms in development.
It is actually a concept of abstraction and encapsulation. For a given "box", it declares the "inputs" and "outputs" of that box. In the world of software, that usually means the operations that can be invoked on the box (along with arguments) and in some cases the return types of these operations.
What it does not do is define what the semantics of these operations are, although it is commonplace (and very good practice) to document them in proximity to the declaration (e.g., via comments), or to pick good naming conventions. Nevertheless, there are no guarantees that these intentions would be followed.
Here is an analogy: Take a look at your television when it is off. Its interface are the buttons it has, the various plugs, and the screen. Its semantics and behavior are that it takes inputs (e.g., cable programming) and has outputs (display on the screen, sound, etc.). However, when you look at a TV that is not plugged in, you are projecting your expected semantics into an interface. For all you know, the TV could just explode when you plug it in. However, based on its "interface" you can assume that it won't make any coffee since it doesn't have a water intake.
In object oriented programming, an interface generally defines the set of methods (or messages) that an instance of a class that has that interface could respond to.
What adds to the confusion is that in some languages, like Java, there is an actual interface with its language specific semantics. In Java, for example, it is a set of method declarations, with no implementation, but an interface also corresponds to a type and obeys various typing rules.
In other languages, like C++, you do not have interfaces. A class itself defines methods, but you could think of the interface of the class as the declarations of the non-private methods. Because of how C++ compiles, you get header files where you could have the "interface" of the class without actual implementation. You could also mimic Java interfaces with abstract classes with pure virtual functions, etc.
An interface is most certainly not a blueprint for a class. A blueprint, by one definition is a "detailed plan of action". An interface promises nothing about an action! The source of the confusion is that in most languages, if you have an interface type that defines a set of methods, the class that implements it "repeats" the same methods (but provides definition), so the interface looks like a skeleton or an outline of the class.
Consider the following situation:
You are in the middle of a large, empty room, when a zombie suddenly attacks you.
You have no weapon.
Luckily, a fellow living human is standing in the doorway of the room.
"Quick!" you shout at him. "Throw me something I can hit the zombie with!"
Now consider:
You didn't specify (nor do you care) exactly what your friend will choose to toss;
...But it doesn't matter, as long as:
It's something that can be tossed (He can't toss you the sofa)
It's something that you can grab hold of (Let's hope he didn't toss a shuriken)
It's something you can use to bash the zombie's brains out (That rules out pillows and such)
It doesn't matter whether you get a baseball bat or a hammer -
as long as it implements your three conditions, you're good.
To sum it up:
When you write an interface, you're basically saying: "I need something that..."
Interface is a contract you should comply to or given to, depending if you are implementer or a user.
I don't think "blueprint" is a good word to use. A blueprint tells you how to build something. An interface specifically avoids telling you how to build something.
An interface defines how you can interact with a class, i.e. what methods it supports.
In Programming, an interface defines what the behavior a an object will have, but it will not actually specify the behavior. It is a contract, that will guarantee, that a certain class can do something.
Consider this piece of C# code here:
using System;
public interface IGenerate
{
int Generate();
}
// Dependencies
public class KnownNumber : IGenerate
{
public int Generate()
{
return 5;
}
}
public class SecretNumber : IGenerate
{
public int Generate()
{
return new Random().Next(0, 10);
}
}
// What you care about
class Game
{
public Game(IGenerate generator)
{
Console.WriteLine(generator.Generate())
}
}
new Game(new SecretNumber());
new Game(new KnownNumber());
The Game class requires a secret number. For the sake of testing it, you would like to inject what will be used as a secret number (this principle is called Inversion of Control).
The game class wants to be "open minded" about what will actually create the random number, therefore it will ask in its constructor for "anything, that has a Generate method".
First, the interface specifies, what operations an object will provide. It just contains what it looks like, but no actual implementation is given. This is just the signature of the method. Conventionally, in C# interfaces are prefixed with an I.
The classes now implement the IGenerate Interface. This means that the compiler will make sure, that they both have a method, that returns an int and is called Generate.
The game now is being called two different object, each of which implementant the correct interface. Other classes would produce an error upon building the code.
Here I noticed the blueprint analogy you used:
A class is commonly seen as a blueprint for an object. An Interface specifies something that a class will need to do, so one could argue that it indeed is just a blueprint for a class, but since a class does not necessarily need an interface, I would argue that this metaphor is breaking. Think of an interface as a contract. The class that "signs it" will be legally required (enforced by the compiler police), to comply to the terms and conditions in the contract. This means that it will have to do, what is specified in the interface.
This is all due to the statically typed nature of some OO languages, as it is the case with Java or C#. In Python on the other hand, another mechanism is used:
import random
# Dependencies
class KnownNumber(object):
def generate(self):
return 5
class SecretNumber(object):
def generate(self):
return random.randint(0,10)
# What you care about
class SecretGame(object):
def __init__(self, number_generator):
number = number_generator.generate()
print number
Here, none of the classes implement an interface. Python does not care about that, because the SecretGame class will just try to call whatever object is passed in. If the object HAS a generate() method, everything is fine. If it doesn't: KAPUTT!
This mistake will not be seen at compile time, but at runtime, so possibly when your program is already deployed and running. C# would notify you way before you came close to that.
The reason this mechanism is used, naively stated, because in OO languages naturally functions aren't first class citizens. As you can see, KnownNumber and SecretNumber contain JUST the functions to generate a number. One does not really need the classes at all. In Python, therefore, one could just throw them away and pick the functions on their own:
# OO Approach
SecretGame(SecretNumber())
SecretGame(KnownNumber())
# Functional Approach
# Dependencies
class SecretGame(object):
def __init__(self, generate):
number = generate()
print number
SecretGame(lambda: random.randint(0,10))
SecretGame(lambda: 5)
A lambda is just a function, that was declared "in line, as you go".
A delegate is just the same in C#:
class Game
{
public Game(Func<int> generate)
{
Console.WriteLine(generate())
}
}
new Game(() => 5);
new Game(() => new Random().Next(0, 10));
Side note: The latter examples were not possible like this up to Java 7. There, Interfaces were your only way of specifying this behavior. However, Java 8 introduced lambda expressions so the C# example can be converted to Java very easily (Func<int> becomes java.util.function.IntSupplier and => becomes ->).
To me an interface is a blueprint of a class, is this the best definition?
No. A blueprint typically includes the internals. But a interface is purely about what is visible on the outside of a class ... or more accurately, a family of classes that implement the interface.
The interface consists of the signatures of methods and values of constants, and also a (typically informal) "behavioral contract" between classes that implement the interface and others that use it.
Technically, I would describe an interface as a set of ways (methods, properties, accessors... the vocabulary depends on the language you are using) to interact with an object. If an object supports/implements an interface, then you can use all of the ways specified in the interface to interact with this object.
Semantically, an interface could also contain conventions about what you may or may not do (e.g., the order in which you may call the methods) and about what, in return, you may assume about the state of the object given how you interacted so far.
Personally I see an interface like a template. If a interface contains the definition for the methods foo() and bar(), then you know every class which uses this interface has the methods foo() and bar().
Let us consider a Man(User or an Object) wants some work to be done. He will contact a middle man(Interface) who will be having a contract with the companies(real world objects created using implemented classes). Few types of works will be defined by him which companies will implement and give him results.
Each and every company will implement the work in its own way but the result will be same. Like this User will get its work done using an single interface.
I think Interface will act as visible part of the systems with few commands which will be defined internally by the implementing inner sub systems.
An interface separates out operations on a class from the implementation within. Thus, some implementations may provide for many interfaces.
People would usually describe it as a "contract" for what must be available in the methods of the class.
It is absolutely not a blueprint, since that would also determine implementation. A full class definition could be said to be a blueprint.
An interface defines what a class that inherits from it must implement. In this way, multiple classes can inherit from an interface, and because of that inherticance, you can
be sure that all members of the interface are implemented in the derived class (even if its just to throw an exception)
Abstract away the class itself from the caller (cast an instance of a class to the interface, and interact with it without needing to know what the actual derived class IS)
for more info, see this http://msdn.microsoft.com/en-us/library/ms173156.aspx
In my opinion, interface has a broader meaning than the one commonly associated with it in Java. I would define "interface" as a set of available operations with some common functionality, that allow controlling/monitoring a module.
In this definition I try to cover both programatic interfaces, where the client is some module, and human interfaces (GUI for example).
As others already said, an interface always has some contract behind it, in terms of inputs and outputs. The interface does not promise anything about the "how" of the operations; it only guarantees some properties of the outcome, given the current state, the selected operation and its parameters.
As above, synonyms of "contract" and "protocol" are appropriate.
The interface comprises the methods and properties you can expect to be exposed by a class.
So if a class Cheetos Bag implements the Chip Bag interface, you should expect a Cheetos Bag to behave exactly like any other Chip Bag. (That is, expose the .attemptToOpenWithoutSpillingEverywhere() method, etc.)
A boundary across which two systems communicate.
Interfaces are how some OO languages achieve ad hoc polymorphism. Ad hoc polymorphism is simply functions with the same names operating on different types.
Conventional Definition - An interface is a contract that specifies the methods which needs to be implemented by the class implementing it.
The Definition of Interface has changed over time. Do you think Interface just have method declarations only ? What about static final variables and what about default definitions after Java 5.
Interfaces were introduced to Java because of the Diamond problem with multiple Inheritance and that's what they actually intend to do.
Interfaces are the constructs that were created to get away with the multiple inheritance problem and can have abstract methods , default definitions and static final variables.
http://www.quora.com/Why-does-Java-allow-static-final-variables-in-interfaces-when-they-are-only-intended-to-be-contracts
In short, The basic problem an interface is trying to solve is to separate how we use something from how it is implemented. But you should consider interface is not a contract. Read more here.
I seldom use inheritance, but when I do, I never use protected attributes because I think it breaks the encapsulation of the inherited classes.
Do you use protected attributes ? what do you use them for ?
In this interview on Design by Bill Venners, Joshua Bloch, the author of Effective Java says:
Trusting Subclasses
Bill Venners: Should I trust subclasses more intimately than
non-subclasses? For example, do I make
it easier for a subclass
implementation to break me than I
would for a non-subclass? In
particular, how do you feel about
protected data?
Josh Bloch: To write something that is both subclassable and robust
against a malicious subclass is
actually a pretty tough thing to do,
assuming you give the subclass access
to your internal data structures. If
the subclass does not have access to
anything that an ordinary user
doesn't, then it's harder for the
subclass to do damage. But unless you
make all your methods final, the
subclass can still break your
contracts by just doing the wrong
things in response to method
invocation. That's precisely why the
security critical classes like String
are final. Otherwise someone could
write a subclass that makes Strings
appear mutable, which would be
sufficient to break security. So you
must trust your subclasses. If you
don't trust them, then you can't allow
them, because subclasses can so easily
cause a class to violate its
contracts.
As far as protected data in general,
it's a necessary evil. It should be
kept to a minimum. Most protected data
and protected methods amount to
committing to an implementation
detail. A protected field is an
implementation detail that you are
making visible to subclasses. Even a
protected method is a piece of
internal structure that you are making
visible to subclasses.
The reason you make it visible is that
it's often necessary in order to allow
subclasses to do their job, or to do
it efficiently. But once you've done
it, you're committed to it. It is now
something that you are not allowed to
change, even if you later find a more
efficient implementation that no
longer involves the use of a
particular field or method.
So all other things being equal, you
shouldn't have any protected members
at all. But that said, if you have too
few, then your class may not be usable
as a super class, or at least not as
an efficient super class. Often you
find out after the fact. My philosophy
is to have as few protected members as
possible when you first write the
class. Then try to subclass it. You
may find out that without a particular
protected method, all subclasses will
have to do some bad thing.
As an example, if you look at
AbstractList, you'll find that there
is a protected method to delete a
range of the list in one shot
(removeRange). Why is that in there?
Because the normal idiom to remove a
range, based on the public API, is to
call subList to get a sub-List,
and then call clear on that
sub-List. Without this particular
protected method, however, the only
thing that clear could do is
repeatedly remove individual elements.
Think about it. If you have an array
representation, what will it do? It
will repeatedly collapse the array,
doing order N work N times. So it will
take a quadratic amount of work,
instead of the linear amount of work
that it should. By providing this
protected method, we allow any
implementation that can efficiently
delete an entire range to do so. And
any reasonable List implementation
can delete a range more efficiently
all at once.
That we would need this protected
method is something you would have to
be way smarter than me to know up
front. Basically, I implemented the
thing. Then, as we started to subclass
it, we realized that range delete was
quadratic. We couldn't afford that, so
I put in the protected method. I think
that's the best approach with
protected methods. Put in as few as
possible, and then add more as needed.
Protected methods represent
commitments to designs that you may
want to change. You can always add
protected methods, but you can't take
them out.
Bill Venners: And protected data?
Josh Bloch: The same thing, but even more. Protected data is even more
dangerous in terms of messing up your
data invariants. If you give someone
else access to some internal data,
they have free reign over it.
Short version: it breaks encapsulation but it's a necessary evil that should be kept to a minimum.
C#:
I use protected for abstract or virtual methods that I want base classes to override. I also make a method protected if it may be called by base classes, but I don't want it called outside the class hierarchy.
You may need them for static (or 'global') attribute you want your subclasses or classes from same package (if it is about java) to benefit from.
Those static final attributes representing some kind of 'constant value' have seldom a getter function, so a protected static final attribute might make sense in that case.
Scott Meyers says don't use protected attributes in Effective C++ (3rd ed.):
Item 22: Declare data members private.
The reason is the same you give: it breaks encapsulations. The consequence is that otherwise local changes to the layout of the class might break dependent types and result in changes in many other places.
I don't use protected attributes in Java because they are only package protected there. But in C++, I'll use them in abstract classes, allowing the inheriting class to inherit them directly.
There are never any good reasons to have protected attributes. A base class must be able to depend on state, which means restricting access to data through accessor methods. You can't give anyone access to your private data, even children.
I recently worked on a project were the "protected" member was a very good idea. The class hiearchy was something like:
[+] Base
|
+--[+] BaseMap
| |
| +--[+] Map
| |
| +--[+] HashMap
|
+--[+] // something else ?
The Base implemented a std::list but nothing else. The direct access to the list was forbidden to the user, but as the Base class was incomplete, it relied anyway on derived classes to implement the indirection to the list.
The indirection could come from at least two flavors: std::map and stdext::hash_map. Both maps will behave the same way but for the fact the hash_map needs the Key to be hashable (in VC2003, castable to size_t).
So BaseMap implemented a TMap as a templated type that was a map-like container.
Map and HashMap were two derived classes of BaseMap, one specializing BaseMap on std::map, and the other on stdext::hash_map.
So:
Base was not usable as such (no public accessors !) and only provided common features and code
BaseMap needed easy read/write to a std::list
Map and HashMap needed easy read/write access to the TMap defined in BaseMap.
For me, the only solution was to use protected for the std::list and the TMap member variables. There was no way I would put those "private" because I would anyway expose all or almost all of their features through read/write accessors anyway.
In the end, I guess that if you en up dividing your class into multiple objects, each derivation adding needed features to its mother class, and only the most derived class being really usable, then protected is the way to go. The fact the "protected member" was a class, and so, was almost impossible to "break", helped.
But otherwise, protected should be avoided as much as possible (i.e.: Use private by default, and public when you must expose the method).
The protected keyword is a conceptual error and language design botch, and several modern languages, such as Nim and Ceylon (see http://ceylon-lang.org/documentation/faq/language-design/#no_protected_modifier), that have been carefully designed rather than just copying common mistakes, don't have such a keyword.
It's not protected members that breaks encapsulation, it's exposing members that shouldn't be exposed that breaks encapsulation ... it doesn't matter whether they are protected or public. The problem with protected is that it is wrongheaded and misleading ... declaring members protected (rather than private) doesn't protect them, it does the opposite, exactly as public does. A protected member, being accessible outside the class, is exposed to the world and so its semantics must be maintained forever, just as is the case for public. The whole idea of "protected" is nonsense ... encapsulation is not security, and the keyword just furthers the confusion between the two. You can help a little by avoiding all uses of protected in your own classes -- if something is an internal part of the implementation, isn't part of the class's semantics, and may change in the future, then make it private or internal to your package, module, assembly, etc. If it is an unchangeable part of the class semantics, then make it public, and then you won't annoy users of your class who can see that there's a useful member in the documentation but can't use it, unless they are creating their own instances and can get at it by subclassing.
In general, no you really don't want to use protected data members. This is doubly true if your writing an API. Once someone inherits from your class you can never really do maintenance and not somehow break them in a weird and sometimes wild way.
I use them. In short, it's a good way, if you want to have some attributes shared. Granted, you could write set/get functions for them, but if there is no validation, then what's the point? It's also faster.
Consider this: you have a class which is your base class. It has quite a few attributes you wan't to use in the child objects. You could write a get/set function for each, or you can just set them.
My typical example is a file/stream handler. You want to access the handler (i.e. file descriptor), but you want to hide it from other classes. It's way easier than writing a set/get function for it.
I think protected attributes are a bad idea. I use CheckStyle to enforce that rule with my Java development teams.
In general, yes. A protected method is usually better.
In use, there is a level of simplicity given by using a protected final variable for an object that is shared by all the children of a class. I'd always advise against using it with primitives or collections since the contracts are impossible to define for those types.
Lately I've come to separate stuff you do with primitives and raw collections from stuff you do with well-formed classes. Primitives and collections should ALWAYS be private.
Also, I've started occasionally exposing public member variables when they are declaired final and are well-formed classes that are not too flexible (again, not primitives or collections).
This isn't some stupid shortcut, I thought it out pretty seriously and decided there is absolutely no difference between a public final variable exposing an object and a getter.
It depends on what you want. If you want a fast class then data should be protected and use protected and public methods.
Because I think you should assume that your users who derive from your class know your class quite well or at least they have read your manual at the function they going to override.
If your users mess with your class it is not your problem. Every malicious user can add the following lines when overriding one of your virtuals:
(C#)
static Random rnd=new Random();
//...
if (rnd.Next()%1000==0) throw new Exception("My base class sucks! HAHAHAHA! xD");
//...
You can't seal every class to prevent this.
Of course if you want a constraint on some of your fields then use accessor functions or properties or something you want and make that field private because there is no other solution...
But I personally don't like to stick to the oop principles at all costs. Especially making properties with the only purpose to make data members private.
(C#):
private _foo;
public foo
{
get {return _foo;}
set {_foo=value;}
}
This was my personal opinion.
But do what your boss require (if he wants private fields than do that.)
I use protected variables/attributes within base classes that I know I don't plan on changing into methods. That way, subclasses have full access to their inherited variables, and don't have the (artificially created) overhead of going through getters/setters to access them. An example is a class using an underlying I/O stream; there is little reason not to allow subclasses direct access to the underlying stream.
This is fine for member variables that are used in direct simple ways within the base class and all subclasses. But for a variable that has a more complicated use (e.g., accessing it causes side effects in other members within the class), a directly accessible variable is not appropriate. In this case, it can be made private and public/protected getters/setters can be provided instead. An example is an internal buffering mechanism provided by the base class, where accessing the buffers directly from a subclass would compromise the integrity of the algorithms used by the base class to manage them.
It's a design judgment decision, based on how simple the member variable is, and how it is expected to be so in future versions.
Encapsulation is great, but it can be taken too far. I've seen classes whose own private methods accessed its member variables using only getter/setter methods. This is overkill, since if a class can't trust its own private methods with its own private data, who can it trust?