Composition and Inversion of Control

Composition and Inversion of Control - oop

I just came across Inversion of Control approach (implemented using Dependency Injection) of designing loosely coupled software architecture. As per my understanding the IOC approach aims to solve problem related to tight coupling between classes by instantiating an object of a class inside another class which should ideally not happen (as per the pattern). Is my understanding correct here?
If above is true than what about composition or has-a relationship (the very basic important aspect of OO). For an example I write my stack class using a linked list class already defined so I instantiate a linked list class inside my stack class. But as per IOC this will result in tight coupling and hence a bad design. Is this true? I am bit confused here between composition or has-a relationship and IOC.

As per my understanding the IOC approach aims to solve problem related
to tight coupling between classes by instantiating an object of a
class inside another class which should ideally not happen (as per the
pattern). Is my understanding correct here?
Close, but you are slightly off. The problem of tight coupling is addressed when you define contracts between classes (interfaces in Java). Since you need implementations of your contracts(interfaces), at some point those implementations must be provided. IoC is one way of providing an implementation, but not the only way. So tight coupling is really orthogonal to Inversion of Control (meaning it's not directly related).
More specifically, you can have loose coupling but no IoC. The IoC part is that the implementations are coming from outside of the components. Consider the case where you define a class that uses an interface implementation. When you test that class, you might provide a mock. When you pass the mock to the class under test, you are not using IoC. However when you start your app, and the IoC container decides what to pass to your class, that's the IoC.
For an example I write my stack class using a linked list class
already defined so I instantiate a linked list class inside my stack
class. But as per IOC this will result in tight coupling and hence a
bad design. Is this true? I am bit confused here between composition
or has-a relationship and IOC.
Yes and No. In the general sense, you don't need to completely abstract every bit of functionality in your app. You can, and purists probably would, but it can be tedious and over-done.
In this case, you could treat your stack as a black box, and not manage it with IoC. Remember, the Stack itself is loosely couple because the Stack's behavior can be abstracted away. Also, consider the following two definitions
class StackImpl implements Stack {
private List backingList
vs
class StackImpl implements Stack {
private LinkedList backingList
The first is vastly superior to the second, precisely because it's easier to change List implementations; i.e. you have already provided a loose coupling.
That's as far as I would take it. Besides, if you are using composition, you can certainly configure most IoC containers (if not all) to pass things to the constructor or invoke setters, so you can still have a has-A relationship.

Good implementations of IoC can fulfill the "has a" pattern, but just abstract the implementation of the child.
For example, every business layer class may, by your design, "have a" exception handler; with IoC you can define it so that the exception handler that actually gets instantiated at runtime be different in different environments.
The most value in IoC is if you are doing lots of automated testing; in these scenarios you can instantiate mock data access components in your test environment, but have real data access components instantiated in production, which keeps your tests clean. The downside of IoC is that it's harder to debug, since everything is more abstract.

I have my doubts as to my understanding of Inversion of Control too. (It seems like an application of good OO design principles given a fancy name) So, let me assume you are a beginner, analyse your example and clarify my thoughts on the path.
We should start by defining an interface IStack.
interface IStack<T>
{
bool IsEmpty();
T Pop();
void Push(T item);
}
In a way we are already finished; the rest of the code probably will not care whether we implemented it with linked lists, or arrays, or whatever. StackWithLinkedList : IStack and StackWithArray : IStack will behave the same.
class StackWithLinkedList<T> : IStack<T>
{
private LinkedList<T> list;
public StackWithLinkedList<T>()
{
list = new LinkedList<T>();
}
}
So StackWithLinkedList totally owns the list; it does not need any help from outside to construct it, it does not need any flexibility (that line will never change) and the clients of StackWithLinkedList couldn't care less (they have no access to the list). In short, this is not a good example to discuss Inversion of Control: we don't need any.
Let's discuss a similar example, PriorityQueue<T> :
interface IPriorityQueue<T>
{
bool IsEmpty();
T Dequeue();
void Enqueue(T item);
}
Now we have a problem: we need to compare items of type T to provide an implementation of a IPriorityQueue. Clients still do not care whether we use an array, or a heap or whatever inside, but they do care about how we compare items. We could require T to implement IComparable<T> but that would be an unnecessary restriction. What we need is some piece of functionality that will compare T items by our request:
class PriorityQueue<T> : IPriorityQueue<T>
{
private Func<T,T,int> CompareTo;
private LinkedList<T> list;
//bla bla.
}
Such that:
if CompareTo(left,right) < 0 then left < right (in some sense)
if CompareTo(left,right) > 0 then left > right (in some sense)
if CompareTo(left,right) = 0 then left = right (in some sense)
(We would also require CompareTo to be consistent, etc. but that's another topic)
The problem is how to initialize CompareTo.
One option might be, -let's suppose there is a generic comparison creator somewhere- use the comparison creator. (I agree, the example is becoming a little silly)
public PriorityQueue()
{
this.CompareTo = ComparisonCreator<T>.CreateComparison();
this.list = new LinkedList<T>();
}
Or, perhaps even something like: ServiceLocator.Instance.ComparisonCreator<T>.CreateComparison();
This is not an ideal solution for the following reasons:
PriorityQueue is now (very unnecessarily) dependant on ComparisonCreator. If it is on a different assembly, it has to reference it. If someone changes ComparisonCreator he has to make sure PriorityQueue is not affected.
The clients will have a difficult time to use the PriorityQueue. They will first need to make sure that the ComparisonCreator is constructed and initialized.
The clients will have a difficult time to change the default behaviour. Suppose somewhere a client needs a different CompareTo function. There is no easy solution. For example, if it changes the ComparisonCreator<T>'s behaviour, it may affect other clients. What if there are other threads. Even in a single thread environment the client will probably need to undo the change on construction. It's too much effort just to make it work.
For the same reasons, it is difficult to unit test the PriorityQueue. One needs to set up the whole environment.
Of course, - and of course you knew this all along - there is a much easier way in this specific problem. Just provide the CompareTo function in the constructor:
public PriorityQueue(Func<T,T,int> CompareTo)
{
this.CompareTo = CompareTo;
this.list = new LinkedList<T>();
}
Let's check:
PriorityQueue is independent of ComparisonCreator.
For the clients, probably it is much easier to use PriorityQueue. They may need to provide a CompareTo function, but at the worst case they can always ask the ServiceLocator, so al least it is never more difficult.
Changing the default behaviour is very easy. Just give a different CompareTo function. What one client does, does not affect other clients.
It is very easy to unit test PriorityQueue. There is no complex environment to set up. We can easily test it with different CompareTo functions, etc.
What we did is called "constructor injection" because we injected a dependency in the constructor. By giving the needed dependency at the construction, we were able to change the PriorityQueue into a "self sufficient" class. We still create a LinkedList<T>, a concrete class in the construction for the same reasons in Stack example: it is not a real dependency.

The tight coupling in your stack example comes from the stack intantiating a specific list type. The IOC allows the creator of the stack type to provide which exact list implementation to use (e.g. for performance or testing purposes), realizing that the stack does not (at least should not) care what the exact type of the list is as long as it has a specific interface (the methods that stack wants to use) and the concetere implementation provides the required semantics (e.g. iterating through the list will give access to all elements added to the list in the order they were added).

As per my understanding the IOC approach aims to solve problem related
to tight coupling between classes by instantiating an object of a
class inside another class which should ideally not happen (as per the
pattern). Is my understanding correct here?
IoC is actually quite a broad concept, so let's restrict the field to the Dependency Injection approach that you are referring to. Yes, Dependency Injection does what you said.
I think the reason why hvgotcodes thinks that you are slightly off is that the concept of tight coupling can be thought as of having multiple levels. Programming to interfaces is the way to abstract from a particular implementation, which keeps the usage of some piece of code some client code interacts with and its implementation loosely coupled.
The implementation has to be created (instantiated) somewhere though: even if you program to an interface, if the implementation is created inside the client code you are bound to that particular implementation.
So we can abstract the implementation from the interface, but we can also abstract the choice of which implementation to use.
As soon as this detail is clear, you have to ask yourself when it makes sense to abstract the choice of the implementation, which is basically one of the fundamental questions of software engineering: when should you abstract what? The answer to the question is of course context dependent.
But as per IOC this will result in tight coupling and hence a bad
design. Is this true?
If tight coupling is bad design, why are you still relying on standard Java classes? We actually need to distinguish between stable and volatile dependencies.
Citing your example, if you are using the standard implementation of a list, you probably may not want to inject this dependency into your class. What would you achieve by doing this? Do you expect the standard implementation of the list to change any time soon, or do you want to be able to inject a different implementation of a standard list?
On the other hand, suppose you have a custom list with some sort of change tracking mechanism, so that you can perform undo and redo operations on it. Now it could make sense to inject it, because you may want to be able to unit test the client class in isolation, without incurring in potential bugs of your custom list implementation.
As you see, tight coupling is not always bad, sometimes it makes sense, sometimes it is to be avoided: in the end it comes down to the type of dependency.

Related

Is this the right understanding of SOLID Object Oriented principles?

After reading about SOLID in a few places, I was having trouble mapping between explanations with different vocabularies and code. To generalize a bit, I created the diagrams below, and I was hoping that people could point out any 'bugs' in my understanding.
Of course, feel free to reuse/remix/redistribute as you'd like!

I think your diagrams look quite nice, but I'm afraid that I couldn't understand them (particularly the interface one), so I'll comment on the text.
It's not really clear to me what you mean by layer, in the Open/closed I thought you might mean interface, but the interface and dependency items suggest you don't mean that.
Open/closed : actually your text from the Liskov item is closer to describing open/closed. If the code is open for extension, we can make use of it (by extending it) to implement new requirements, but by not modifying the existing code (it's closed for modification) we know we wont break any existing code that made use of it.
"Only depend on outer layer" - if this means only depend on an interface not the implementation, then yes, that's an important principle for SOLID code even though it doesn't map directly to any of the 5 letters.
Dependency inversion uses that but goes beyond it. A piece of code can make use of another via its interface and this is has great maintainability benefits over relying on the implementation, but if the calling code still has the responsibility for creating the object (and therefore choosing the class) that implements the interface then it still has a dependency. If we create the concrete object outside the class or method and pass it in as an interface, then the called code no longer depends on the concrete class, just the interface
void SomeFunction()
{
IThing myIthing* = new ConcreteThing();
// code below can use the interface but this function depends on the Concrete class
}
void SomeFunctionDependencyInjectedVersion(IThing myIthing*)
{
// this version should be able to work with any class that implements the IThing interface,
// whether it really can might depend on some of the other SOLID principles
}
Single responsibility : this isn't about classes intersecting, this is about not giving a code more than one responsibility. If you have a function where you can't think of a better name than doSomethingAndSomethingElse this might be a sign its got more than one responsibility and could be better if it was split (the point I'm making is about the "and" in the name even when the "somethings" are better named).
You should try to define that responsibility so that the class can perform it entirely, (although it make may use of other classes that perform sub-responsibilities for it) but at each level of responsibility that a class is defined it should have one clear reason to exist. When it has more than one it can make code harder to understand, and changes to code related to one responsibility can have unwanted side-effects on other responsibilities.
Iterface segregation: Consider a class implementing a collection. The class will implement code to add to the collection or to read from it. We could put all this in one interface, but if we separate it then when we have consuming code that only needs to read and doesn't need to add to the collection then it can use the interface made of the reading methods. This can make the code clearer in that it shows quickly that the code only needs those methods, and, if we've injected the collection by interface we could also use that code with a different source of items that doesn't have the ability to add items
(consider IEnumerable vs ICollection vs IList)
Liskov substitution is all about making sure that objects that inherit from an interface/base class behave in the way that the interface/base class promised to behave. In the strictest interpretation of the original definition they'd need to behave exactly the same, but that's not all that useful. More generally its about behaving in a consistent and expected way, the derived classes may add functionality, but they should be able to do the job of the base objects (they can be substituted for them)

Should concrete implementation provide any public API not present in the interface it implements?

"Code to interfaces" is considered good practice. Such code is easy to unit test and enables loose coupling. Users only know the interfaces and the onus of wiring concrete objects is upon the top-most level (this can be done in some init code or with the help of frameworks).
My question is about following the practice of code to interfaces: does it imply that a concrete class can never declare any public method which is not present in its interface?
Otherwise, it will force users to depend upon the concrete implementation. This will make such methods difficult for unit testing; if the test fails, determining if it failed due to an issue in the caller code or due to the concrete method will require extra effort. This will also break the Dependency Inversion Principle. It will induce type-checking and down-casting, which are considered bad practice.

That is totally acceptable provided that the new methods aren't crucial to the operating of the class, and in particular to how it functions when someone thinks of it as the superclass or interface.
ArrayList provides good examples. It has methods that let you manage its internal memory, like ensureCapacity(int) or trimToSize(). Those are sometimes helpful if you know you're working with an ArrayList and need to be more precise about memory allocation, but they're not required for the basic operation of the ArrayList, and in particular, they're not required for having it operate as a general List.
In fact, interfaces themselves can add new methods in this way. Consider NavigableSet, which extends Set. It adds a whole bunch of methods that rely on the ordering of the set's elements (give me the first, the last, a subtree starting from here, etc). None of those methods are defined on Set, and even the fact that the elements are ordered isn't defined by the Set contract; but the Set methods all work just fine without the additional methods and ordering.
The advice to "code to the interface" is a good start, but it's a bit over-generalized. A refinement of that advice would be, "code to the most general interface that you need." If you don't need ArrayLists's methods (or its contract, such as its random-access performance), code to List; but if you do need them, then by all means use them.

#yshavit's third paragraph hits it right. Implement an extension of the "not enough" base interface, as exampled with public interface NavigableSet<E> extends SortedSet<E> (which, BTW, extends Set<E> extends Collection<E> extends Iterable<E>).
It's his second paragraph that troubles me. Why have "non-crucial" methods of the API that are not surfaced in some interface being implemented? In the ArrayList example, why not have the size management methods declared in an interface? Perhaps ManagedSize which would describe clear behavior for ArrayList (and other) classes to implement, along with the several other interfaces it implements (my JRE source says: public class ArrayList<E> extends AbstractList<E> implements List<E>, RandomAccess, Cloneable, java.io.Serializable).
With such an approach, there is no need to decide which methods are "non-crucial," only to be surprised by some client code that depends on things like ensureSize to help avoid relocation during a time-critical phase, or trimToSize to release excessive overalloaction when it's algorthmically known that further growth will not be needed. Not that I'm promoting such algorthms as best practice, but even non-functional "behavior management" methods deserve their place in the light.
Finally, while I agree with sentiment of "Know Where the Lines Are, and yet Color As You See Fit" it doesn't give practical guidance. Here's attempt at such:
Always start by coding to an interface, ie. all concrete public methods should be declared in an interface:
Use multiple interfaces as needed
Each interface should partition the implemented API into coherent non-overlapping aspects, e.g. List, RandomAccess, Cloneable, Serializable
Tend to start with larger scoped interfaces and break them up as the design develops (before coding ala Waterfall, or as code evolves ala Agile); interfaces are one of the easier design artefacts to refactor.
If a given interface you are implementing is "insufficient":
Extend the base interface and add the methods you need, then implement that one, OR
Create an augmenting interface (like the ManagedSize idea, above) with just the additional methods and then implement them both
Only when you find you can't do that, then relax only as much of the rule as you need to make things work (often, this will be an experimental trial-error "does it work, yet?" cycle).
Reasons for #3's "can't" will vary, but I expect them to be external to the application design, e.g. the ORM I'm using becomes confused, the IDE plug-in doesn't refactor it correctly, the DSL translator I'm forced to use fails when a class implements more than three interfaces...

Is Inheritance really needed?

I must confess I'm somewhat of an OOP skeptic. Bad pedagogical and laboral experiences with object orientation didn't help. So I converted into a fervent believer in Visual Basic (the classic one!).
Then one day I found out C++ had changed and now had the STL and templates. I really liked that! Made the language useful. Then another day MS decided to apply facial surgery to VB, and I really hated the end result for the gratuitous changes (using "end while" instead of "wend" will make me into a better developer? Why not drop "next" for "end for", too? Why force the getter alongside the setter? Etc.) plus so much Java features which I found useless (inheritance, for instance, and the concept of a hierarchical framework).
And now, several years afterwards, I find myself asking this philosophical question: Is inheritance really needed?
The gang-of-four say we should favor object composition over inheritance. And after thinking of it, I cannot find something you can do with inheritance you cannot do with object aggregation plus interfaces. So I'm wondering, why do we even have it in the first place?
Any ideas? I'd love to see an example of where inheritance would be definitely needed, or where using inheritance instead of composition+interfaces can lead to a simpler and easier to modify design. In former jobs I've found if you need to change the base class, you need to modify also almost all the derived classes for they depended on the behaviour of parent. And if you make the base class' methods virtual... then not much code sharing takes place :(
Else, when I finally create my own programming language (a long unfulfilled desire I've found most developers share), I'd see no point in adding inheritance to it...

Really really short answer: No. Inheritance is not needed because only byte code is truly needed. But obviously, byte code or assemble is not a practically way to write your program. OOP is not the only paradigm for programming. But, I digress.
I went to college for computer science in the early 2000s when inheritance (is a), compositions (has a), and interfaces (does a) were taught on an equal footing. Because of this, I use very little inheritance because it is often suited better by composition. This was stressed because many of the professors had seen bad code (along with what you have described) because of abuse of inheritance.
Regardless of creating a language with or without inheritances, can you create a programming language which prevents bad habits and bad design decisions?

I think asking for situations where inheritance is really needed is missing the point a bit. You can fake inheritance by using an interface and some composition. This doesnt mean inheritance is useless. You can do anything you did in VB6 in assembly code with some extra typing, that doesn't mean VB6 was useless.
I usually just start using an interface. Sometimes I notice I actually want to inherit behaviour. That usually means I need a base class. It's that simple.

Inheritance defines an "Is-A" relationship.
class Point( object ):
# some set of features: attributes, methods, etc.
class PointWithMass( Point ):
# An additional feature: mass.
Above, I've used inheritance to formally declare that PointWithMass is a Point.
There are several ways to handle object P1 being a PointWithMass as well as Point. Here are two.
Have a reference from PointWithMass object p1 to some Point object p1-friend. The p1-friend has the Point attributes. When p1 needs to engage in Point-like behavior, it needs to delegate the work to its friend.
Rely on language inheritance to assure that all features of Point are also applicable to my PointWithMass object, p1. When p1 needs to engage in Point-like behavior, it already is a Point object and can just do what needs to be done.
I'd rather not manage the extra objects floating around to assure that all superclass features are part of a subclass object. I'd rather have inheritance to be sure that each subclass is an instance of it's own class, plus is an instance of all superclasses, too.
Edit.
For statically-typed languages, there's a bonus. When I rely on the language to handle this, a PointWithMass can be used anywhere a Point was expected.
For really obscure abuse of inheritance, read about C++'s strange "composition through private inheritance" quagmire. See Any sensible examples of creating inheritance without creating subtyping relations? for some further discussion on this. It conflates inheritance and composition; it doesn't seem to add clarity or precision to the resulting code; it only applies to C++.

The GoF (and many others) recommend that you only favor composition over inheritance. If you have a class with a very large API, and you only want to add a very small number of methods to it, leaving the base implementation alone, I would find it inappropriate to use composition. You'd have to re-implement all of the public methods of the encapsulated class to just return their value. This is a waste of time (programmer and CPU) when you can just inherit all of this behavior, and spend your time concentrating on new methods.
So, to answer your question, no you don't absolutely need inheritance. There are, however, many situations where it's the right design choice.

The problem with inheritance is that it conflates the issue of sub-typing (asserting an is-a relationship) and code reuse (e.g., private inheritance is for reuse only).
So, no it's an overloaded word that we don't need. I'd prefer sub-typing (using the 'implements' keyword) and import (kinda like Ruby does it in class definitions)

Inheritance lets me push off a whole bunch of bookkeeping onto the compiler because it gives me polymorphic behavior for object hierarchies that I would otherwise have to create and maintain myself. Regardless of how good a silver bullet OOP is, there will always be instances where you want to employ a certain type of behavior because it just makes sense to do. And ultimately, that's the point of OOP: it makes a certain class of problems much easier to solve.

The downsides of composition is that it may disguise the relatedness of elements and it may be harder for others to understand. With,say, a 2D Point class and the desire to extend it to higher dimensions, you would presumably have to add (at least) Z getter/setter, modify getDistance(), and maybe add a getVolume() method. So you have the Objects 101 elements: related state and behavior.
A developer with a compositional mindset would presumably have defined a getDistance(x, y) -> double method and would now define a getDistance(x, y, z) -> double method. Or, thinking generally, they might define a getDistance(lambdaGeneratingACoordinateForEveryAxis()) -> double method. Then they would probably write createTwoDimensionalPoint() and createThreeDimensionalPoint() factory methods (or perhaps createNDimensionalPoint(n) ) that would stitch together the various state and behavior.
A developer with an OO mindset would use inheritance. Same amount of complexity in the implementation of domain characteristics, less complexity in terms of initializing the object (constructor takes care of it vs. a Factory method), but not as flexible in terms of what can be initialized.
Now think about it from a comprehensibility / readability standpoint. To understand the composition, one has a large number of functions that are composed programmatically inside another function. So there's little in terms of static code 'structure' (files and keywords and so forth) that makes the relatedness of Z and distance() jump out. In the OO world, you have a great big flashing red light telling you the hierarchy. Additionally, you have an essentially universal vocabulary to discuss structure, widely known graphical notations, a natural hierarchy (at least for single inheritance), etc.
Now, on the other hand, a well-named and constructed Factory method will often make explicit more of the sometimes-obscure relationships between state and behavior, since a compositional mindset facilitates functional code (that is, code that passes state via parameters, not via this ).
In a professional environment with experienced developers, the flexibility of composition generally trumps its more abstract nature. However, one should never discount the importance of comprehensibility, especially in teams that have varying degrees of experience and/or high levels of turnover.

Inheritance is an implementation decision. Interfaces almost always represent a better design, and should usually be used in an external API.
Why write a lot of boilerplate code forwarding method calls to a composed member object when the compiler will do it for you with inheritance?
This answer to another question summarises my thinking pretty well.

Does anyone else remember all of the OO-purists going ballistic over the COM implementation of "containment" instead of "inheritance?" It achieved essentially the same thing, but with a different kind of implementation. This reminds me of your question.
I strictly try to avoid religious wars in software development. ("vi" OR "emacs" ... when everybody knows its "vi"!) I think they are a sign of small minds. Comp Sci Professors can afford to sit around and debate these things. I'm working in the real world and could care less. All of this stuff are simply attempts at giving useful solutions to real problems. If they work, people will use them. The fact that OO languages and tools have been commercially available on a wide scale for going on 20 years is a pretty good bet that they are useful to a lot of people.

There are a lot of features in a programming language that are not really needed. But they are there for a variety of reasons that all basically boil down to reusability and maintainability.
All a business cares about is producing (quality of course) cheaply and quickly.
As a developer you help do this is by becoming more efficient and productive. So you need to make sure the code you write is easily reusable and maintainable.
And, among other things, this is what inheritance gives you - the ability to reuse without reinventing the wheel, as well as the ability to easily maintain your base object without having to perform maintenance on all similar objects.

There's lots of useful usages of inheritance, and probably just as many which are less useful. One of the useful ones is the stream class.
You have a method that should be able stream data. By using the stream base class as input to the method you ensure that your method can be used to write to many kinds of streams without change. To the file system, over the network, with compression, etc.

No.
for me, OOP is mostly about encapsulation of state and behavior and polymorphism.
and that is. but if you want static type checking, you'll need some way to group different types, so the compiler can check while still allowing you to use new types in place of another, related type. creating a hierarchy of types lets you use the same concept (classes) for types and for groups of types, so it's the most widely used form.
but there are other ways, i think the most general would be duck typing, and closely related, prototype-based OOP (which isn't inheritance in fact, but it's usually called prototype-based inheritance).

Depends on your definition of "needed". No, there is nothing that is impossible to do without inheritance, although the alternative may require more verbose code, or a major rewrite of your application.
But there are definitely cases where inheritance is useful. As you say, composition plus interfaces together cover almost all cases, but what if I want to supply a default behavior? An interface can't do that. A base class can. Sometimes, what you want to do is really just override individual methods. Not reimplement the class from scratch (as with an interface), but just change one aspect of it. or you may not want all members of the class to be overridable. Perhaps you have only one or two member methods you want the user to override, and the rest, which calls these (and performs validation and other important tasks before and after the user-overridden methods) are specified once and for all in the base class, and can not be overridden.
Inheritance is often used as a crutch by people who are too obsessed with Java's narrow definition of (and obsession with) OOP though, and in most cases I agree, it's the wrong solution, as if the deeper your class hierarchy, the better your software.

Inheritance is a good thing when the subclass really is the same kind of object as the superclass. E.g. if you're implementing the Active Record pattern, you're attempting to map a class to a table in the database, and instances of the class to a row in the database. Consequently, it is highly likely that your Active Record classes will share a common interface and implementation of methods like: what is the primary key, whether the current instance is persisted, saving the current instance, validating the current instance, executing callbacks upon validation and/or saving, deleting the current instance, running a SQL query, returning the name of the table that the class maps to, etc.
It also seems from how you phrase your question that you're assuming that inheritance is single but not multiple. If we need multiple inheritance, then we have to use interfaces plus composition to pull off the job. To put a fine point about it, Java assumes that implementation inheritance is singular and interface inheritance can be multiple. One need not go this route. E.g. C++ and Ruby permit multiple inheritance for your implementation and your interface. That said, one should use multiple inheritance with caution (i.e. keep your abstract classes virtual and/or stateless).
That said, as you note, there are too many real-life class hierarchies where the subclasses inherit from the superclass out of convenience rather than bearing a true is-a relationship. So it's unsurprising that a change in the superclass will have side-effects on the subclasses.

Not needed, but usefull.
Each language has got its own methods to write less code. OOP sometimes gets convoluted, but I think that is the responsability of the developers, the OOP platform is usefull and sharp when it is well used.

I agree with everyone else about the necessary/useful distinction.
The reason I like OOP is because it lets me write code that's cleaner and more logically organized. One of the biggest benefits comes from the ability to "factor-up" logic that's common to a number of classes. I could give you concrete examples where OOP has seriously reduced the complexity of my code, but that would be boring for you.
Suffice it to say, I heart OOP.

Absolutely needed? no,
But think of lamps. You can create a new lamp from scratch each time you make one, or you can take properties from the original lamp and make all sorts of new styles of lamp that have the same properties as the original, each with their own style.
Or you can make a new lamp from scratch or tell people to look at it a certain way to see the light, or , or, or
Not required, but nice :)

Thanks to all for your answers. I maintain my position that, strictly speaking, inheritance isn't needed, though I believe I found a new appreciation for this feature.
Something else: In my job experience, I have found inheritance leads to simpler, clearer designs when it's brought in late in the project, after it's noticed a lot of the classes have much commonality and you create a base class. In projects where a grand-schema was created from the very beginning, with a lot of classes in an inheritance hierarchy, refactoring is usually painful and dificult.
Seeing some answers mentioning something similar makes me wonder if this might not be exactly how inheritance's supposed to be used: ex post facto. Reminds me of Stepanov's quote: "you don't start with axioms, you end up with axioms after you have a bunch of related proofs". He's a mathematician, so he ought to know something.

The biggest problem with interfaces is that they cannot be changed. Make an interface public, then change it (add a new method to it) and break million applications all around the world, because they have implemented your interface, but not the new method. The app may not even start, a VM may refuse to load it.
Use a base class (not abstract) other programmers can inherit from (and override methods as needed); then add a method to it. Every app using your class will still work, this method just won't be overridden by anyone, but since you provide a base implementation, this one will be used and it may work just fine for all subclasses of your class... it may also cause strange behavior because sometimes overriding it would have been necessary, okay, might be the case, but at least all those million apps in the world will still start up!
I rather have my Java application still running after updating the JDK from 1.6 to 1.7 with some minor bugs (that can be fixed over time) than not having it running it at all (forcing an immediate fix or it will be useless to people).

//I found this QA very useful. Many have answered this right. But i wanted to add...
1: Ability to define abstract interface - E.g., for plugin developers. Of course, you can use function pointers, but this is better and simpler.
2: Inheritance helps model types very close to their actual relationships. Sometimes a lot of errors get caught at compile time, because you have the right type hierarchy. For instance, shape <-- triangle (lets say there is a lot of code to be reused). You might want to compose triangle with a shape object, but shape is an incomplete type. Inserting dummy implementations like double getArea() {return -1;} will do, but you are opening up room for error. That return -1 can get executed some day!
3: void func(B* b); ... func(new D()); Implicit type conversion gives a great notational convenience since Derived is Base. I remember having read Straustrup saying that he wanted to make classes first class citizens just like fundamental data types (hence overloading operators etc). Implicit conversion from Derived to Base, behaves just like an implicit conversion from a data type to broader compatible one (short to int).

Inheritance and Composition have their own pros and cons.
Refer to this related SE question on pros of inheritance and cons of composition.
Prefer composition over inheritance?
Have a look at the example in this documentation link:
The example shows different use cases of overriding by using inheritance as a mean to achieve polymorphism.

In the following, inheritance is used to present a particular property for all of several specific incarnations of the same type thing. In this case, the GeneralPresenation has a properties that are relevant to all "presentation" (the data passed to an MVC view). The Master Page is the only thing using it and expects a GeneralPresentation, though the specific views expect more info, tailored to their needs.
public abstract class GeneralPresentation
{
public GeneralPresentation()
{
MenuPages = new List<Page>();
}
public IEnumerable<Page> MenuPages { get; set; }
public string Title { get; set; }
}
public class IndexPresentation : GeneralPresentation
{
public IndexPresentation() { IndexPage = new Page(); }
public Page IndexPage { get; set; }
}
public class InsertPresentation : GeneralPresentation
{
public InsertPresentation() {
InsertPage = new Page();
ValidationInfo = new PageValidationInfo();
}
public PageValidationInfo ValidationInfo { get; set; }
public Page InsertPage { get; set; }
}

Must Dependency Injection come at the expense of Encapsulation?

If I understand correctly, the typical mechanism for Dependency Injection is to inject either through a class' constructor or through a public property (member) of the class.
This exposes the dependency being injected and violates the OOP principle of encapsulation.
Am I correct in identifying this tradeoff? How do you deal with this issue?
Please also see my answer to my own question below.

There is another way of looking at this issue that you might find interesting.
When we use IoC/dependency injection, we're not using OOP concepts. Admittedly we're using an OO language as the 'host', but the ideas behind IoC come from component-oriented software engineering, not OO.
Component software is all about managing dependencies - an example in common use is .NET's Assembly mechanism. Each assembly publishes the list of assemblies that it references, and this makes it much easier to pull together (and validate) the pieces needed for a running application.
By applying similar techniques in our OO programs via IoC, we aim to make programs easier to configure and maintain. Publishing dependencies (as constructor parameters or whatever) is a key part of this. Encapsulation doesn't really apply, as in the component/service oriented world, there is no 'implementation type' for details to leak from.
Unfortunately our languages don't currently segregate the fine-grained, object-oriented concepts from the coarser-grained component-oriented ones, so this is a distinction that you have to hold in your mind only :)

It's a good question - but at some point, encapsulation in its purest form needs to be violated if the object is ever to have its dependency fulfilled. Some provider of the dependency must know both that the object in question requires a Foo, and the provider has to have a way of providing the Foo to the object.
Classically this latter case is handled as you say, through constructor arguments or setter methods. However, this is not necessarily true - I know that the latest versions of the Spring DI framework in Java, for example, let you annotate private fields (e.g. with #Autowired) and the dependency will be set via reflection without you needing to expose the dependency through any of the classes public methods/constructors. This might be the kind of solution you were looking for.
That said, I don't think that constructor injection is much of a problem, either. I've always felt that objects should be fully valid after construction, such that anything they need in order to perform their role (i.e. be in a valid state) should be supplied through the constructor anyway. If you have an object that requires a collaborator to work, it seems fine to me that the constructor publically advertises this requirement and ensures it is fulfilled when a new instance of the class is created.
Ideally when dealing with objects, you interact with them through an interface anyway, and the more you do this (and have dependencies wired through DI), the less you actually have to deal with constructors yourself. In the ideal situation, your code doesn't deal with or even ever create concrete instances of classes; so it just gets given an IFoo through DI, without worrying about what the constructor of FooImpl indicates it needs to do its job, and in fact without even being aware of FooImpl's existance. From this point of view, the encapsulation is perfect.
This is an opinion of course, but to my mind DI doesn't necessarily violate encapsulation and in fact can help it by centralising all of the necessary knowledge of internals into one place. Not only is this a good thing in itself, but even better this place is outside your own codebase, so none of the code you write needs to know about classes' dependencies.

This exposes the dependency being injected and violates the OOP principle of encapsulation.
Well, frankly speaking, everything violates encapsulation. :) It's a kind of a tender principle that must be treated well.
So, what violates encapsulation?
Inheritance does.
"Because inheritance exposes a subclass to details of its parent's implementation, it's often said that 'inheritance breaks encapsulation'". (Gang of Four 1995:19)
Aspect-oriented programming does. For example, you register onMethodCall() callback and that gives you a great opportunity to inject code to the normal method evaluation, adding strange side-effects etc.
Friend declaration in C++ does.
Class extention in Ruby does. Just redefine a string method somewhere after a string class was fully defined.
Well, a lot of stuff does.
Encapsulation is a good and important principle. But not the only one.
switch (principle)
{
case encapsulation:
if (there_is_a_reason)
break!
}

Yes, DI violates encapsulation (also known as "information hiding").
But the real problem comes when developers use it as an excuse to violate the KISS (Keep It Short and Simple) and YAGNI (You Ain't Gonna Need It) principles.
Personally, I prefer simple and effective solutions. I mostly use the "new" operator to instantiate stateful dependencies whenever and wherever they are needed. It is simple, well encapsulated, easy to understand, and easy to test. So, why not?

A good depenancy injection container/system will allow for constructor injection. The dependant objects will be encapsulated, and need not be exposed publicly at all. Further, by using a DP system, none of your code even "knows" the details of how the object is constructed, possibly even including the object being constructed. There is more encapsulation in this case since nearly all of your code not only is shielded from knowledge of the encapsulated objects, but does not even participate in the objects construction.
Now, I am assuming you are comparing against the case where the created object creates its own encapsulated objects, most likely in its constructor. My understanding of DP is that we want to take this responsibility away from the object and give it to someone else. To that end, the "someone else", which is the DP container in this case, does have intimate knowledge which "violates" encapsulation; the benefit is that it pulls that knowledge out of the object, iteself. Someone has to have it. The rest of your application does not.
I would think of it this way: The dependancy injection container/system violates encapsulation, but your code does not. In fact, your code is more "encapsulated" then ever.

This is similar to the upvoted answer but I want to think out loud - perhaps others see things this way as well.
Classical OO uses constructors to define the public "initialization" contract for consumers of the class (hiding ALL implementation details; aka encapsulation). This contract can ensure that after instantiation you have a ready-to-use object (i.e. no additional initialization steps to be remembered (er, forgotten) by the user).
(constructor) DI undeniably breaks encapsulation by bleeding implemenation detail through this public constructor interface. As long as we still consider the public constructor responsible for defining the initialization contract for users, we have created a horrible violation of encapsulation.
Theoretical Example:
Class Foo has 4 methods and needs an integer for initialization, so its constructor looks like Foo(int size) and it's immediately clear to users of class Foo that they must provide a size at instantiation in order for Foo to work.
Say this particular implementation of Foo may also need a IWidget to do its job. Constructor injection of this dependency would have us create a constructor like Foo(int size, IWidget widget)
What irks me about this is now we have a constructor that's blending initialization data with dependencies - one input is of interest to the user of the class (size), the other is an internal dependency that only serves to confuse the user and is an implementation detail (widget).
The size parameter is NOT a dependency - it's simple a per-instance initialization value. IoC is dandy for external dependencies (like widget) but not for internal state initialization.
Even worse, what if the Widget is only necessary for 2 of the 4 methods on this class; I may be incurring instantiation overhead for Widget even though it may not be used!
How to compromise/reconcile this?
One approach is to switch exclusively to interfaces to define the operation contract; and abolish the use of constructors by users.
To be consistent, all objects would have to be accessed through interfaces only, and instantiated only through some form of resolver (like an IOC/DI container). Only the container gets to instantiate things.
That takes care of the Widget dependency, but how do we initialize "size" without resorting to a separate initialization method on the Foo interface? Using this solution, we lost the ability to ensure that an instance of Foo is fully initialized by the time you get the instance. Bummer, because I really like the idea and simplicity of constructor injection.
How do I achieve guaranteed initialization in this DI world, when initialization is MORE than ONLY external dependencies?

As Jeff Sternal pointed out in a comment to the question, the answer is entirely dependent on how you define encapsulation.
There seem to be two main camps of what encapsulation means:
Everything related to the object is a method on an object. So, a File object may have methods to Save, Print, Display, ModifyText, etc.
An object is its own little world, and does not depend on outside behavior.
These two definitions are in direct contradiction to each other. If a File object can print itself, it will depend heavily on the printer's behavior. On the other hand, if it merely knows about something that can print for it (an IFilePrinter or some such interface), then the File object doesn't have to know anything about printing, and so working with it will bring less dependencies into the object.
So, dependency injection will break encapsulation if you use the first definition. But, frankly I don't know if I like the first definition - it clearly doesn't scale (if it did, MS Word would be one big class).
On the other hand, dependency injection is nearly mandatory if you're using the second definition of encapsulation.

It doesn't violate encapsulation. You're providing a collaborator, but the class gets to decide how it is used. As long as you follow Tell don't ask things are fine. I find constructer injection preferable, but setters can be fine as well as long as they're smart. That is they contain logic to maintain the invariants the class represents.

Pure encapsulation is an ideal that can never be achieved. If all dependencies were hidden then you wouldn't have the need for DI at all. Think about it this way, if you truly have private values that can be internalized within the object, say for instance the integer value of the speed of a car object, then you have no external dependency and no need to invert or inject that dependency. These sorts of internal state values that are operated on purely by private functions are what you want to encapsulate always.
But if you're building a car that wants a certain kind of engine object then you have an external dependency. You can either instantiate that engine -- for instance new GMOverHeadCamEngine() -- internally within the car object's constructor, preserving encapsulation but creating a much more insidious coupling to a concrete class GMOverHeadCamEngine, or you can inject it, allowing your Car object to operate agnostically (and much more robustly) on for example an interface IEngine without the concrete dependency. Whether you use an IOC container or simple DI to achieve this is not the point -- the point is that you've got a Car that can use many kinds of engines without being coupled to any of them, thus making your codebase more flexible and less prone to side effects.
DI is not a violation of encapsulation, it is a way of minimizing the coupling when encapsulation is necessarily broken as a matter of course within virtually every OOP project. Injecting a dependency into an interface externally minimizes coupling side effects and allows your classes to remain agnostic about implementation.

It depends on whether the dependency is really an implementation detail or something that the client would want/need to know about in some way or another. One thing that is relevant is what level of abstraction the class is targeting. Here are some examples:
If you have a method that uses caching under the hood to speed up calls, then the cache object should be a Singleton or something and should not be injected. The fact that the cache is being used at all is an implementation detail that the clients of your class should not have to care about.
If your class needs to output streams of data, it probably makes sense to inject the output stream so that the class can easily output the results to an array, a file, or wherever else someone else might want to send the data.
For a gray area, let's say you have a class that does some monte carlo simulation. It needs a source of randomness. On the one hand, the fact that it needs this is an implementation detail in that the client really doesn't care exactly where the randomness comes from. On the other hand, since real-world random number generators make tradeoffs between degree of randomness, speed, etc. that the client may want to control, and the client may want to control seeding to get repeatable behavior, injection may make sense. In this case, I'd suggest offering a way of creating the class without specifying a random number generator, and use a thread-local Singleton as the default. If/when the need for finer control arises, provide another constructor that allows for a source of randomness to be injected.

Having struggled with the issue a little further, I am now in the opinion that Dependency Injection does (at this time) violate encapsulation to some degree. Don't get me wrong though - I think that using dependency injection is well worth the tradeoff in most cases.
The case for why DI violates encapsulation becomes clear when the component you are working on is to be delivered to an "external" party (think of writing a library for a customer).
When my component requires sub-components to be injected via the constructor (or public properties) there's no guarantee for
"preventing users from setting the internal data of the component into an invalid or inconsistent state".
At the same time it cannot be said that
"users of the component (other pieces of software) only need to know what the component does, and cannot make themselves dependent on the details of how it does it".
Both quotes are from wikipedia.
To give a specific example: I need to deliver a client-side DLL that simplifies and hides communication to a WCF service (essentially a remote facade). Because it depends on 3 different WCF proxy classes, if I take the DI approach I am forced to expose them via the constructor. With that I expose the internals of my communication layer which I am trying to hide.
Generally I am all for DI. In this particular (extreme) example, it strikes me as dangerous.

I struggled with this notion as well. At first, the 'requirement' to use the DI container (like Spring) to instantiate an object felt like jumping thru hoops. But in reality, it's really not a hoop - it's just another 'published' way to create objects I need. Sure, encapsulation is 'broken' becuase someone 'outside the class' knows what it needs, but it really isn't the rest of the system that knows that - it's the DI container. Nothing magical happens differently because DI 'knows' one object needs another.
In fact it gets even better - by focusing on Factories and Repositories I don't even have to know DI is involved at all! That to me puts the lid back on encapsulation. Whew!

I belive in simplicity. Applying IOC/Dependecy Injection in Domain classes does not make any improvement except making the code much more harder to main by having an external xml files describing the relation. Many technologies like EJB 1.0/2.0 & struts 1.1 are reversing back by reducing the stuff the put in XML and try put them in code as annoation etc. So applying IOC for all the classes you develope will make the code non-sense.
IOC has it benefits when the dependent object is not ready for creation at compile time. This can happend in most of the infrasture abstract level architecture components, trying establish a common base framework which may need to work for different scenarios. In those places usage IOC makes more sense. Still this does not make the code more simple / maintainable.
As all the other technologies, this too has PROs & CONs. My worry is, we implement latest technologies in all the places irrespective of their best context usage.

Encapsulation is only broken if a class has both the responsibility to create the object (which requires knowledge of implementation details) and then uses the class (which does not require knowledge of these details). I'll explain why, but first a quick car anaology:
When I was driving my old 1971 Kombi,
I could press the accelerator and it
went (slightly) quicker. I did not
need to know why, but the guys who
built the Kombi at the factory knew
exactly why.
But back to the coding. Encapsulation is "hiding an implementation detail from something using that implementation." Encapsulation is a good thing because the implementation details can change without the user of the class knowing.
When using dependency injection, constructor injection is used to construct service type objects (as opposed to entity/value objects which model state). Any member variables in service type object represent implementation details that should not leak out. e.g. socket port number, database credentials, another class to call to perform encryption, a cache, etc.
The constructor is relevant when the class is being initially created. This happens during the construction-phase while your DI container (or factory) wires together all the service objects. The DI container only knows about implementation details. It knows all about implementation details like the guys at the Kombi factory know about spark plugs.
At run-time, the service object that was created is called apon to do some real work. At this time, the caller of the object knows nothing of the implementation details.
That's me driving my Kombi to the beach.
Now, back to encapsulation. If implementation details change, then the class using that implementation at run-time does not need to change. Encapsulation is not broken.
I can drive my new car to the beach too. Encapsulation is not broken.
If implementation details change, the DI container (or factory) does need to change. You were never trying to hide implementation details from the factory in the first place.

DI violates Encapsulation for NON-Shared objects - period. Shared objects have a lifespan outside of the object being created, and thus must be AGGREGATED into the object being created. Objects that are private to the object being created should be COMPOSED into the created object - when the created object is destroyed, it takes the composed object with it.
Let's take the human body as an example. What's composed and what's aggregated. If we were to use DI, the human body constructor would have 100's of objects. Many of the organs, for example, are (potentially) replaceable. But, they are still composed into the body. Blood cells are created in the body (and destroyed) everyday, without the need for external influences (other than protein). Thus, blood cells are created internally by the body - new BloodCell().
Advocators of DI argue that an object should NEVER use the new operator.
That "purist" approach not only violates encapsulation but also the Liskov Substitution Principle for whoever is creating the object.

PS. By providing Dependency Injection you do not necessarily break Encapsulation. Example:
obj.inject_dependency( factory.get_instance_of_unknown_class(x) );
Client code does not know implementation details still.

Maybe this is a naive way of thinking about it, but what is the difference between a constructor that takes in an integer parameter and a constructor that takes in a service as a parameter? Does this mean that defining an integer outside the new object and feeding it into the object breaks encapsulation? If the service is only used within the new object, I don't see how that would break encapsulation.
Also, by using some sort of autowiring feature (Autofac for C#, for example), it makes the code extremely clean. By building extension methods for the Autofac builder, I was able to cut out a LOT of DI configuration code that I would have had to maintain over time as the list of dependencies grew.

I think it's self evident that at the very least DI significantly weakens encapsulation. In additional to that here are some other downsides of DI to consider.
It makes code harder to reuse. A module which a client can use without having to explicitly provide dependencies to, is obviously easier to use than one where the client has to somehow discover what that component's dependencies are and then somehow make them available. For example a component originally created to be used in an ASP application may expect to have its dependencies provided by a DI container that provides object instances with lifetimes related to client http requests. This may not be simple to reproduce in another client that does not come with the same built in DI container as the original ASP application.
It can make code more fragile. Dependencies provided by interface specification can be implemented in unexpected ways which gives rise to a whole class of runtime bugs that are not possible with a statically resolved concrete dependency.
It can make code less flexible in the sense that you may end up with fewer choices about how you want it to work. Not every class needs to have all its dependencies in existence for the entire lifetime of the owning instance, yet with many DI implementations you have no other option.
With that in mind I think the most important question then becomes, "does a particular dependency need to be externally specified at all?". In practise I have rarely found it necessary to make a dependency externally supplied just to support testing.
Where a dependency genuinely needs to be externally supplied, that normally suggests that the relation between the objects is a collaboration rather than an internal dependency, in which case the appropriate goal is then encapsulation of each class, rather than encapsulation of one class inside the other.
In my experience the main problem regarding the use of DI is that whether you start with an application framework with built in DI, or you add DI support to your codebase, for some reason people assume that since you have DI support that must be the correct way to instantiate everything. They just never even bother to ask the question "does this dependency need to be externally specified?". And worse, they also start trying to force everyone else to use the DI support for everything too.
The result of this is that inexorably your codebase starts to devolve into a state where creating any instance of anything in your codebase requires reams of obtuse DI container configuration, and debugging anything is twice as hard because you have the extra workload of trying to identify how and where anything was instantiated.
So my answer to the question is this. Use DI where you can identify an actual problem that it solves for you, which you can't solve more simply any other way.

I agree that taken to an extreme, DI can violate encapsulation. Usually DI exposes dependencies which were never truly encapsulated. Here's a simplified example borrowed from Miško Hevery's Singletons are Pathological Liars:
You start with a CreditCard test and write a simple unit test.
#Test
public void creditCard_Charge()
{
CreditCard c = new CreditCard("1234 5678 9012 3456", 5, 2008);
c.charge(100);
}
Next month you get a bill for $100. Why did you get charged? The unit test affected a production database. Internally, CreditCard calls Database.getInstance(). Refactoring CreditCard so that it takes a DatabaseInterface in its constructor exposes the fact that there's dependency. But I would argue that the dependency was never encapsulated to begin with since the CreditCard class causes externally visible side effects. If you want to test CreditCard without refactoring, you can certainly observe the dependency.
#Before
public void setUp()
{
Database.setInstance(new MockDatabase());
}
#After
public void tearDown()
{
Database.resetInstance();
}
I don't think it's worth worrying whether exposing the Database as a dependency reduces encapsulation, because it's a good design. Not all DI decisions will be so straight forward. However, none of the other answers show a counter example.

I think it's a matter of scope. When you define encapsulation (not letting know how) you must define what is the encapsuled functionality.
Class as is: what you are encapsulating is the only responsability of the class. What it knows how to do. By example, sorting. If you inject some comparator for ordering, let's say, clients, that's not part of the encapsuled thing: quicksort.
Configured functionality: if you want to provide a ready-to-use functionality then you are not providing QuickSort class, but an instance of QuickSort class configured with a Comparator. In that case the code responsible for creating and configuring that must be hidden from the user code. And that's the encapsulation.
When you are programming classes, it is, implementing single responsibilities into classes, you are using option 1.
When you are programming applications, it is, making something that undertakes some useful concrete work then you are repeteadily using option 2.
This is the implementation of the configured instance:
<bean id="clientSorter" class="QuickSort">
<property name="comparator">
<bean class="ClientComparator"/>
</property>
</bean>
This is how some other client code use it:
<bean id="clientService" class"...">
<property name="sorter" ref="clientSorter"/>
</bean>
It is encapsulated because if you change implementation (you change clientSorter bean definition) it doesn't break client use. Maybe, as you use xml files with all written together you are seeing all the details. But believe me, the client code (ClientService)
don't know nothing about its sorter.

It's probably worth mentioning that Encapsulation is somewhat perspective dependent.
public class A {
private B b;
public A() {
this.b = new B();
}
}
public class A {
private B b;
public A(B b) {
this.b = b;
}
}
From the perspective of someone working on the A class, in the second example A knows a lot less about the nature of this.b
Whereas without DI
new A()
vs
new A(new B())
The person looking at this code knows more about the nature of A in the second example.
With DI, at least all that leaked knowledge is in one place.

Why should you prevent a class from being subclassed?

What can be reasons to prevent a class from being inherited? (e.g. using sealed on a c# class)
Right now I can't think of any.

Because writing classes to be substitutably extended is damn hard and requires you to make accurate predictions of how future users will want to extend what you've written.
Sealing your class forces them to use composition, which is much more robust.

How about if you are not sure about the interface yet and don't want any other code depending on the present interface? [That's off the top of my head, but I'd be interested in other reasons as well!]
Edit:
A bit of googling gave the following:
http://codebetter.com/blogs/patricksmacchia/archive/2008/01/05/rambling-on-the-sealed-keyword.aspx
Quoting:
There are three reasons why a sealed class is better than an unsealed class:
Versioning: When a class is originally sealed, it can change to unsealed in the future without breaking compatibility. (…)
Performance: (…) if the JIT compiler sees a call to a virtual method using a sealed types, the JIT compiler can produce more efficient code by calling the method non-virtually.(…)
Security and Predictability: A class must protect its own state and not allow itself to ever become corrupted. When a class is unsealed, a derived class can access and manipulate the base class’s state if any data fields or methods that internally manipulate fields are accessible and not private.(…)

I want to give you this message from "Code Complete":
Inheritance - subclasses - tends to
work against the primary technical
imperative you have as a programmer,
which is to manage complexity.For the sake of controlling complexity, you should maintain a heavy bias against inheritance.

The only legitimate use of inheritance is to define a particular case of a base class like, for example, when inherit from Shape to derive Circle. To check this look at the relation in opposite direction: is a Shape a generalization of Circle? If the answer is yes then it is ok to use inheritance.
So if you have a class for which there can not be any particular cases that specialize its behavior it should be sealed.
Also due to LSP (Liskov Substitution Principle) one can use derived class where base class is expected and this is actually imposes the greatest impact from use of inheritance: code using base class may be given an inherited class and it still has to work as expected. In order to protect external code when there is no obvious need for subclasses you seal the class and its clients can rely that its behavior will not be changed. Otherwise external code needs to be explicitly designed to expect possible changes in behavior in subclasses.
A more concrete example would be Singleton pattern. You need to seal singleton to ensure one can not break the "singletonness".

This may not apply to your code, but a lot of classes within the .NET framework are sealed purposely so that no one tries to create a sub-class.
There are certain situations where the internals are complex and require certain things to be controlled very specifically so the designer decided no one should inherit the class so that no one accidentally breaks functionality by using something in the wrong way.

#jjnguy
Another user may want to re-use your code by sub-classing your class. I don't see a reason to stop this.
If they want to use the functionality of my class they can achieve that with containment, and they will have much less brittle code as a result.
Composition seems to be often overlooked; all too often people want to jump on the inheritance bandwagon. They should not! Substitutability is difficult. Default to composition; you'll thank me in the long run.

I am in agreement with jjnguy... I think the reasons to seal a class are few and far between. Quite the contrary, I have been in the situation more than once where I want to extend a class, but couldn't because it was sealed.
As a perfect example, I was recently creating a small package (Java, not C#, but same principles) to wrap functionality around the memcached tool. I wanted an interface so in tests I could mock away the memcached client API I was using, and also so we could switch clients if the need arose (there are 2 clients listed on the memcached homepage). Additionally, I wanted to have the opportunity to replace the functionality altogether if the need or desire arose (such as if the memcached servers are down for some reason, we could potentially hot swap with a local cache implementation instead).
I exposed a minimal interface to interact with the client API, and it would have been awesome to extend the client API class and then just add an implements clause with my new interface. The methods that I had in the interface that matched the actual interface would then need no further details and so I wouldn't have to explicitly implement them. However, the class was sealed, so I had to instead proxy calls to an internal reference to this class. The result: more work and a lot more code for no real good reason.
That said, I think there are potential times when you might want to make a class sealed... and the best thing I can think of is an API that you will invoke directly, but allow clients to implement. For example, a game where you can program against the game... if your classes were not sealed, then the players who are adding features could potentially exploit the API to their advantage. This is a very narrow case though, and I think any time you have full control over the codebase, there really is little if any reason to make a class sealed.
This is one reason I really like the Ruby programming language... even the core classes are open, not just to extend but to ADD AND CHANGE functionality dynamically, TO THE CLASS ITSELF! It's called monkeypatching and can be a nightmare if abused, but it's damn fun to play with!

From an object-oriented perspective, sealing a class clearly documents the author's intent without the need for comments. When I seal a class I am trying to say that this class was designed to encapsulate some specific piece of knowledge or some specific service. It was not meant to be enhanced or subclassed further.
This goes well with the Template Method design pattern. I have an interface that says "I perform this service." I then have a class that implements that interface. But, what if performing that service relies on context that the base class doesn't know about (and shouldn't know about)? What happens is that the base class provides virtual methods, which are either protected or private, and these virtual methods are the hooks for subclasses to provide the piece of information or action that the base class does not know and cannot know. Meanwhile, the base class can contain code that is common for all the child classes. These subclasses would be sealed because they are meant to accomplish that one and only one concrete implementation of the service.
Can you make the argument that these subclasses should be further subclassed to enhance them? I would say no because if that subclass couldn't get the job done in the first place then it should never have derived from the base class. If you don't like it then you have the original interface, go write your own implementation class.
Sealing these subclasses also discourages deep levels of inheritence, which works well for GUI frameworks but works poorly for business logic layers.

Because you always want to be handed a reference to the class and not to a derived one for various reasons:
i. invariants that you have in some other part of your code
ii. security
etc
Also, because it's a safe bet with regards to backward compatibility - you'll never be able to close that class for inheritance if it's release unsealed.
Or maybe you didn't have enough time to test the interface that the class exposes to be sure that you can allow others to inherit from it.
Or maybe there's no point (that you see now) in having a subclass.
Or you don't want bug reports when people try to subclass and don't manage to get all the nitty-gritty details - cut support costs.

Sometimes your class interface just isn't meant to be inheirited. The public interface just isn't virtual and while someone could override the functionality that's in place it would just be wrong. Yes in general they shouldn't override the public interface, but you can insure that they don't by making the class non-inheritable.
The example I can think of right now are customized contained classes with deep clones in .Net. If you inherit from them you lose the deep clone ability.[I'm kind of fuzzy on this example, it's been a while since I worked with IClonable] If you have a true singelton class, you probably don't want inherited forms of it around, and a data persistence layer is not normally place you want a lot of inheritance.

Not everything that's important in a class is asserted easily in code. There can be semantics and relationships present that are easily broken by inheriting and overriding methods. Overriding one method at a time is an easy way to do this. You design a class/object as a single meaningful entity and then someone comes along and thinks if a method or two were 'better' it would do no harm. That may or may not be true. Maybe you can correctly separate all methods between private and not private or virtual and not virtual but that still may not be enough. Demanding inheritance of all classes also puts a huge additional burden on the original developer to foresee all the ways an inheriting class could screw things up.
I don't know of a perfect solution. I'm sympathetic to preventing inheritance but that's also a problem because it hinders unit testing.

I exposed a minimal interface to interact with the client API, and it would have been awesome to extend the client API class and then just add an implements clause with my new interface. The methods that I had in the interface that matched the actual interface would then need no further details and so I wouldn't have to explicitly implement them. However, the class was sealed, so I had to instead proxy calls to an internal reference to this class. The result: more work and a lot more code for no real good reason.
Well, there is a reason: your code is now somewhat insulated from changes to the memcached interface.

Performance: (…) if the JIT compiler sees a call to a virtual method using a sealed types, the JIT compiler can produce more efficient code by calling the method non-virtually.(…)
That's a great reason indeed. Thus, for performance-critical classes, sealed and friends make sense.
All the other reasons I've seen mentioned so far boil down to "nobody touches my class!". If you're worried someone might misunderstand its internals, you did a poor job documenting it. You can't possibly know that there's nothing useful to add to your class, or that you already know every imaginable use case for it. Even if you're right and the other developer shouldn't have used your class to solve their problem, using a keyword isn't a great way of preventing such a mistake. Documentation is. If they ignore the documentation, their loss.

Most of answers (when abstracted) state that sealed/finalized classes are tool to protect other programmers against potential mistakes. There is a blurry line between meaningful protection and pointless restriction. But as long as programmer is the one who is expected to understand the program, I see no hardly any reasons to restrict him from reusing parts of a class. Most of you talk about classes. But it's all about objects!
In his first post, DrPizza claims that designing inheritable class means anticipating possible extensions. Do I get it right that you think that class should be inheritable only if it's likely to be extended well? Looks as if you were used to design software from the most abstract classes. Allow me a brief explanation of how do I think when designing:
Starting from the very concrete objects, I find characteristics and [thus] functionality that they have in common and I abstract it to superclass of those particular objects. This is a way to reduce code duplicity.
Unless developing some specific product such as a framework, I should care about my code, not others (virtual) code. The fact that others might find it useful to reuse my code is a nice bonus, not my primary goal. If they decide to do so, it's their responsibility to ensure validity of extensions. This applies team-wide. Up-front design is crucial to productivity.
Getting back to my idea: Your objects should primarily serve your purposes, not some possible shoulda/woulda/coulda functionality of their subtypes. Your goal is to solve given problem. Object oriented languages uses fact that many problems (or more likely their subproblems) are similar and therefore existing code can be used to accelerate further development.
Sealing a class forces people who could possibly take advantage of existing code WITHOUT ACTUALLY MODIFYING YOUR PRODUCT to reinvent the wheel. (This is a crucial idea of my thesis: Inheriting a class doesn't modify it! Which seems quite pedestrian and obvious, but it's being commonly ignored).
People are often scared that their "open" classes will be twisted to something that can not substitute its ascendants. So what? Why should you care? No tool can prevent bad programmer from creating bad software!
I'm not trying to denote inheritable classes as the ultimately correct way of designing, consider this more like an explanation of my inclination to inheritable classes. That's the beauty of programming - virtually infinite set of correct solutions, each with its own cons and pros. Your comments and arguments are welcome.
And finally, my answer to the original question: I'd finalize a class to let others know that I consider the class a leaf of the hierarchical class tree and I see absolutely no possibility that it could become a parent node. (And if anyone thinks that it actually could, then either I was wrong or they don't get me).

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas