Understanding the Liskov Substitution Principle (LSP)

After reading this post I think I mostly understand LSP and most of the examples, but I can't say I'm 100% certain, because in my experience many examples of inheritance do violate LSP, and it seems difficult not to when overriding behaviour.
For instance, consider the following simple demonstration of inheritance, taken from Head First Object Oriented Analysis & Design. Aren't they violating LSP with the Jet child class?
public class Airplane {
    private int speed;

    public void setSpeed(int speed) {
        this.speed = speed;
    }

    public int getSpeed() {
        return speed;
    }
}
public class Jet extends Airplane {
    private static final int MULTIPLIER = 2;

    /**
     * The subclass can change behaviour of its superclass, as well as call the
     * superclass's methods. This is called overriding the superclass's behaviour.
     */
    public void setSpeed(int speed) {
        super.setSpeed(speed * MULTIPLIER);
    }

    public void accelerate() {
        super.setSpeed(getSpeed() * 2);
    }
}
A client using a reference of base type Airplane might be surprised, after setting the speed, to find the plane twice as fast as expected when it was actually passed an instance of Jet. Isn't Jet changing the postconditions of the setSpeed() method and thus violating LSP?
E.g.
void takeAirplane(Airplane airplane) {
    airplane.setSpeed(10);
    assert airplane.getSpeed() == 10;
}
This will clearly fail if takeAirplane is passed a reference to a Jet object.
It seems to me that it will be difficult not to violate LSP when “overriding a superclass’s behaviour”, yet this is one of the main/desirable features of inheritance!
Can someone explain or help clarify this? Am I missing something?

According to Wikipedia
[The Liskov substitution principle] states that, in a computer program, if S is a subtype of T, then objects of type T may be replaced with objects of type S (i.e., objects of type S may substitute objects of type T) without altering any of the desirable properties of that program (correctness, task performed, etc.).
In the case of the Jet, its speed being twice what was set is a violation of LSP: it fails the postcondition that getSpeed() == x after a call to setSpeed(x).
The Liskov Substitution Principle says that it's OK to modify behaviour in a derived class, provided the correctness of the program does not change. It imposes restrictions on the changes you can make in derived classes.
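To see what a compliant override can look like: a minimal sketch of one possible refactoring (mine, not the book's) that keeps Jet substitutable by leaving the inherited setSpeed() contract alone and putting the jet-specific behaviour in a new method:
public class Jet extends Airplane {
    private static final int MULTIPLIER = 2;

    // setSpeed() is inherited unchanged, so a client holding an Airplane
    // reference still observes getSpeed() == x after setSpeed(x).

    // Jet-specific behaviour lives in a new method; the base contract
    // places no conditions on it, so no client expectation can be broken.
    public void accelerate() {
        setSpeed(getSpeed() * MULTIPLIER);
    }
}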

Related

Why is overriding of static methods left out of most OOP languages?

It is certainly not for the sake of good OOP design - the need for common behaviour across all instances of a derived class is quite valid conceptually. Moreover, it would make for much cleaner code if one could just say Data.parse(file), keep the common parse() code in the base class, and let overriding do its magic, instead of implementing mostly similar code in all data subtypes and being careful to call DataSubtype.parse(file) - ugly ugly ugly
So there must be a reason - like performance?
As a bonus - are there OOP languages that do allow this?
Java-specific arguments are welcome as that's what I am used to - but I believe the answer is language-agnostic.
EDIT: ideally, one could write:
<T> void method(Iface<? extends T> ifaceImpl) {
    T.staticMeth(); // here the right override would be called
}
This will also fail due to erasure (in Java, at least) - with erasure at work, one would need to actually pass the class:
<T, K extends T> void method(Iface<K> ifaceImpl, Class<K> cls) {
    cls.staticMeth(); // compile error
}
Does it make sense? Are there languages doing this already? Is there a workaround apart from reflection?
Speaking to C++
class Foo {
public:
    static void staticFn(int i);
    virtual void virtFn(int i);
};
The virtual function is a member function - that is, it is called with a this pointer from which to look up the vtable and find the correct function to call.
The static function, explicitly, does not operate on a member, so there is no this object from which to look up the vtable.
When you invoke a static member function as above, you are explicitly providing a fixed, static, function pointer.
foo->virtFn(1);
expands out to something vaguely like
foo->_vtable[0](foo, 1);
while
foo->staticFn(1);
expands to a simple function call
Foo##staticFn(1);
The whole point of "static" is that it is object-independent. Thus it would be impossible to virtualize.
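The same fact is visible from the Java side, where a static method invoked through an instance reference is resolved at compile time against the declared type (it is hidden, not overridden). A minimal sketch (class names are mine):
class Base {
    static String who() { return "Base"; }
}

class Derived extends Base {
    static String who() { return "Derived"; } // hides Base.who(), no override
}

public class StaticDispatchDemo {
    public static void main(String[] args) {
        Base b = new Derived();
        // Resolved at compile time from the declared type Base:
        System.out.println(b.who()); // prints "Base", not "Derived"
    }
}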

'is instanceof' Interface bad design

Say I have a class A
class A
{
    Z source;
}
Now, the context tells me that 'Z' can be an instance of different classes (say, B and C) which don't share any common class in their inheritance trees.
I guess the naive approach is to make 'Z' an interface, and make classes B and C implement it.
But something still doesn't convince me, because every time an instance of class A is used, I need to know the type of 'source'. So everything ends in multiple 'ifs' doing 'instanceof' checks, which doesn't sound quite nice. Maybe in the future some other class will implement Z, and having hardcoded 'ifs' of this type could definitely break something.
The essence of the problem is that I cannot resolve the issue by adding functions to Z, because the work done with each instance type of Z is different.
I hope someone can give me some advice, maybe about some useful design pattern.
Thanks
Edit: The work 'someone' does when it gets an instance of A is totally different depending on the class behind the interface Z. That's the problem: the entity that does the 'important job' is not Z, it's someone else who wants to know who Z is.
Edit2: Maybe a concrete example would help:
class Picture
{
    Artist a;
}

interface Artist
{
}

class Human : Artist { }

class Robot : Artist { }
Now somewhere I have an instance of Picture,
Picture p = getPicture();
// Now, depending on the type of p.a, different jobs are done;
// no data or logic inside Human or Robot matters here
The point of using an interface is to hide these different implementations; A should just know the intent or high-level purpose of the method(s).
The work done by each implementation of Z may be different, but the method signature used to invoke that work can be the same. Class A can just call method Z.foo(), and depending on whether the implementation of Z is B or C, different code will be executed.
The only time you need to know the real implementation type is when you need to carry out completely unrelated processing on the two different types, and they don't share an interface. But in that case, why are they being processed by the same class A? Now, there are cases where this may make sense, such as when B and C are classes generated from XML Schemas, and you can't modify them - but generally it indicates that the design can be improved.
Updated now that you've added the Picture example. I think this confirms my point - although the implementation of getPicture() is different, the purpose and the return type are the same. In both cases, the Artist returns a Picture.
If the caller wants to treat Robot-created and Human-created pictures in the same way, then they use the Artist interface. They do not need to distinguish between Human or Robot, because they just want a picture! The details of how the picture is created belong in the subclass, and the caller should not see these details. If the caller cares about precisely how a picture is created, then the caller should paint it, not the Robot or Human, and the design would be quite different.
If your subclasses are performing totally unrelated tasks (and this is not what your Artist example shows!) then you might use a very vague interface such as the standard Java Runnable; in this case, the caller really has no idea what the run() method will do - it just knows how to run things that are Runnable.
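Returning to the Artist example, a minimal sketch of the polymorphic version (in Java; the create() method and the Gallery caller are my additions, since the question deliberately leaves Artist empty):
class Picture {
    Artist a; // as in the question
}

interface Artist {
    Picture create(); // callers depend only on this intent
}

class Human implements Artist {
    public Picture create() { /* paint by hand */ return new Picture(); }
}

class Robot implements Artist {
    public Picture create() { /* render procedurally */ return new Picture(); }
}

class Gallery {
    // No instanceof anywhere: any current or future Artist works unchanged.
    Picture commission(Artist artist) {
        return artist.create();
    }
}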
Links
The following questions/articles suggest some alternatives to instanceof:
Avoiding instanceof in Java
Alternative to instanceof approach in this case
And the following article also gives example code, using an example that seems similar to yours:
http://www.javapractices.com/topic/TopicAction.do?Id=31
and the following articles discuss the tradeoffs of instanceof versus other approaches such as the Visitor pattern and Acyclic Visitor:
https://sites.google.com/site/steveyegge2/when-polymorphism-fails
http://butunclebob.com/ArticleS.UncleBob.VisitorVersusInstanceOf
I think you need to post more information, because as it stands what I see is a misunderstanding of OOP principles. If you used a common interface type, then by the Liskov substitution principle it shouldn't matter which type source is.
I'm gonna call your A, B, and C classes Alpha, Beta, and Gamma.
Perhaps Alpha can be split into two versions, one which uses Betas and one which uses Gammas. This would avoid the instanceof checks in Alpha, which as you've surmised are indeed a code smell.
abstract class Alpha
{
    abstract void useSource();
}

class BetaAlpha extends Alpha
{
    Beta source;
    void useSource() { source.doSomeBetaThing(); }
}

class GammaAlpha extends Alpha
{
    Gamma source;
    void useSource() { source.doSomeGammaThing(); }
}
In fact this is extremely common. Consider a more concrete example of a Stream class that can use either Files or Sockets. And for the purpose of the example, File and Socket are not derived from any common base class. In fact they may not even be under our control, so we can't change them.
abstract class Stream
{
    abstract void open();
    abstract void close();
}

class FileStream extends Stream
{
    File file;
    void open() { file.open(); }
    void close() { file.close(); }
}

class SocketStream extends Stream
{
    Socket socket;
    void open() { socket.connect(); }
    void close() { socket.disconnect(); }
}
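A brief usage sketch of the classes above (the transfer() caller is my addition): callers depend only on Stream, so adding another source type never touches this code.
// Callers program against the Stream abstraction; no instanceof needed.
static void transfer(Stream s) {
    s.open();
    // ... move the bytes ...
    s.close();
}

// The same caller handles both wrappers, and any future Stream subclass:
// transfer(new FileStream(...));
// transfer(new SocketStream(...));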

Are there adverse effects of passing around objects rather than assigning them as members of a class

I have a habit of creating classes that tend to pass objects around to perform operations on them rather than assigning them to a member variable and having operations refer to the member variable. It feels much more procedural to me than OO.
Is this a terrible practice? If so, what are the adverse effects (performance, memory consumption, more error-prone)? Is it simply easier and more closely aligned to OO principles like encapsulation to favour member variables?
A contrived example of what I mean is below. I tend to do the following:
public class MyObj
{
    public MyObj() {}

    public void DoVariousThings(OtherObj oo)
    {
        if (Validate(oo))
        {
            Save(oo);
        }
    }

    private bool Validate(OtherObj oo)
    {
        // Do stuff related to validation
    }

    private bool Save(OtherObj oo)
    {
        // Do stuff related to saving
    }
}
whereas I suspect I should be doing the following:
public class MyObj
{
    private OtherObj _oo;

    public MyObj(OtherObj oo)
    {
        _oo = oo;
    }

    public void DoVariousThings()
    {
        if (Validate())
        {
            Save();
        }
    }

    private bool Validate()
    {
        // Do stuff related to validation with _oo
    }

    private bool Save()
    {
        // Do stuff related to saving with _oo
    }
}
If you write your programs in an object oriented language, people will expect object oriented code. As such, in your first example, they would probably expect that the reason for making oo a parameter is that you will use different objects for it all the time. In your second example, they would know that you always use the same instance (as initialized in the constructor).
Now, if you use the same object all the time, but still write your code like in your first example, you will have them thoroughly confused. When an interface is well designed, it should be obvious how to use it. This is not the case in your first example.
I think you already answered your question yourself, you seem to be aware of the fact that the 2nd approach is more favorable in general and should be used (unless there are serious reasons for the first approach).
Advantages that come to my mind immediately:
Simplified readability and maintainability, both for you and for others
Only one entry point, therefore only one place where checks like != null are needed
In case you want to put that class under test, it's way easier, i.e., getting something like this (extracting interface IOtherObj from OtherObj and working with that):
public MyObj(IOtherObj oo)
{
    if (oo == null) throw ...
    _oo = oo;
}
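As a hedged sketch of the testing payoff (in Java, to match the other examples on this page; the isValid() member and the stub class are my inventions):
interface IOtherObj {
    boolean isValid(); // assumed member, purely for illustration
}

class StubOtherObj implements IOtherObj {
    public boolean isValid() { return true; } // canned answer for the test
}

// The class under test can now be exercised without constructing
// a real OtherObj (database, file handle, etc.):
// MyObj subject = new MyObj(new StubOtherObj());
// subject.doVariousThings();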
Talking of the adverse effects of your way: there are none, but only as long as you keep the program and the code to yourself. What you are doing is NOT the standard thing, so if after some time you start writing libraries and code that may be used by others, it becomes a big problem: they may pass any old object and hope that it will work.
With your way, the object has to be validated before it is passed in, and you must handle the cases where validation fails; with the standard OOP way, there is no need for that validation, nor for handling the cases where an object of an inappropriate type is passed.
In a nutshell, your way is bad for:
1. code re-usability;
2. exception handling (you have to handle more of them);
3. anything beyond code you keep to yourself - otherwise, it is not good practice.
Hope this clears some doubt.

Why is statically resolving virtual method calls so difficult?

Suppose we have the following pseudo code. I am talking about OO languages.
class A {
    foo() {...}
}

class B extends A {
    foo() {...}
}

class C extends B {
    foo() {...}
}

static void f()
{
    A a = new A();
    a = new B();
    a.foo();
}
It's easy for us to recognize that a.foo() calls the foo() overridden in class B. So why is it hard for compilers to discover this by static analysis? The fundamental question here is: why is it hard for a compiler to statically determine the runtime type of a?
The example you posted is extremely simplistic and does not show anything that requires a virtual method call. With your same classes, examine this function;
void bar(A* a) {
    a->foo();
}
There is no way the compiler can tell at compile-time if a is an instance of B, or C, or a plain A. That can only be decided at runtime in the general case.
The compiler can't even know if there will be new classes derived from A at some future point that will be linked with this code.
Just imagine:
A a = createInstanceFromString("B");
Now you're screwed.
On a serious note, your example is way too simplistic. Imagine that the right-hand side of an assignment is a call to a function defined in some other "module" (whatever that means). The compiler would have to inspect all execution paths to determine the exact type of the return value, but that is prohibitively expensive and sometimes downright impossible.
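In Java terms the createInstanceFromString idea is entirely real. A small sketch (the property name is hypothetical): the concrete class is chosen by a string that may come from configuration or user input, so no static analysis of this source can pin the dispatch target down.
static A load() throws Exception {
    // The concrete class is chosen by data only available at runtime:
    String name = System.getProperty("impl", "B"); // hypothetical config key
    return (A) Class.forName(name).getDeclaredConstructor().newInstance();
}

// load().foo() -- the compiler cannot tell whether A.foo, B.foo, or C.foo
// will run, nor even enumerate every class that might be loaded.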

Encapsulation. Well-designed class

Today I read a book in which the author wrote that in a well-designed class the only way to access attributes is through one of that class's methods. Is this a widely accepted idea? Why is it so important to encapsulate attributes? What could be the consequences of not doing it? I read somewhere earlier that this improves security, or something like that. Any example in PHP or Java would be very helpful.
Is it a widely accepted thought?
In the object-oriented world, yes.
Why is it so important to encapsulate the attributes? What could be the consequences of not doing it?
Objects are intended to be cohesive entities containing data and behavior that other objects can access in a controlled way through a public interface. If a class does not encapsulate its data and behavior, it no longer has control over how the data is accessed and cannot fulfill the contracts with other objects that its public interface implies.
One of the big benefits is that if a class has to change internally, the public interface doesn't have to change. That way no calling code breaks, and other classes can continue using it as before.
Any example in PHP or Java would be very helpful.
Here's a Java example:
public class MyClass {
    // Should not be < 0
    public int importantValue;
    ...
    public void setImportantValue(int newValue) {
        if (newValue < 0) {
            throw new IllegalArgumentException("value cannot be < 0");
        }
        importantValue = newValue;
    }
    ...
}
The problem here is that because I haven't encapsulated importantValue by making it private rather than public, anyone can come along and circumvent the check I put in the setter to prevent the object from having an invalid state. importantValue should never be less than 0, but the lack of encapsulation makes it impossible to prevent it from being so.
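A two-line illustration of that loophole (the calling code is my addition):
MyClass m = new MyClass();
// Nothing stops a caller from skipping the setter entirely:
m.importantValue = -42; // invalid state, and no exception is thrown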
What could be the consequences of not doing it?
The whole idea behind encapsulation is that all knowledge of anything related to the class (other than its interface) is within the class itself. For example, allowing direct access to attributes puts the onus of making sure any assignments are valid on the code doing the assigning. If the definition of what's valid changes, you have to go through and audit everything using the class to make sure they conform. Encapsulating the rule in a "setter" method means you only have to change it in one place, and any caller trying anything funny can get an exception thrown at it in return. There are lots of other things you might want to do when an attribute changes, and a setter is the place to do it.
Whether or not allowing direct access for attributes that don't have any rules to bind them (e.g., anything that fits in an integer is okay) is good practice is debatable. I suppose that using getters and setters is a good idea for the sake of consistency, i.e., you always know that you can call setFoo() to alter the foo attribute without having to look up whether or not you can do it directly. They also allow you to future-proof your class so that if you have additional code to execute, the place to put it is already there.
Personally, I think having to use getters and setters is clumsy-looking. I'd much rather write x.foo = 34 than x.setFoo(34), and I look forward to the day when some language comes up with the equivalent of database triggers for members, allowing you to define code that fires before, after, or instead of assignments.
Opinions on how "good OOD" is achieved are a dime a dozen, and even very experienced programmers and designers tend to disagree about design choices and philosophies. This could be a flame-war starter if you ask people across a wide variety of language backgrounds and paradigms.
And yes, in theory, theory and practice are the same, so language choice shouldn't influence high-level design very much. But in practice it does, and good and bad things happen because of that.
Let me add this:
It depends. Encapsulation (in a supporting language) gives you some control over how your classes are used, so you can tell people: this is the API, and you have to use it. In other languages (e.g. Python) the difference between the official API and informal (subject to change) interfaces is by naming convention only (after all, we're all consenting adults here).
Encapsulation is not a security feature.
Another thought to ponder
Encapsulation with accessors also provides much better maintainability in the future. In Feanor's answer above, it works great for enforcing validity checks (assuming your instance variable is private), but it can have much further-reaching benefits.
Consider the following scenario:
1) you complete your application, and distribute it to some set of users (internal, external, whatever).
2) BigCustomerA approaches your team and wants an audit trail added to the product.
If everyone is using the accessor methods in their code, this becomes almost trivial to implement. Something like so:
MyAPI Version 1.0
public class MyClass {
    private int importantValue;
    ...
    public void setImportantValue(int newValue) {
        if (newValue < 0) {
            throw new IllegalArgumentException("value cannot be < 0");
        }
        importantValue = newValue;
    }
    ...
}
MyAPI V1.1 (now with audit trails)
public class MyClass {
    private int importantValue;
    ...
    public void setImportantValue(int newValue) {
        if (newValue < 0) {
            throw new IllegalArgumentException("value cannot be < 0");
        }
        this.addAuditTrail("importantValue", importantValue, newValue);
        importantValue = newValue;
    }
    ...
}
Existing users of the API make no changes to their code and the new feature (audit trail) is now available.
Without encapsulation using accessors, you're faced with a huge migration effort.
When coding it for the first time, it will seem like a lot of work. It's much faster to type class.varName = something than class.setVarName(something); but if everyone took the easy way out, getting paid for BigCustomerA's feature request would be a huge effort.
In Object Oriented Programming there is a principle known as the Open/Closed Principle (http://en.wikipedia.org/wiki/Open/closed_principle):
OCP states that a well-designed class should be open for extension (inheritance) but closed for modification of internal members (encapsulation). It means that you should not be able to modify the state of an object without going through its public interface.
So, modern languages only modify internal variables (fields) through properties (getter and setter methods in C++ or Java). In C#, properties compile to methods in MSIL.
C#:
int _myproperty = 0;

public int MyProperty
{
    get { return _myproperty; }
    set
    {
        if (_someVariable == someConstantValue) { _myproperty = value; }
        else { _myproperty = _someOtherValue; }
    }
}
C++/Java:
int _myproperty = 0;

public void setMyProperty(int value)
{
    if (value == someConstantValue) { _myproperty = value; }
    else { _myproperty = _someOtherValue; }
}

public int getMyProperty()
{
    return _myproperty;
}
Take these ideas (from Head First C#):
Think about ways the fields can be misused. What can go wrong if they're not set properly?
Is everything in your class public? Spend some time thinking about encapsulation.
What fields require processing or calculation? They are prime candidates.
Only make fields and methods public if you need to. If you don't have a reason to declare something public, don't.