Question about LSP (Liskov Substitution Principle) and subtypes - oop

LSP says that
if q(x) is a property provable about objects x of type T then q(y) should be true for objects y of type S where S is a subtype of T.
I can rephrase it as follows:
q(x) is true for any x of T => q(y) is true for any y of any subtype of T
Now what about another statement?
q(x) is true for any x of T and q(y) is true for any y of S => S is a subtype of T
Does it make sense? Can we use it as a definition of subtype?
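For reference, both statements written symbolically:
LSP (for S a subtype of T):   (∀ x:T. q(x)) ⇒ (∀ y:S. q(y))
Proposed converse:   (∀ x:T. q(x)) ∧ (∀ y:S. q(y)) ⇒ S is a subtype of T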

q(x) is true for any x of T and q(y) is true for any y of S => S is a subtype of T
The answer is no. What the expression means is that a common supertype R of S and T could be defined, and that the LSP (shame on how that name became mainstream) would then hold for T->R and S->R.
In type theory, there are types, which include semantics, and there are implementations of those types that abide by the semantics, perhaps by inheriting implementations.
In practice, the only reasonable way to specify the semantics of a type (the q(x) part) is through an implementation. So we are left with semantics-free signatures in the form of interfaces, and classes that inherit for implementation purposes and implement whatever interfaces they like, with no way to check whether they are doing so correctly.
Researchers have tried to define formal languages for specifying types, so that tools can check whether an implementation abides by a type definition, but the effort is so large that one might as well compile the formal language into executable code. It's a Catch-22 situation that I think will never be solved.
Back to your original question, in languages that allow what today is called "Duck Typing", the answer is undecidable, because an object of any type can be passed to any function, and the typing is right if the correct signatures are implemented and the result is right. Let me explain...
In a language like Eiffel you could place a postcondition on List.append() that List.length() must increase after the operation. That is not the way languages like Perl, JavaScript, Python, or even Java work. That lack of type-strictness allows for much more succinct code than stricter type definitions would.
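For illustration, a rough Java approximation of that Eiffel-style contract (a sketch using a plain runtime assertion, not Eiffel's built-in design-by-contract support):

import java.util.ArrayList;
import java.util.List;

// Emulates Eiffel's "ensure count = old count + 1" on append.
// Run with assertions enabled (java -ea) for the check to fire.
class CheckedList<T> {
    private final List<T> items = new ArrayList<>();

    public void append(T item) {
        int oldLength = length();   // capture the "old" value, like Eiffel's `old`
        items.add(item);
        assert length() == oldLength + 1 : "postcondition violated: length must increase";
    }

    public int length() {
        return items.size();
    }
}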

It does not make sense; your statement using and is symmetric in S and T.
But I think you meant to say the following
If, for every property q such that q(x) is provable for all x of type T, q(y) is also provable for all y of type S, then we may consider S a subtype of T.
I would prefer to use mathematical logic rather than informal English, but if I have got the definition right, this is behavioral subtyping, which these days is often called "duck typing." It's a perfectly good subtyping principle and again leads to the idea that in any context that expects a value of type T, you may instead supply a value of type S, and it's OK because the value of type S is guaranteed to satisfy all properties that are expected by the context.
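In notation (my transcription of the definition above), S may be regarded as a subtype of T exactly when, for every property q:
(∀ x:T. q(x)) ⇒ (∀ y:S. q(y))
Quantifying over every property q is what breaks the symmetry that made the plain "and" formulation in the question unusable as a definition.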

I think no, you can't use it as a definition. Besides, if q(x) is true for any x of T and q(y) is true for any y of S, it could just as well mean that T is a subtype of S.
To be sure which is a subtype of which (assuming you know there is an inheritance relationship between them), you also have to know something about which is more "generic" and which is more "specialized" than the other.

The rule for preconditions/postconditions of derivatives

In his paper about the LSP, Uncle Bob mentions:
Now the rule for the preconditions and postconditions for derivatives, as stated by Meyer, is:
...when redefining a routine [in a derivative], you may only replace its
precondition by a weaker one, and its postcondition by a stronger one.
How could I tell if the preconditions/postconditions of a subtype instance object's method are respectively weaker/stronger than those of the supertype's method?
To formulate it without rigorous definitions:
If your parent class's routine promises to handle something, the child's routine must provide at least the same functionality.
If your routine promises to handle all inputs that are greater than zero, your derived routine must also accept all those, or more inputs.
That means that the precondition can only be weaker.
Similarly, the postcondition must be stronger. That means that you are not allowed to return a negative number in your derived routine, if the original routine promised that it will always return a positive number.
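To make "weaker" and "stronger" slightly more formal (my phrasing), for every input x and result r:
pre_parent(x) ⇒ pre_child(x)   (the child accepts at least every input the parent accepts)
post_child(x, r) ⇒ post_parent(x, r)   (every result the child may produce satisfies the parent's promise)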
If you were to require more than what the parent requires (i.e. if you had a stronger prerequisite), then you could not be sure that you could always call that routine. Let's say that B and C are subclasses of A. Sometimes, you might have an Object of type A, which could be also actually a B or a C. If C had stronger prerequisites than A, you could run into issues when calling the routine on that Object.
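A minimal Java sketch of the rule (the classes and the exact conditions are hypothetical, chosen only to make the direction of the implications visible):

// Base contract: requires input > 0, promises result >= 0.
class Base {
    int process(int input) {
        if (input <= 0) throw new IllegalArgumentException("requires input > 0");
        return input - 1;   // never negative for any valid input
    }
}

// Derived: weaker precondition (also accepts 0),
// stronger postcondition (result is strictly positive, which still satisfies >= 0).
class Derived extends Base {
    @Override
    int process(int input) {
        if (input < 0) throw new IllegalArgumentException("requires input >= 0");
        return input + 1;   // always > 0
    }
}

Any caller written against Base - passing only positive inputs and relying only on a non-negative result - keeps working when handed a Derived.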
I'm sorry if I didn't use the usual terminology, I can't really recall that so I just tried to stick with what makes sense to me. (It's been two years since I last attended a lecture by Bertrand Meyer)

Why is type inference impractical for object oriented languages?

I'm currently researching ideas for a new programming language where ideally I would like the language to mix some functional and procedural (object oriented) concepts.
One of the things that I'm really fascinated about with languages like Haskell is that it's statically typed, but you do not have to annotate types (magic thanks to Hindley-Milner!).
I would really like this for my language; however, after reading up on the subject, it seems most agree that type inference is impractical/impossible with subtyping/object orientation, yet I have not understood why. I do not know F#, but I understand that it uses Hindley-Milner AND is object-oriented.
I would really like an explanation of this, and preferably examples of scenarios where type inference is impossible for object-oriented languages.
To add to seppk's response: with structural object types the problem he describes actually goes away (f could be given a polymorphic type like ∀A ≤ {x : Int, y : Int}. A → Int, or even just {x : Int, y : Int} → Int). However, type inference is still problematic.
The fundamental reason is this: in a language without subtyping, the typing rules impose equality constraints on types. These are very nice to deal with during type checking, because they can usually be simplified immediately using unification on the types. With subtyping, however, these constraints are generalised to inequality constraints. You cannot use unification any more, which has at least three unpleasant consequences:
The number and complexity of constraints explodes combinatorially.
The information you have to display to the user in case of errors is incomprehensible.
Certain forms of quantification can quickly become undecidable.
Thus, type inference for subtyping is not impossible (there have been many papers on the subject in the 90s), but it is not very practical.
A much simpler alternative is employed by OCaml, which uses so-called row polymorphism in place of subtyping. That is actually tractable.
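To make the contrast concrete (informal notation, mine): in plain Hindley-Milner, an application f e generates an equality constraint that unification solves away immediately, while with subtyping it only generates an inequality that must be carried around and simplified:
without subtyping:  typeof(e) = dom(typeof(f))
with subtyping:     typeof(e) ≤ dom(typeof(f))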
When using nominal typing (that is, a type system where two classes whose members have the same names and the same types are still not interchangeable), there would be many possible types for a method like the following:
let f(obj) =
    obj.x + obj.y
Any class that has both a member x and a member y (of types that support the + operator) would qualify as a possible type for obj and the type inference algorithm would have no way of knowing which one is the one you want.
In F# the above code would need a type annotation. So F# has object orientation and type inference, but not at the same time (with the exception of local type inference, as in let myVar = expressionWhoseTypeIKnow, which always works).
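For contrast, in a nominally typed language such as Java you cannot even write that function without naming a type up front, so there is nothing left for inference to discover about obj; a hypothetical sketch:

// The programmer must invent and name an interface first;
// only classes that explicitly implement HasXY can be passed to f.
interface HasXY {
    int x();
    int y();
}

class Demo {
    static int f(HasXY obj) {
        return obj.x() + obj.y();
    }
}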

Is every method returning `this` a monad?

Is every method on a class which returns this a monad?
I'm going to say a very cautious "possibly". A lot of this is contingent on your definitions.
It's worth noting that I'm taking the definition of monad from the category theory construct, not the functional programming construct.
If you think of a method A of class C that maps a C instance to another C instance (i.e. it returns this), then it would appear that C.A() is a functor from the category consisting of C instances to itself. Therefore it's an endofunctor, at least. It would appear that this construction obeys the basic identity and associativity properties that we expect, but further inspection would be required to say for sure.
Anyway, I wouldn't stake my life on it, and I'm not certain this is a very helpful way about thinking of such constructions, but it does seem a reasonable assumption on first inspection, at least.
I have limited understanding of monads. I can't tell if that meets the formal definition of a monad (I don't think so, but I don't know for sure), but return this; alone doesn't allow any of the cool things monads allow (fluent interfaces are nice, but not monads imho, and nowhere near as useful as even simple monads like the option type monad).
This snippet from wikipedia seems to say "no":
Formally, a monad is constructed by defining two operations (bind and return) and a type constructor M [... further restrictions we don't need here]
Edit: Moreover, a monad is a type and not an operation (e.g. method) - the question should rather read "Is a class a monad if all of its methods return this?"
Probably not, at least not in any of the usual ways.
Monads in programming are typically defined over a category of types with functions as arrows. In that case, a method returning this is an arrow from the class to itself--this is an endomorphism with the usual monoid of function composition, but is not a functor.
Note that functors involving function types are certainly possible, but a functor F(A) => (A -> A) doesn't really work because the type appears in both covariant and contravariant position, that is, given a function A -> B you can send A -> A to A -> B, or you can send B -> B to A -> B, but you can't get a B -> B from A -> A or vice versa.
However, there is one way to view instances as having monadic structure. Consider that instance methods effectively have this as an implicit argument. So for some class C, its methods are functions from C to whatever other type. This corresponds roughly to the covariant function functor above. Note that I'm not describing any particular class here, but the entire concept of classes and instances! So, for this mapping from C to instance methods of C:
If we have an instance method returning some type A and a function with type A -> B, we can trivially define a method returning something of type B: that's the rest of the functor definition, a.k.a. fmap in Haskell.
If we have some value of type A, we can add a trivial instance method that just returns that value: that's the monad's "unit" operation, a.k.a. return in Haskell.
If we have an instance method returning a value of type A, and another instance method taking an argument of type A and returning a value of type B, we can define a method that simply returns a value of type B by combining them. That's the monadic bind, a.k.a. (>>=) in Haskell.
Haskell calls the monad of "functions that all take a first argument of some fixed type" the Reader Monad, and the do notation for it lets you write code where that first argument is implicitly available--rather like the way that this is implicitly available inside instance methods.
The difference here is that with class instances, the monadic structure is... sort of at the level of the syntax, not something you can use directly in a program, at least not in most languages.
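To make that view concrete, here is a rough Java sketch (the type and the names are mine): treating "an instance method of C returning A" as a value of type Function<C, A> gives exactly the unit/map/bind structure described above - in other words, the Reader monad with the receiver as the environment.

import java.util.function.Function;

// "An instance method of C returning A", reified as a value.
final class MethodOf<C, A> {
    final Function<C, A> run;

    MethodOf(Function<C, A> run) { this.run = run; }

    // unit / return: ignore the receiver and yield a constant
    static <C, A> MethodOf<C, A> unit(A value) {
        return new MethodOf<C, A>(c -> value);
    }

    // fmap: post-compose with an ordinary function A -> B
    <B> MethodOf<C, B> map(Function<A, B> f) {
        return new MethodOf<C, B>(c -> f.apply(run.apply(c)));
    }

    // bind (>>=): run this "method", feed its result to f, and run the
    // resulting "method" against the same receiver
    <B> MethodOf<C, B> bind(Function<A, MethodOf<C, B>> f) {
        return new MethodOf<C, B>(c -> f.apply(run.apply(c)).run.apply(c));
    }
}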
In my opinion, No.
There are at least two issues I see with it.
A monad is often glue between two functions. In this case, methodA returns a type on which the next methodB is invoked (and of course methodA and methodB both belong to the same type).
A monad is supposed to allow type transformations. So if functionA returns TypeX and functionB expects TypeY, the monad needs to provide a bind operation which can convert a Monad(TypeX) into a Monad(TypeY). The monad then goes on to take the return value of the first function, wrap it as a Monad(TypeX), transform it to Monad(TypeY) from which TypeY would get extracted and fed into functionB.
A method which returns this is actually an implementation of a Fluent Interface. And while many have argued that it is monadic as well, I would only say that while it helps resolve problems similar to what monads could otherwise solve, and while the solution would seem similar to how a monadic solution might work (instead of the "." operator, the bind method of the monad is invoked, without any explicit do block), it is not a monad. In other words, it may walk like a monad and talk like a monad, but it is not a monad.
Slight correction to point 2: the monad needs to provide mechanisms to (a) convert TypeX into Monad(TypeX), (b) transform Monad(TypeX) into Monad(TypeY), and (c) coerce Monad(TypeY) back to TypeY.
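For comparison, a minimal fluent interface in Java (a sketch of mine): every call simply returns this so that calls chain, but there is no bind, no type transformation, and no wrapping of values - which is why it walks and talks like a monad without being one.

class QueryBuilder {
    private final StringBuilder sql = new StringBuilder("SELECT *");

    QueryBuilder from(String table) {
        sql.append(" FROM ").append(table);
        return this;   // returning `this` is all the "chaining" there is
    }

    QueryBuilder where(String condition) {
        sql.append(" WHERE ").append(condition);
        return this;
    }

    String build() {
        return sql.toString();
    }
}

// Usage: new QueryBuilder().from("users").where("age > 18").build();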

What is the difference between the concept of 'class' and 'type'?

I know this question has already been asked, but I didn't quite get it. I would like to know which is the more fundamental one: class or type. I have a few questions; please clear these up for me:
1. Is type the base concept behind a programming language's data types?
2. Is a type hard-coded into the language itself, while a class is something we can define ourselves?
3. What are untyped languages? Please give some examples.
4. Type is not something that falls only under OOP concepts - I mean, it is not restricted to the OOP world, right?
Please clear this up for me, thanks.
I haven't worked with many languages. Maybe my questions only make sense in terms of Java, C#, and Objective-C.
1/ I think "type" is essentially what people mean when they talk about a "data type".
2/ No. We can define both types and classes ourselves. An object of class A has type A. For example, if we define String s = "123"; then s has type String and belongs to class String. But the reverse is not always true.
For example:
class B {}
class A extends B {}
B b = new A();
then you can say b has type B and belongs to both class A and class B. But b doesn't have type A.
3/ An untyped (more precisely, dynamically typed) language is one that allows you to change the type of a variable, as in JavaScript:
var s = "123"; // type string
s = 123; // then type integer
4/ I don't know much about this, but I think it is not restricted to OOP. It applies to procedural programming as well.
It may well depend on the language. I treat types and classes as the same thing in OO, only making a distinction between class (the definition of a family of objects) and instance (or object), specific concrete occurrences of a class.
I come originally from a C world where there was no real difference between language-defined types like int and types that you made yourself with typedef or struct.
Likewise, in C++, there's little difference (probably none) between std::string and any class you put together yourself, other than the fact that std::string will almost certainly be bug-free by now. The same can't always be said of our own code :-)
I've heard people suggest that types are classes without methods but I don't believe that distinction (again because of my C/C++ background).
There is a fundamental difference in some languages between integral (in the sense of integrated rather than integer) types and class types. Classes can be extended but int and float (examples for C++) cannot.
In OOP languages, a class specifies the definition of an object. In many cases, that object can serve as a type for things like parameter matching in a function.
So, for an example, when you define a function, you specify the type of data that should be passed to the function and the type of data that is returned:
int AddOne(int value) { return value+1; } uses int types for the return value and the parameter being passed in.
In languages that have both, the concepts of type and class/object can almost become interchangeable. However, there are many languages that do not have both. For instance, I believe that standard C has no support for custom-defined objects, but it certainly does still have types. On the other hand, both PHP and JavaScript are examples of languages where type is very loosely defined (basically, types are either single item, collection/array/object, or undefined [JS only]), but they have full support for classes/objects.
Another key difference: you can have methods and custom-functions associated with a class/object, but not with a standard data-type.
Hopefully that clarified some. To answer your specific questions:
In some ways, type could be considered a base concept of programming, yes.
Yes, with the exception that classes can be treated as types in functions, as in the example above.
An untyped language is one that lets you use any type of variable interchangeably. Meaning that you can handle a string with the same code that handles an int, for instance. In practice most 'untyped' languages actually implement a concept called duck-typing, so named because they say that 'if it acts like a duck, it should be treated like a duck' and attempt to use any variable as the type that makes sense for the code encountered. Again, php and javascript are two languages which do this.
Very true, type is applicable outside of the OOP world.

Non-nullable reference types

I'm designing a language, and I'm wondering if it's reasonable to make reference types non-nullable by default, and use "?" for nullable value and reference types. Are there any problems with this? What would you do about this:
class Foo {
    Bar? b;
    Bar b2;
    Foo() {
        b.DoSomething();  // valid, but will cause exception
        b2.DoSomething(); // ?
    }
}
My current language design philosophy is that nullability should be something a programmer is forced to ask for, not given by default on reference types (in this, I agree with Tony Hoare - Google for his recent QCon talk).
On this specific example, with the non-nullable b2, it wouldn't even pass static checks: conservative analysis cannot guarantee that b2 isn't null before it is used, so the program is not semantically meaningful.
My ethos is simple enough. References are an indirection handle to some resource, which we can traverse to obtain access to that resource. Nullable references are either an indirection handle to a resource, or a notification that the resource is not available, and one is never sure up front which semantics are being used. This gives either a multitude of checks up front (Is it null? No? Yay!), or the inevitable NPE (or equivalent). Most programming resources are, these days, not massively resource constrained or bound to some finite underlying model - null references are, simplistically, one of...
Laziness: "I'll just bung a null in here". Which frankly, I don't have too much sympathy with
Confusion: "I don't know what to put in here yet". Typically also a legacy of older languages, where you had to declare your resource names before you knew what your resources were.
Errors: "It went wrong, here's a NULL". Better error reporting mechanisms are thus essential in a language
A hole: "I know I'll have something soon, give me a placeholder". This has more merit, and we can think of ways to combat this.
Of course, solving each of the cases that NULL currently caters for with a better linguistic choice is no small feat, and may add more confusion than it helps. We can always go to immutable resources, so NULL in its only useful states (error, and hole) isn't of much real use. Imperative techniques are here to stay though, and I'm frankly glad - this makes the search for better solutions in this space worthwhile.
Having reference types be non-nullable by default is the only reasonable choice. We are plagued by languages and runtimes that have screwed this up; you should do the Right Thing.
This feature was in Spec#. They defaulted to nullable references and used ! to indicate non-nullables. This was because they wanted backward compatibility.
In my dream language (of which I'd probably be the only user!) I'd make the same choice as you, non-nullable by default.
I would also make it illegal to use the . operator on a nullable reference (or anything else that would dereference it). How would you use them? You'd have to convert them to non-nullables first. How would you do this? By testing them for null.
In Java and C#, the if statement can only accept a bool test expression. I'd extend it to accept the name of a nullable reference variable:
if (myObj)
{
    // in this scope, myObj is non-nullable, so can be used
}
This special syntax would be unsurprising to C/C++ programmers. I'd prefer a special syntax like this to make it clear that we are doing a check that modifies the type of the name myObj within the truth-branch.
I'd add a further bit of sugar:
if (SomeMethodReturningANullable() into anotherObj)
{
// anotherObj is non-nullable, so can be used
}
This just gives the name anotherObj to the result of the expression on the left of the into, so it can be used in the scope where it is valid.
I'd do the same kind of thing for the ?: operator.
string message = GetMessage() into m ? m : "No message available";
Note that string message is non-nullable, but so are the two possible results of the test above, so the assignment is valid.
And then maybe a bit of sugar for the presumably common case of substituting a value for null:
string message = GetMessage() or "No message available";
Obviously or would only be validly applied to a nullable type on the left side, and a non-nullable on the right side.
(I'd also have a built-in notion of ownership for instance fields; the compiler would generate the IDisposable.Dispose method automatically, and the ~Destructor syntax would be used to augment Dispose, exactly as in C++/CLI.)
Spec# had another syntactic extension related to non-nullables, due to the problem of ensuring that non-nullables had been initialized correctly during construction:
class SpecSharpExampleClass
{
    private string! _nonNullableExampleField;

    public SpecSharpExampleClass(string s)
        : _nonNullableExampleField(s)
    {
    }
}
In other words, you have to initialize fields in the same way as you'd call other constructors with base or this - unless of course you initialize them directly next to the field declaration.
Have a look at the Elvis operator proposal for Java 7. This does something similar, in that it encapsulates a null check and method dispatch in one operator, with a specified return value if the object is null. Hence:
String s = mayBeNull?.toString() ?: "null";
checks whether mayBeNull is null; if it is, s gets the string "null", otherwise the value of mayBeNull.toString(). Food for thought, perhaps.
A couple of examples of similar features in other languages:
boost::optional (C++)
Maybe (Haskell)
There's also Nullable<T> (from C#) but that is not such a good example because of the different treatment of reference vs. value types.
In your example you could add a conditional message send operator, e.g.
b?->DoSomething();
To send a message to b only if it is non-null.
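In languages that lack such an operator, the same "only call it if non-null" behaviour can be approximated today; a hypothetical Java sketch using Optional:

import java.util.Optional;

class Bar {
    void doSomething() { System.out.println("doing something"); }
}

class Demo {
    // roughly the b?->DoSomething() of the answer above:
    // the call happens only when b is non-null
    static void callIfPresent(Bar b) {
        Optional.ofNullable(b).ifPresent(Bar::doSomething);
    }
}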
Make nullability a configuration setting, enforceable in the author's source code. That way, you allow people who like nullable objects by default to enjoy them in their source code, while allowing those who would like all their objects to be non-nullable by default to have exactly that. Additionally, provide keywords or some other facility to explicitly mark which declarations of objects and types can be nullable and which cannot, with something like nullable and not-nullable, to override the global defaults.
For instance
/// "translation unit 1"
#set nullable
{ /// Scope of default override, making all declarations within the scope nullable implicitly
    Bar bar; /// Can be null
    non-null Foo foo; /// Overridden, cannot be null
    nullable FooBar foobar; /// Overridden, can be null, even without the scope definition above
}
/// Same style for opposite
/// ...
/// Top-bottom, until reset by scoped-setting or simply reset to another value
#set nullable;
/// Nullable types implicitly
#clear nullable;
/// Can also use '#set nullable = false' or '#set not-nullable = true'. Ugly, but human mind is a very original, mhm, thing.
Many people argue that giving everyone what they want is impossible, but if you are designing a new language, try new things. Tony Hoare introduced the concept of null in 1965 because he could not resist (his own words), and we have been paying for it ever since (also his own words; the man regrets it). The point is, smart, experienced people make mistakes that cost the rest of us, so don't take anyone's advice on this page as if it were the only truth, including mine. Evaluate it and think about it.
I've read many many rants on how it's us poor inexperienced programmers who really don't understand where to really use null and where not, showing us patterns and antipatterns that are meant to prevent shooting ourselves in the foot. All the while, millions of still inexperienced programmers produce more code in languages that allow null. I may be inexperienced, but I know which of my objects don't benefit from being nullable.
Here we are, 13 years later, and C# did it.
And, yes, this is the biggest improvement in languages since Barbara and Stephen invented types in 1974:
Programming With Abstract Data Types
Barbara Liskov, Massachusetts Institute of Technology, Project MAC, Cambridge, Massachusetts
Stephen Zilles, Cambridge Systems Group, IBM Systems Development Division, Cambridge, Massachusetts

Abstract
The motivation behind the work in very-high-level languages is to ease the programming task by providing the programmer with a language containing primitives or abstractions suitable to his problem area. The programmer is then able to spend his effort in the right place; he concentrates on solving his problem, and the resulting program will be more reliable as a result. Clearly, this is a worthwhile goal. Unfortunately, it is very difficult for a designer to select in advance all the abstractions which the users of his language might need. If a language is to be used at all, it is likely to be used to solve problems which its designer did not envision, and for which the abstractions embedded in the language are not sufficient. This paper presents an approach which allows the set of built-in abstractions to be augmented when the need for a new data abstraction is discovered. This approach to the handling of abstraction is an outgrowth of work on designing a language for structured programming. Relevant aspects of this language are described, and examples of the use and definitions of abstractions are given.
I think null values are good: They are a clear indication that you did something wrong. If you fail to initialize a reference somewhere, you'll get an immediate notice.
The alternative would be that values are sometimes initialized to a default value. Logical errors are then a lot more difficult to detect, unless you put detection logic in those default values. This would be the same as just getting a null pointer exception.