Why does the root object implement equality? - oop

In many object oriented languages such as Java, the .NET family, Python, Ruby, and I'm sure a host of others, the root object class from which all other classes inherit defines an equality checking method. However, in my experience, many of the classes I create really don't need an equality check or I (or co-workers) don't bother overriding the default method because we don't intend to use it. In the latter case, the default equality method doesn't represent equality very well for that class. So why do so many languages provide this method as part of the definition of the root object class when it seems like many classes should not? Why not leave off the equality method and force users to define it when they need it?

For any object references X and Y, regardless of their types, it is possible to meaningfully ask and answer the question "Is the object referred to by X equivalent to the one referred to by Y?". If X represents a Porsche 911 automobile and Y represents a cast-iron park bench, the answer would simply be "no". To be sure, if one knew that X was a car and Y a bench, one likely wouldn't bother asking, but suppose X, Y, or both, were "things that one may be asked to paint". One might not know whether or not X and Y are of the same type, and if the objects are not equivalent, one may not care. Having a universal means of asking about equivalence saves code the trouble of having to worry about objects' exact types.
The reason to have all objects implement Equals as a virtual method is that it's the easiest mechanism by which objects can supply a definition of equivalence that is broader than referential equality. It is often useful to have immutable objects report themselves as equivalent to other objects which have the same immutable state [e.g. having two strings, both holding the six characters "GEORGE", report each other as equivalent]. Having all objects implement Equals as a virtual method, and having mutable objects' implementation simply report referential equality, is generally easier than having an Equals function which can only be used on immutable objects. After all, it's not hard for a mutable object to simply report itself as unequal to anything other than itself.
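A minimal Python sketch of the distinction drawn above. Python's root `object` supplies an `__eq__` that falls back to identity, which plays the role of the default Equals; the classes `Name` and `Counter` are hypothetical names chosen for illustration.

```python
class Name:
    """Immutable value object: equal when the stored text is equal."""

    def __init__(self, text):
        self._text = text

    def __eq__(self, other):
        # Objects of unrelated types are simply "not equal".
        if not isinstance(other, Name):
            return NotImplemented
        return self._text == other._text

    def __hash__(self):
        # Keep the hash consistent with __eq__ so instances work as dict keys.
        return hash(self._text)


class Counter:
    """Mutable object: keeps the default, identity-based equality."""

    def __init__(self):
        self.count = 0


a, b = Name("GEORGE"), Name("GEORGE")
c, d = Counter(), Counter()

print(a == b)  # True: same immutable state
print(c == d)  # False: two distinct mutable objects
print(c == c)  # True: an object always equals itself
print(a == c)  # False: a name is never a counter
```

Because every object answers `==`, code that holds two references of unknown type can compare them without first checking what they are.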

Related

Hacklang : why were container classes replaced with built-in types?

Just a quote from hack documentation :
Legacy Vector, Map, and Set
These container types should be avoided in new code; use dict,
keyset, and vec instead.
Early in Hack's life, the library provided mutable and immutable
generic class types called: Vector, ImmVector, Map, ImmMap, Set, and
ImmSet. However, these have been replaced by vec, dict, and keyset,
whose use is recommended in all new code. Each generic type had a
corresponding literal form. For example, a variable of type
Vector might be initialized using Vector {22, 33, $v}, where $v
is a variable of type int.
I wonder why this change was made.
I mean, one of PHP's weaknesses is that it has a poor OOP standard library.
Ex : str_replace and array_values methods are outside of the string/array type itself. The PHP standard library is not consistent, sometimes we must pass the array as the first parameter, other times as the second...
I was glad to see that Hack introduced true OOP encapsulation for collections.
Do you know why they stepped back and wrote utility classes such as C\, Dict\, Keyset\ and Vec\ ?
Will built-in types gain methods in the future (ex : Str\starts_with => "toto"->startsWith("t")) ?
Based on Dwayne Reeves' blog post introducing HSL, it seems that the main advantage is the fact that arrays are native values, not objects. This has two important consequences:
For users, the semantics are different when the values cross through arguments. Objects are passed as references, and mutations affect the original object. On the other hand, values are copied on write after passing through arguments, so without references (which are finally to be completely banned in Hack) the callee can't mutate the value of the caller, with the exception of the much stricter inout parameters.
The article cites the invariance of the mutable containers (Vector, Set, etc.) and generally how shared mutable state couples functions closer together. The soundness issues as discussed in the article are somewhat moot because there were also immutable object containers (ImmVector, ImmSet, etc.), although since these interfaces were written in userland, variance boxed the function type signature into tight constraints. There are tangible differences from this: ImmMap<Tk, +Tv> is invariant in Tk solely because of the (function(Tk): Tv) getter. Meanwhile, dict<+Tk, +Tv> is covariant in both type parameters thanks to the inherent mutation protection from copy-on-write.
For the compiler, static values can be allocated quickly and persist over the lifetime of the server. Objects on the other hand have arbitrarily complicated construction routines in general, and the collection objects weren't going to be special-cased it seems.
I will also mention that for most use cases, there is minimal difference even in code style: e.g. the -> reference chains can be directly replaced with the |> pipe operator. There is also no longer a boundary between the privileged "standard functions" and custom user functions on collection types. Finally, the collection types were final of course, so their objective nature didn't offer any actual hierarchical or polymorphic advantages to the end user anyways.
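The difference between reference semantics and value semantics described above can be sketched by analogy in Python. This is not Hack code: Python has no copy-on-write arrays, so the explicit `list(items)` copy stands in for what Hack does implicitly for `vec`, and both function names are hypothetical.

```python
def append_in_place(items):
    # Reference semantics: the callee mutates the caller's object.
    items.append(99)


def append_to_copy(items):
    # Value-like semantics: work on a private copy and return the result.
    result = list(items)
    result.append(99)
    return result


shared = [1, 2, 3]
append_in_place(shared)
print(shared)    # [1, 2, 3, 99]: the caller sees the mutation

original = [1, 2, 3]
updated = append_to_copy(original)
print(original)  # [1, 2, 3]: the caller's value is untouched
print(updated)   # [1, 2, 3, 99]
```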

Multiple Dispatch: A conceptual necessity?

I wonder if the concept of multiple dispatch (that is, built-in support, as if the dynamic dispatch of virtual methods is extended to the method's arguments as well) should be included in an object-oriented language if its impact on performance would be negligible.
Problem
Consider the following scenario: I have a -- not necessarily flat -- class hierarchy containing types of animals. At different locations in my code, I want to perform some actions on an animal object. I do not care, nor can I control, how this object reference is obtained. I might encounter it by traversing a list of animals, or it might be given to me as one of a method's arguments. The action I want to perform should be specialized depending on the runtime type of the given animal. Examples of such actions would be:
Construct a view-model for the animal in order to present it in the GUI.
Construct a data object (to later store into the DB) representing this type of animal.
Feed the animal with some food, but give different kinds of food depending on the type of the animal (what is more healthy for it)
All of these examples operate on the public API of an animal object, but what they do is not the animal's own business, and therefore cannot be put into the animal itself.
Solutions
One "solution" would be to perform type checks. But this approach is error-prone and uses reflective features, which (in my opinion) is almost always an indication of bad design. Types should be a compile-time concept only.
Another solution would be to "abuse" (sort of) the visitor pattern to mimic double dispatch. But this would require that I change my animals to accept a visitor.
I am sure there are other approaches. Also, the problem of extension should be addressed: If new types of animals join the party, how many code locations need to be adapted, and how can I find them reliably?
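For reference, the visitor approach mentioned above might look roughly like this in Python. The class and method names (`Dog`, `Cat`, `FoodChooser`, `visit_dog`, `visit_cat`) are hypothetical; the sketch also shows the cost noted above, namely that every animal class has to carry an `accept` method.

```python
class Animal:
    def accept(self, visitor):
        raise NotImplementedError


class Dog(Animal):
    def accept(self, visitor):
        # The first dispatch selected Dog.accept; this call is the second.
        return visitor.visit_dog(self)


class Cat(Animal):
    def accept(self, visitor):
        return visitor.visit_cat(self)


class FoodChooser:
    """One external action, kept outside the animal classes."""

    def visit_dog(self, dog):
        return "kibble"

    def visit_cat(self, cat):
        return "fish"


animals = [Dog(), Cat()]
menu = [animal.accept(FoodChooser()) for animal in animals]
print(menu)  # ['kibble', 'fish']
```

Adding a new animal type means adding a `visit_...` method to every visitor, which is one answer to the "how many code locations" question.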
The Question
So, in the light of these requirements, shouldn't multiple dispatch be an integral part of any well-designed object-oriented language?
Isn't it natural to make external (not just internal) actions dependent on the dynamic type of a given object?
Best regards!
You are suggesting dynamic dispatching based on method name / signature combined with runtime actual argument types. I think you're crazy.
So, in the light of these requirements, shouldn't multiple dispatch be an integral part of any well-designed object-oriented language?
That there are problems for which the availability of the kind of dispatch strategy you envision would simplify coding is a weak argument for such dispatch being built into any given language, much less every OO language.
Isn't it natural to make external (not just internal) actions dependent on the dynamic type of a given object?
Perhaps, but not everything that seems "natural" is in fact a good idea. Clothes are not natural, for instance, but see what happens if you try going around in public without (somewhere other than Berkeley, anyway).
Some languages already have static dispatch based on argument types, more conventionally called "overloading". Dynamic dispatch based on argument types, on the other hand, is a real mess if there is more than one argument to be considered, and it cannot help but be slow(er). Today's popular OO languages provide for you to perform double dispatch where it is wanted, without the overhead of supporting it in the vast majority of places where you don't want it.
Furthermore, although implementing double-dispatch does present maintenance issues arising from tight coupling between separate components, there are coding strategies that can help keep that manageable. It is anyway unclear to what extent having argument-based multiple dispatch built in to a given language would actually help with that problem.
One "solution" would be to perform type checks. But this approach is
error-prone and uses reflective features, which (in my opinion) is
almost always an indication of bad design. Types should be a
compile-time concept only.
You're wrong. All uses of virtual functions, virtual inheritance, and such things involve reflective features and dynamic types. The ability to defer typing until runtime when you need to is absolutely critical and is inherent in even the most basic formulation of the situation you're in, which literally cannot even arise without the use of dynamic types. You even describe your problem as wanting to do different things depending on.. the dynamic type. After all, if there is no dynamic typing, why would you need to do things differently? You already know the concrete final type.
Of course, a bit of run-time typing can handle the problem you got yourself into with run-time typing.
Simply build a dictionary/hash table from type to function. You can add entries to this structure dynamically for any dynamically linked derived types, it's a nice O(1) to look up into, and requires no internal support.
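A minimal sketch of such a type-to-function table in Python. The names `FEEDERS`, `register`, and `feed` are hypothetical. Walking the method resolution order is an added assumption so that unregistered subclasses fall back to their nearest registered ancestor; an exact-type hit is still a single dictionary lookup.

```python
class Animal:
    pass


class Dog(Animal):
    pass


class Puppy(Dog):
    pass


class Cat(Animal):
    pass


FEEDERS = {}  # maps a type to the function that handles it


def register(animal_type, func):
    FEEDERS[animal_type] = func


def feed(animal):
    # Try the exact type first, then each base class in resolution order.
    for cls in type(animal).__mro__:
        if cls in FEEDERS:
            return FEEDERS[cls](animal)
    raise TypeError(f"no feeder registered for {type(animal).__name__}")


register(Animal, lambda a: "generic feed")
register(Dog, lambda a: "kibble")

print(feed(Dog()))    # kibble
print(feed(Puppy()))  # kibble (falls back to Dog)
print(feed(Cat()))    # generic feed (falls back to Animal)
```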
If one restricts oneself to the situation where knowledge of how an object of type X should fnorble an object of type Y must be stored in either class X or class Y, one can have the base type of Y include a method that accepts a reference of X's base type and indicates how much an object knows about how to be fnorbled by the object identified by that reference, as well as a method that asks the Y to have an X fnorble it.
Having done that, one can have X's Fnorble(Y) method start by asking the Y how much it knows about being fnorbled by a particular type of X. If the Y knows more about X than X knows about Y, then X's Fnorble(Y) method should call the Y's BeFnorbledBy(X) method; otherwise, the X should fnorble the Y as best it knows how.
Depending upon how many different kinds of X and Y there are, Y could define BeFnorbledBy overloads for different kinds of X, such that when X calls target.BeFnorbledBy(this) it would automatically dispatch directly to a suitable method; such an approach, however, would require every Y to know about every type of X that was "interesting" to anybody, whether or not it had any interest in that particular type itself.
Note that this approach doesn't accommodate the situation where there might be an outside object of class Z which knows things about how an X should fnorble a Y that neither X nor Y knows directly. That kind of situation is best handled by having a "rulebook" object where everything that knows about how various kinds of X should fnorble various kinds of Y can tell the rulebook, and code which wants an X to fnorble a Y can ask the rulebook to make that happen. Although languages could provide assistance in cases where rulebooks are singletons, there may be times when it is useful to have multiple rulebooks. The semantics in those cases are probably best handled by having code use rulebooks directly.
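A rough Python sketch of such a rulebook, assuming rules are keyed by a pair of types. The `Rulebook` class and the `Tool`/`Hammer`/`Fastener`/`Nail`/`Screw` hierarchy are hypothetical. Preferring the most specific X before the most specific Y is an arbitrary tie-breaking choice made for this sketch.

```python
class Tool:
    pass


class Hammer(Tool):
    pass


class Fastener:
    pass


class Nail(Fastener):
    pass


class Screw(Fastener):
    pass


class Rulebook:
    def __init__(self):
        self._rules = {}  # (type of X, type of Y) -> action

    def add_rule(self, x_type, y_type, action):
        self._rules[(x_type, y_type)] = action

    def fnorble(self, x, y):
        # Search from the most specific pair of types to the most general.
        for x_cls in type(x).__mro__:
            for y_cls in type(y).__mro__:
                action = self._rules.get((x_cls, y_cls))
                if action is not None:
                    return action(x, y)
        raise LookupError("no rule for this pair of objects")


rules = Rulebook()
rules.add_rule(Tool, Fastener, lambda x, y: "poke at it")
rules.add_rule(Hammer, Nail, lambda x, y: "drive it in")

print(rules.fnorble(Hammer(), Nail()))   # drive it in
print(rules.fnorble(Hammer(), Screw()))  # poke at it
```

Since the rulebook is an ordinary object, a program can hold several of them with different rule sets.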

Why avoid subtyping?

I have seen many people in the Scala community advise on avoiding subtyping "like a plague". What are the various reasons against the use of subtyping? What are the alternatives?
Types determine the granularity of composition, i.e. of extensibility.
For example, an interface, e.g. Comparable, that combines (thus conflates) equality and relational operators. Thus it is impossible to compose on just one of the equality or relational interface.
In general, the substitution principle of inheritance is undecidable. Russell's paradox implies that any set that is extensible (i.e. does not enumerate the type of every possible member or subtype), can include itself, i.e. is a subtype of itself. But in order to identify (decide) what is a subtype and not itself, the invariants of itself must be completely enumerated, thus it is no longer extensible. This is the paradox that subtyped extensibility makes inheritance undecidable. This paradox must exist, else knowledge would be static and thus knowledge formation wouldn't exist.
Function composition is the surjective substitution of subtyping, because the input of a function can be substituted for its output, i.e. anywhere the output type is expected, the input type can be substituted by wrapping it in the function call. But composition does not make the bijective contract of subtyping: accessing the interface of the output of a function does not access the input instance of the function.
Thus composition does not have to maintain the future (i.e. unbounded) invariants and thus can be both extensible and decidable. Subtyping can be MUCH more powerful where it is provably decidable, because it maintains this bijective contract, e.g. a function that sorts an immutable list of the supertype can operate on the immutable list of the subtype.
So the conclusion is to enumerate all the invariants of each type (i.e. of its interfaces), make these types orthogonal (maximize granularity of composition), and then use function composition to accomplish extension where those invariants would not be orthogonal. Thus a subtype is appropriate only where it provably models the invariants of the supertype interface, and the additional interface(s) of the subtype are provably orthogonal to the invariants of the supertype interface. Thus the invariants of interfaces should be orthogonal.
Category theory provides rules for the model of the invariants of each subtype, i.e. of Functor, Applicative, and Monad, which preserve function composition on lifted types, i.e. see the aforementioned example of the power of subtyping for lists.
One reason is that equals() is very hard to get right when sub-typing is involved. See How to Write an Equality Method in Java. Specifically "Pitfall #4: Failing to define equals as an equivalence relation". In essence: to get equality right under sub-typing, you need a double dispatch.
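A small sketch of how symmetry breaks, modelled on the Java-style `equals` but written in Python with hypothetical `Point` and `ColoredPoint` classes. A plain method named `equals` is used instead of `__eq__` because Python's `==` gives a subclass's `__eq__` priority, which would hide the asymmetry.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def equals(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class ColoredPoint(Point):
    def __init__(self, x, y, color):
        super().__init__(x, y)
        self.color = color

    def equals(self, other):
        # Stricter than the base class: the color must match as well.
        return (
            isinstance(other, ColoredPoint)
            and super().equals(other)
            and self.color == other.color
        )


p = Point(1, 2)
cp = ColoredPoint(1, 2, "red")

print(p.equals(cp))  # True: Point only looks at the coordinates
print(cp.equals(p))  # False: p is not a ColoredPoint, so symmetry is broken
```

Each side answers using only its own notion of equality, so the result depends on which object receives the call. Deciding on both runtime types at once is the double dispatch referred to above.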
I think the general context is for the language to be as "pure" as possible (i.e. using pure functions as much as possible), and comes from the comparison with Haskell.
From "Ruminations of a Programmer"
Scala, being a hybrid OO-FP language has to take care of issues like subtyping (which Haskell does not have).
As mentioned in this PSE answer:
no way to restrict a subtype so that it can't do more than the type it inherits from.
For example, if the base class is immutable and defines a pure method foo(...), derived classes must not be mutable or override foo() with a function that is not pure
But the actual recommendation would be to use the best solution adapted to the program you are currently developing.
Focusing on subtyping, ignoring the issues related to classes, inheritance, OOP, etc.. We have the idea subtyping represents a isa relation between types. For example, types A and B have different operations but if A isa B we then can use any of B's operations on an A.
OTOH, using another traditional relation, if C hasa B then we can reuse any of B's operations on a C. Usually languages let you write one with a nicer syntax, a.opOnB instead of a.super.opOnB as it would be in the case of composition, c.b.opOnB
The problem is that in many cases there's more than one way to relate two types. For example Real can be embedded in Complex assuming 0 on the imaginary part, but Complex can be embedded in Real by ignoring the imaginary part, so both can be seen as subtypes of the other and subtyping forces one relation to be viewed as preferred. Also, there are more possible relations (e.g. view Complex as a Real using theta component of polar representation).
In formal terminology we usually call such relations between types morphisms, and there are special kinds of morphisms for relations with different properties (e.g. isomorphism, homomorphism).
In a language with subtyping usually there's much more sugar on isa relations and given many possible embeddings we tend to see unnecessary friction whenever we're using the unpreferred relation. If we bring inheritance, classes and OOP to the mix the problem becomes much more visible and messy.
My answer does not answer why it is avoided but tries to give another hint at why it can be avoided.
Using "type classes" you can add an abstraction over existing types/classes without modifying them. Inheritance is used to express that some classes are specializations of a more abstract class. But with type classes you can take any existing classes and express that they all share a common property, for example they are Comparable. And as long as you are not concerned with them being Comparable you don't even notice it. The classes don't inherit any methods from some abstract Comparable type as long as you don't use them. It's a bit like programming in dynamic languages.
Further reads:
http://blog.tmorris.net/the-power-of-type-classes-with-scala-implicit-defs/
http://debasishg.blogspot.com/2010/07/refactoring-into-scala-type-classes.html
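The idea of attaching a shared property to existing classes without modifying them can be loosely imitated in Python with `functools.singledispatch`. This is only an approximation: Scala resolves type class instances at compile time, while this sketch resolves them at runtime. The names `sort_key` and `largest` are hypothetical.

```python
from functools import singledispatch


@singledispatch
def sort_key(value):
    # No "instance" has been registered for this type.
    raise TypeError(f"{type(value).__name__} is not comparable here")


@sort_key.register(str)
def _(value):
    # Strings are ordered case-insensitively.
    return value.lower()


@sort_key.register(complex)
def _(value):
    # complex has no built-in ordering; order by magnitude instead.
    return abs(value)


def largest(items):
    # Works for any type that has a registered sort_key.
    return max(items, key=sort_key)


print(largest(["pear", "Apple", "fig"]))  # pear
print(largest([3 + 4j, 1 + 1j]))          # (3+4j)
```

Neither `str` nor `complex` was changed or subclassed; the ordering lives entirely outside them.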
I don't know Scala, but I think the mantra 'prefer composition over inheritance' applies for Scala exactly the way it does for every other OO programming language (and subtyping is often used with the same meaning as 'inheritance'). You will find some more information under "Prefer composition over inheritance?".
I think lots of Scala programmers are former Java programmers. They are used to thinking in terms of object-oriented subtyping, and they should be able to easily find an OO-like solution for most problems. But functional programming is a new paradigm to discover, so people ask for a different kind of solution.
This is the best paper I have found on the subject. A motivating quote from the paper –
We argue that while some of the simpler aspects of object-oriented languages are
compatible with ML, adding a full-fledged class-based object system to ML leads to an excessively complex
type system and relatively little expressive gain

type vs. interface: why typing then?

Broadening my horizons with JavaScript, on top of my Python experience, got me thinking.
What's the purpose of type if the vision of an entity to an external client is via its interface ?
In statically typed languages, the type has a very strong, central importance. Type and interface are strictly associated. For example, in Java, when you declare an interface FooIface and an object implements that interface, you cannot use it in a context requiring BarIface, even if the two are exactly the same in terms of methods, signatures and exceptions.
Not so in python. Even if two objects have completely different and unrelated types, as long as their interface is the same, they are totally and transparently interchangeable. If it quacks and walks like a duck, it's a duck. I can completely change the nature of an object by completely altering its interface at runtime, but it will preserve the original type.
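A small illustration of that point, using hypothetical `Duck` and `Robot` classes:

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def greet(thing):
    # Only the interface matters: anything with a speak() method will do.
    return thing.speak()


print(greet(Duck()))   # Quack
print(greet(Robot()))  # Beep

# Alter one object's interface at runtime; its type stays the same.
robot = Robot()
robot.speak = lambda: "Quack"
print(greet(robot))          # Quack
print(type(robot).__name__)  # Robot
```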
This point of view is put to the extreme in javascript, where every object in any prototype chain is just that, an object. you ask the type of each object in javascript, and it will tell you just that, it's an object.
It appears to me that the concept of type is close to futile in these languages. What, then, is it really vital for? Does type have any real meaning in dynamically typed languages?
I don't see futility. Consider:
1). When you construct an object
var myThing = new Thing( ... );
the Thing's type has significance.
2). A method of Thing can use
this.aProperty
according to its knowledge of type
3). You can use instanceof to determine a type
I tend to understand the word "type" as equivalent to the word "class" in Python. That's only 99% correct, but close enough.
>>> type(object)
<type 'type'>
>>> class X(object):
... pass
...
>>> type(X)
<type 'type'>
>>> type(_)
<type 'type'>
>>> type(X())
<class '__main__.X'>
>>> type(X) is type(type)
True
I do, however, usually avoid the word "type" in this case. In general, my way of seeing it is this: The word "type" implies that the thing in question is not a first-class object.
If a language only regards the implemented methods of a class to infer which interfaces it conforms to, you can implement an interface by accident. Let's say in Java you have the interfaces IA and IB, which both define the method long getRemainingTime. In this case the contract of each interface would specify what format it returns (one could return the time in seconds, while the other returns it in milliseconds). Also, the contexts in which these interfaces are used can be very different. Let's say they are not called IA and IB but IProgress and IStopWatch. In such a case, the returned time would have very different meanings. If you were able to interchange these two interfaces as you liked, you might get really unexpected results.
In general the type can be seen as an aid to perform a rudimentary, static code analysis. If you implement a specific interface, the compiler can then tell you directly that you are probably making a mistake, if you try to pass an instance of your implementation to a method which expects a similar implementation but of a different type.
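The accidental match described above can be reproduced with Python's structural `typing.Protocol` (Python 3.8 or later). The names `Progress`, `StopWatch`, and `show_eta` are hypothetical stand-ins for the IProgress/IStopWatch example, and the units are assumptions made for the sketch.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Progress(Protocol):
    def get_remaining_time(self) -> int:
        """Remaining time in seconds."""
        ...


class StopWatch:
    # Never declared as a Progress, and it reports milliseconds.
    def get_remaining_time(self) -> int:
        return 90_000


def show_eta(task: Progress) -> str:
    return f"{task.get_remaining_time()} s left"


# The structural check only looks for a method with the right name.
print(isinstance(StopWatch(), Progress))  # True
print(show_eta(StopWatch()))              # 90000 s left (wrong by a factor of 1000)
```

With nominal typing, `StopWatch` would have to name `Progress` explicitly before it could be passed to `show_eta`, which is the rudimentary static check the answer describes.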

What do you call a method of an object that changes its class?

Let's say you have a Person object and it has a method on it, promote(), that transforms it into a Captain object. What do you call this type of method/interaction?
It also feels like an inversion of:
myCaptain = new Captain(myPerson);
Edit: Thanks for all the replies. The reason I'm coming across this pattern (in Perl, but relevant anywhere) is purely for convenience. Without knowing any implementation details, you could say the Captain class "has a" Person (I realize this may not be the best example, but be assured it isn't a subclass).
Implementation I assumed:
// this definition only matches example A
Person.promote() {
return new Captain(this)
}
personable = new Person;
// A. this is what i'm actually coding
myCaptain = personable.promote();
// B. this is what my original post was implying
personable.promote(); // is magically now a captain?
So, literally, it's just a convenience method for the construction of a Captain. I was merely wondering if this pattern has been seen in the wild and if it had a name. And I guess yeah, it doesn't really change the class so much as it returns a different one. But it theoretically could, since I don't really care about the original.
Ken++, I like how you point out a use case. Sometimes it really would be awesome to change something in place, in say, a memory sensitive environment.
A method of an object shouldn't change its class. You should either have a member which returns a new instance:
myCaptain = myPerson->ToCaptain();
Or use a constructor, as in your example:
myCaptain = new Captain(myPerson);
I would call it a conversion, or even a cast, depending on how you use the object. If you have a value object:
Person person;
You can use the constructor method to implicitly cast:
Captain captain = person;
(This is assuming C++.)
A simpler solution might be making rank a property of person. I don't know your data structure or requirements, but if you need to do something that tries to break the basics of a language, it's likely that there is a better way to do it.
You might want to consider the "State Pattern", also sometimes called the "Objects for States" pattern. It is defined in the book Design Patterns, but you could easily find a lot about it on Google.
A characteristic of the pattern is that "the object will appear to change its class."
Here are some links:
Objects for States
Pattern: State
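A minimal State pattern sketch in Python for the Person/Captain example. The state classes `Civilian` and `CaptainRank` and the method `introduce` are hypothetical. The `Person` object keeps its class and delegates to a replaceable state object, so it only appears to change class.

```python
class Civilian:
    def introduce(self, name):
        return f"I'm {name}."


class CaptainRank:
    def introduce(self, name):
        return f"I'm Captain {name}."


class Person:
    def __init__(self, name):
        self.name = name
        self._rank = Civilian()  # current state object

    def promote(self):
        # Swap the state; every existing reference to this Person sees it.
        self._rank = CaptainRank()

    def introduce(self):
        # Behaviour is delegated to whichever state is current.
        return self._rank.introduce(self.name)


person = Person("Ada")
print(person.introduce())  # I'm Ada.
person.promote()
print(person.introduce())  # I'm Captain Ada.
```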
Everybody seems to be assuming a C++/Java-like object system, possibly because of the syntax used in the question, but it is quite possible to change the class of an instance at runtime in other languages.
Lisp's CLOS allows changing the class of an instance at any time, and it's a well-defined and efficient transformation. (The terminology and structure is slightly different: methods don't "belong" to classes in CLOS.)
I've never heard a name for this specific type of transformation, though. The function which does this is simply called change-class.
Richard Gabriel seems to call it the "change-class protocol", after Kiczales' AMOP, which formalized as "protocols" many of the internals of CLOS for metaprogramming.
People wonder why you'd want to do this; I see two big advantages over simply creating a new instance:
faster: changing class can be as simple as updating a pointer, and updating any slots that differ; if the classes are very similar, this can be done with no new memory allocations
simpler: if a dozen places already have a reference to the old object, creating a new instance won't change what they point to; if you need to update each one yourself, that could add a lot of complexity for what should be a simple operation (2 words, in Lisp)
That's not to say it's always the right answer, but it's nice to have the ability to do this when you want it. "Change an instance's class" and "make a new instance that's similar to that one" are very different operations, and I like being able to say exactly what I mean.
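Python offers a loosely comparable facility: for ordinary classes with compatible layouts, an instance's `__class__` attribute can be reassigned. The sketch below is not CLOS and does nothing like the slot updating that change-class performs; the `Person`, `Captain`, and `promote` names are hypothetical. It does show the "simpler" point above: existing references see the change.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def title(self):
        return self.name


class Captain(Person):
    def title(self):
        return "Captain " + self.name


def promote(person):
    # Re-point the instance at a different class, in place.
    person.__class__ = Captain


crew = [Person("Ada")]
alias = crew[0]  # a second reference to the same object

promote(alias)
print(crew[0].title())           # Captain Ada
print(type(crew[0]) is Captain)  # True
```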
The first interesting part would be to know: why do you want or need an object to change its class at runtime?
There are various options:
You want it to respond differently to some methods for a given state of the application.
You might want it to have new functionality that the original class doesn't have.
Others...
Statically typed languages such as Java and C# don't allow this to happen, because the type of the object should be known at compile time.
Other programming languages such as Python and Ruby may allow this (I don't know for sure, but I know they can add methods at runtime).
For the first option, the answer given by Charlie Flowers is correct: using the State pattern would allow a class to behave differently while the object keeps the same interface.
For the second option, you would need to change the object type anyway and assign it to a new reference with the extra functionality. So you will need to create another distinct object and you'll end up with two different objects.