Difference in compiler design for OOP languages

Difference in compiler design for OOP languages - oop

I'm doing research on how the design of a compiler for an OOP language differs from traditional imperative languages. I'd just like some topics to send me on my way, and if you wish, you can explain them.
For eg. I found that the type table is built differently.

Before the "compiler design" can be explored, I think the more fundamental question of "language design" needs to be addressed.
Should the language be statically typed? Dynamically typed? Early/late bound or a combination? Supporting generics? Is inference a goal? Should types be closed or open? How should sub-typing work? (Should implicit sub-typing be allowed at all?) Covariance? Contravariance? Single-inheritance? MI? SI with Traits? Explicit memberwise-selection? Prototypal (That is, should there even be a notion of "class" and "instance"?) Should types in nominative or based off of member signatures? Single-dispatch or multiple-dispatch? Are members invoked as first-class citizens or message-passing? Are types the same as classes? Is there a distinction between "value" and "reference" types? Etc, etc, etc... and this is just the tip of a very large iceberg.

Related

Concept of First-Classed Type in Idris

When learning Idris with Edwin's Type-driven Development with Idris, I read about the unique property of Idris that it's type is a first class construct, compared to other programming languages, and especially to those who also have a dependent type system.
In the book it talks about the potential usage of such feature: database schema, network protocol description and .etc.
With these benefits, my question is two-fold:
Is it not possible to do these tasks in a merely dependently typed language that has not first-class type?
What is the negative point of such feature? Why are types in other systems/languages not first class (in Agda or Coq for example)? I assume that this feature introduces some theoretical limitations on what the program can check at compile time, but I don't know what it is.
Or more concretely:
What are the examples where first-class type can be distinguished from mere dependent type?
I would like to see both some positive examples (benefits) and some negative examples (that it makes compiler difficult to do something, or some other kinds of limitations).

"First-class type" in the Idris book means precisely "dependent type" in the sense of Agda or Coq, so there is no distinction here at all.
GADTs in Haskell and OCaml could be viewed as a form of dependent types which does not entail first-class types in the sense of Idris. Here, there are two different programming languages, for value-level and type-level programming, and types cannot be arbitrarily mixed with values. GADTs allow dependent pattern matching, where in different branches we can learn information about type-level values. But we can't learn information about runtime values.

Is interface ever useful for a OO language which supports multiple inheritance?

I heard that interface is introduced as a way for making up that a object-oriented language doesn't support multiple inheritance but only single inheritance.
Is interface merely used for that purpose?
Is interface ever useful for a OO language which supports multiple inheritance?
Thanks.

The book "Design Patterns" strongly stresses the importance of interfaces and at the time it was written, C++ (with multiple inheritance) was the most popular OO language and Java didn't even exist yet. (The book was published a year before Java was released.)
It's important to understand the difference between an object's class and its type.
An object's class defines how the object is implemented... In contrast, an object's type only refers to its interface—the set of requests to which it can respond.
...
It's easy to confuse these two concepts, because many languages don't make the distinction explicit.
...
Many of the design patterns depend on this distinction.
This book coined the term "Program to an Interface, not an Implementation."

Is my understanding of abstraction correct?

I've read the other posts discussing abstraction and encapsulation, but I'm not confident I understand them; or maybe I understand them but feel unsatisfied with the clarity of their content. Here are my understandings of abstraction and encapsulation. In what regards are they accurate/inaccurate/complete/incomplete?
"Abstractions are data types created by programmers to extend a language when primitive data types are insufficient. Like primitive data types, abstractions have specifications which list the inputs they require and the outputs they return, but the specifications do not overwhelm programmers with the methods, functions, and variables used to operate on the inputs. A class is an example of an abstraction. An API is another example of an abstraction."
"Encapsulation is the state of having abstract data types — i.e. classes — isolated from each other so their methods, functions, and variables do not conflict with each other, and so programmers can easily reuse an existing class in other programs without being concerned that doing so would interfere with the rest of the program (presuming the programmer correctly provides the required inputs and correctly handles the data that get returns)."

I prefer Robert C. Martin's definition in APPP:
Abstraction is the elimination of the irrelevant and the amplification of the essential.

I'd say your understanding is correct ... so much, so, that I hesitate to comment more specifically.
However, if I were to comment, I might say that "Data types can be used to implement abstractions ...", rather than "Abstractions are data types ...", since abstractions can exist outside of software (it hurt me to say that :-).
But that's just nitpicking. I think you understand. I hope I do, after 36 years of coding ... mostly in languages that support reasonable levels of abstraction (PL/1, Pascal, C, C++, Java).
There are a lot of nice intelligent people in industry, though, who have no concept of abstraction in software, and consider it pretentiously high brow.
Personally, I think that good clear misnomer-free abstraction is a key technical ingredient of solid software engineering.

I've never come across that definition of encapsulation before. That definition sounds more like what namespaces are for. I've always read about encapsulation being purely about the ability to restrict access to certain components of your code, such as access modifiers in OOP languages. However, there seems to be a two definitions of encapsulation on wikipedia, which is news to me:
Encapsulation is the packing of data and functions into a single
component. The features of encapsulation are supported using classes
in most object-oriented programming languages, although other
alternatives also exist. It allows selective hiding of properties and
methods in an object by building an impenetrable wall to protect the
code from accidental corruption.
In programming languages, encapsulation is used to refer to one of two
related but distinct notions, and sometimes to the combination
thereof:
A language mechanism for restricting access to some of the object's
components.
A language construct that facilitates the bundling
of data with the methods (or other functions) operating on that
data.
Some programming language researchers and academics use
the first meaning alone or in combination with the second as a
distinguishing feature of object-oriented programming, while other
programming languages which provide lexical closures view
encapsulation as a feature of the language orthogonal to object
orientation.
The second definition is motivated by the fact that in many OOP
languages hiding of components is not automatic or can be overridden;
thus, information hiding is defined as a separate notion by those who
prefer the second definition.
source
So, I guess I've always defined encapsulation in terms of point #1, but it looks like some people define it as the ability to bundle methods and data together, and term #1 "information hiding".

Categories feature of objective-c comes under which OOPS feature?

As per my knowledge, Objective-C is an Object oriented programming languge and Categories is a feature provided by Objective-C.
So I would like to know that Category feature is coming under which OOPs concept
Abstraction
Polymorphism
Encapsulation
Inheritance, etc.
Thanks in advance.
Mrunal

#Abizern's answer is good. I would add that categories are a form of dynamic dispatch, in particular that they can be used to extend existing classes without subclassing.
That said, Object Oriented Programming is more a design philosophy than a set of language features. One might ask "what OOP feature does postfix increment correspond to?" The answer is "none; it's a language feature." Categories are not primarily used to implement OOP design (though sometimes they are, as noted above). Their original use was to break up large implementation files. Their later use was to provide informal protocols due to a flaw in the language (lack of #optional). And today, they're primarily used to split code along platform-specific lines (NSString+UIStringDrawing vs NSString+AppKitAdditions).
Extensions are similar to categories, and similarly are primarily a language feature rather than an OOP design feature. They facilitate encapsulation to some extent, but mostly are a side-effect of an arbitrary compiler requirement to define methods before they are used (I say "arbitrary" because this is not related to design or developer needs; it just simplifies the compiler). Extensions should not be confused with some deep OOP requirement.
So using categories to attach additional functionality at runtime is dynamic dispatch. Beyond that, it's just a language feature that's used for several non-OOP things.

According to the Cocoa Design Patterns book by Buck and Yacktman. Category is a pattern in itself, and one that is supported by the Objective-C programming language directly.

Probably closest to Encapsulation, when it comes to academic OOP concepts. But this really shows the difference between academic definitions of OOP, and the practical world of OOP Design Patterns.

Why avoid subtyping?

I have seen many people in the Scala community advise on avoiding subtyping "like a plague". What are the various reasons against the use of subtyping? What are the alternatives?

Types determine the granularity of composition, i.e. of extensibility.
For example, an interface, e.g. Comparable, that combines (thus conflates) equality and relational operators. Thus it is impossible to compose on just one of the equality or relational interface.
In general, the substitution principle of inheritance is undecidable. Russell's paradox implies that any set that is extensible (i.e. does not enumerate the type of every possible member or subtype), can include itself, i.e. is a subtype of itself. But in order to identify (decide) what is a subtype and not itself, the invariants of itself must be completely enumerated, thus it is no longer extensible. This is the paradox that subtyped extensibility makes inheritance undecidable. This paradox must exist, else knowledge would be static and thus knowledge formation wouldn't exist.
Function composition is the surjective substitution of subtyping, because the input of a function can be substituted for its output, i.e. any where the output type is expected, the input type can be substituted, by wrapping it in the function call. But composition does not make the bijective contract of subtyping-- accessing the interface of the output of a function, does not access the input instance of the function.
Thus composition does not have to maintain the future (i.e. unbounded) invariants and thus can be both extensible and decidable. Subtyping can be MUCH more powerful where it is provably decidable, because it maintains this bijective contract, e.g. a function that sorts a immutable list of the supertype, can operate on the immutable list of the subtype.
So the conclusion is to enumerate all the invariants of each type (i.e. of its interfaces), make these types orthogonal (maximize granularity of composition), and then use function composition to accomplish extension where those invariants would not be orthogonal. Thus a subtype is appropriate only where it provably models the invariants of the supertype interface, and the additional interface(s) of the subtype are provably orthogonal to the invariants of the supertype interface. Thus the invariants of interfaces should be orthogonal.
Category theory provides rules for the model of the invariants of each subtype, i.e. of Functor, Applicative, and Monad, which preserve function composition on lifted types, i.e. see the aforementioned example of the power of subtyping for lists.

One reason is that equals() is very hard to get right when sub-typing is involved. See How to Write an Equality Method in Java. Specifically "Pitfall #4: Failing to define equals as an equivalence relation". In essence: to get equality right under sub-typing, you need a double dispatch.

I think the general context is for the lanaguage to be as "pure" as possible (ie using as much as possible pure functions), and comes from the comparison with Haskell.
From "Ruminations of a Programmer"
Scala, being a hybrid OO-FP language has to take care of issues like subtyping (which Haskell does not have).
As mentioned in this PSE answer:
no way to restrict a subtype so that it can't do more than the type it inherits from.
For example, if the base class is immutable and defines a pure method foo(...), derived classes must not be mutable or override foo() with a function that is not pure
But the actual recommendation would be to use the best solution adapted to the program you are currently developing.

Focusing on subtyping, ignoring the issues related to classes, inheritance, OOP, etc.. We have the idea subtyping represents a isa relation between types. For example, types A and B have different operations but if A isa B we then can use any of B's operations on an A.
OTOH, using another traditional relation, if C hasa B then we can reuse any of B's operations on a C. Usually languages let you write one with a nicer syntax, a.opOnB instead of a.super.opOnB as it would be in the case of composition, c.b.opOnB
The problem is that in many cases there's more than one way to relate two types. For example Real can be embedded in Complex assuming 0 on the imaginary part, but Complex can be embedded in Real by ignoring the imaginary part, so both can be seen as subtypes of the other and subtyping forces one relation to be viewed as preferred. Also, there are more possible relations (e.g. view Complex as a Real using theta component of polar representation).
In formal terminology we usually say morphism to such relations between types and there are special kinds of morphisms for relations with different properties (e.g. isomorphism, homomorphism).
In a language with subtyping usually there's much more sugar on isa relations and given many possible embeddings we tend to see unnecessary friction whenever we're using the unpreferred relation. If we bring inheritance, classes and OOP to the mix the problem becomes much more visible and messy.

My answer does not answer why it is avoided but tries to give another hint at why it can be avoided.
Using "type classes" you can add an abstraction over existing types/classes without modifying them. Inheritance is used to express that some classes are specializations of a more abstract class. But with type classes you can take any existing classes and express that they all share a common property, for example they are Comparable. And as long as you are not concerned with them being Comparable you don't even notice it. The classes don't inherit any methods from some abstract Comparable type as long as you don't use them. It's a bit like programming in dynamic languages.
Further reads:
http://blog.tmorris.net/the-power-of-type-classes-with-scala-implicit-defs/
http://debasishg.blogspot.com/2010/07/refactoring-into-scala-type-classes.html

I don't know Scala, but I think the mantra 'prefer composition over inheritance' applies for Scala exactly the way it does for every other OO programming language (and subtyping is often used with the same meaning as 'inheritance'). Here
Prefer composition over inheritance?
you will find some more information.

I think lots of Scala programmers are former Java programmers. They are used to think in term of Object Oriented subtyping and they should be able to easily find OO-like solution for most problems. But Functional Programing is a new paradigm to discover, so people ask for a different kind of solutions.

This is the best paper I have found on the subject. A motivating quote from the paper –
We argue that while some of the simpler aspects of object-oriented languages are
compatible with ML, adding a full-fledged class-based object system to ML leads to an excessively complex
type system and relatively little expressive gain

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas