Concept of First-Classed Type in Idris - language-design

When learning Idris with Edwin's Type-driven Development with Idris, I read about the unique property of Idris that it's type is a first class construct, compared to other programming languages, and especially to those who also have a dependent type system.
In the book it talks about the potential usage of such feature: database schema, network protocol description and .etc.
With these benefits, my question is two-fold:
Is it not possible to do these tasks in a merely dependently typed language that has not first-class type?
What is the negative point of such feature? Why are types in other systems/languages not first class (in Agda or Coq for example)? I assume that this feature introduces some theoretical limitations on what the program can check at compile time, but I don't know what it is.
Or more concretely:
What are the examples where first-class type can be distinguished from mere dependent type?
I would like to see both some positive examples (benefits) and some negative examples (that it makes compiler difficult to do something, or some other kinds of limitations).

"First-class type" in the Idris book means precisely "dependent type" in the sense of Agda or Coq, so there is no distinction here at all.
GADTs in Haskell and OCaml could be viewed as a form of dependent types which does not entail first-class types in the sense of Idris. Here, there are two different programming languages, for value-level and type-level programming, and types cannot be arbitrarily mixed with values. GADTs allow dependent pattern matching, where in different branches we can learn information about type-level values. But we can't learn information about runtime values.

Related

Is my understanding of abstraction correct?

I've read the other posts discussing abstraction and encapsulation, but I'm not confident I understand them; or maybe I understand them but feel unsatisfied with the clarity of their content. Here are my understandings of abstraction and encapsulation. In what regards are they accurate/inaccurate/complete/incomplete?
"Abstractions are data types created by programmers to extend a language when primitive data types are insufficient. Like primitive data types, abstractions have specifications which list the inputs they require and the outputs they return, but the specifications do not overwhelm programmers with the methods, functions, and variables used to operate on the inputs. A class is an example of an abstraction. An API is another example of an abstraction."
"Encapsulation is the state of having abstract data types — i.e. classes — isolated from each other so their methods, functions, and variables do not conflict with each other, and so programmers can easily reuse an existing class in other programs without being concerned that doing so would interfere with the rest of the program (presuming the programmer correctly provides the required inputs and correctly handles the data that get returns)."
I prefer Robert C. Martin's definition in APPP:
Abstraction is the elimination of the irrelevant and the amplification of the essential.
I'd say your understanding is correct ... so much, so, that I hesitate to comment more specifically.
However, if I were to comment, I might say that "Data types can be used to implement abstractions ...", rather than "Abstractions are data types ...", since abstractions can exist outside of software (it hurt me to say that :-).
But that's just nitpicking. I think you understand. I hope I do, after 36 years of coding ... mostly in languages that support reasonable levels of abstraction (PL/1, Pascal, C, C++, Java).
There are a lot of nice intelligent people in industry, though, who have no concept of abstraction in software, and consider it pretentiously high brow.
Personally, I think that good clear misnomer-free abstraction is a key technical ingredient of solid software engineering.
I've never come across that definition of encapsulation before. That definition sounds more like what namespaces are for. I've always read about encapsulation being purely about the ability to restrict access to certain components of your code, such as access modifiers in OOP languages. However, there seems to be a two definitions of encapsulation on wikipedia, which is news to me:
Encapsulation is the packing of data and functions into a single
component. The features of encapsulation are supported using classes
in most object-oriented programming languages, although other
alternatives also exist. It allows selective hiding of properties and
methods in an object by building an impenetrable wall to protect the
code from accidental corruption.
In programming languages, encapsulation is used to refer to one of two
related but distinct notions, and sometimes to the combination
thereof:
A language mechanism for restricting access to some of the object's
components.
A language construct that facilitates the bundling
of data with the methods (or other functions) operating on that
data.
Some programming language researchers and academics use
the first meaning alone or in combination with the second as a
distinguishing feature of object-oriented programming, while other
programming languages which provide lexical closures view
encapsulation as a feature of the language orthogonal to object
orientation.
The second definition is motivated by the fact that in many OOP
languages hiding of components is not automatic or can be overridden;
thus, information hiding is defined as a separate notion by those who
prefer the second definition.
source
So, I guess I've always defined encapsulation in terms of point #1, but it looks like some people define it as the ability to bundle methods and data together, and term #1 "information hiding".

Why avoid subtyping?

I have seen many people in the Scala community advise on avoiding subtyping "like a plague". What are the various reasons against the use of subtyping? What are the alternatives?
Types determine the granularity of composition, i.e. of extensibility.
For example, an interface, e.g. Comparable, that combines (thus conflates) equality and relational operators. Thus it is impossible to compose on just one of the equality or relational interface.
In general, the substitution principle of inheritance is undecidable. Russell's paradox implies that any set that is extensible (i.e. does not enumerate the type of every possible member or subtype), can include itself, i.e. is a subtype of itself. But in order to identify (decide) what is a subtype and not itself, the invariants of itself must be completely enumerated, thus it is no longer extensible. This is the paradox that subtyped extensibility makes inheritance undecidable. This paradox must exist, else knowledge would be static and thus knowledge formation wouldn't exist.
Function composition is the surjective substitution of subtyping, because the input of a function can be substituted for its output, i.e. any where the output type is expected, the input type can be substituted, by wrapping it in the function call. But composition does not make the bijective contract of subtyping-- accessing the interface of the output of a function, does not access the input instance of the function.
Thus composition does not have to maintain the future (i.e. unbounded) invariants and thus can be both extensible and decidable. Subtyping can be MUCH more powerful where it is provably decidable, because it maintains this bijective contract, e.g. a function that sorts a immutable list of the supertype, can operate on the immutable list of the subtype.
So the conclusion is to enumerate all the invariants of each type (i.e. of its interfaces), make these types orthogonal (maximize granularity of composition), and then use function composition to accomplish extension where those invariants would not be orthogonal. Thus a subtype is appropriate only where it provably models the invariants of the supertype interface, and the additional interface(s) of the subtype are provably orthogonal to the invariants of the supertype interface. Thus the invariants of interfaces should be orthogonal.
Category theory provides rules for the model of the invariants of each subtype, i.e. of Functor, Applicative, and Monad, which preserve function composition on lifted types, i.e. see the aforementioned example of the power of subtyping for lists.
One reason is that equals() is very hard to get right when sub-typing is involved. See How to Write an Equality Method in Java. Specifically "Pitfall #4: Failing to define equals as an equivalence relation". In essence: to get equality right under sub-typing, you need a double dispatch.
I think the general context is for the lanaguage to be as "pure" as possible (ie using as much as possible pure functions), and comes from the comparison with Haskell.
From "Ruminations of a Programmer"
Scala, being a hybrid OO-FP language has to take care of issues like subtyping (which Haskell does not have).
As mentioned in this PSE answer:
no way to restrict a subtype so that it can't do more than the type it inherits from.
For example, if the base class is immutable and defines a pure method foo(...), derived classes must not be mutable or override foo() with a function that is not pure
But the actual recommendation would be to use the best solution adapted to the program you are currently developing.
Focusing on subtyping, ignoring the issues related to classes, inheritance, OOP, etc.. We have the idea subtyping represents a isa relation between types. For example, types A and B have different operations but if A isa B we then can use any of B's operations on an A.
OTOH, using another traditional relation, if C hasa B then we can reuse any of B's operations on a C. Usually languages let you write one with a nicer syntax, a.opOnB instead of a.super.opOnB as it would be in the case of composition, c.b.opOnB
The problem is that in many cases there's more than one way to relate two types. For example Real can be embedded in Complex assuming 0 on the imaginary part, but Complex can be embedded in Real by ignoring the imaginary part, so both can be seen as subtypes of the other and subtyping forces one relation to be viewed as preferred. Also, there are more possible relations (e.g. view Complex as a Real using theta component of polar representation).
In formal terminology we usually say morphism to such relations between types and there are special kinds of morphisms for relations with different properties (e.g. isomorphism, homomorphism).
In a language with subtyping usually there's much more sugar on isa relations and given many possible embeddings we tend to see unnecessary friction whenever we're using the unpreferred relation. If we bring inheritance, classes and OOP to the mix the problem becomes much more visible and messy.
My answer does not answer why it is avoided but tries to give another hint at why it can be avoided.
Using "type classes" you can add an abstraction over existing types/classes without modifying them. Inheritance is used to express that some classes are specializations of a more abstract class. But with type classes you can take any existing classes and express that they all share a common property, for example they are Comparable. And as long as you are not concerned with them being Comparable you don't even notice it. The classes don't inherit any methods from some abstract Comparable type as long as you don't use them. It's a bit like programming in dynamic languages.
Further reads:
http://blog.tmorris.net/the-power-of-type-classes-with-scala-implicit-defs/
http://debasishg.blogspot.com/2010/07/refactoring-into-scala-type-classes.html
I don't know Scala, but I think the mantra 'prefer composition over inheritance' applies for Scala exactly the way it does for every other OO programming language (and subtyping is often used with the same meaning as 'inheritance'). Here
Prefer composition over inheritance?
you will find some more information.
I think lots of Scala programmers are former Java programmers. They are used to think in term of Object Oriented subtyping and they should be able to easily find OO-like solution for most problems. But Functional Programing is a new paradigm to discover, so people ask for a different kind of solutions.
This is the best paper I have found on the subject. A motivating quote from the paper –
We argue that while some of the simpler aspects of object-oriented languages are
compatible with ML, adding a full-fledged class-based object system to ML leads to an excessively complex
type system and relatively little expressive gain

Achieving polymorphism in functional programming

I'm currently enjoying the transition from an object oriented language to a functional language. It's a breath of fresh air, and I'm finding myself much more productive than before.
However - there is one aspect of OOP that I've not yet seen a satisfactory answer for on the FP side, and that is polymorphism. i.e. I have a large collection of data items, which need to be processed in quite different ways when they are passed into certain functions. For the sake of argument, let's say that there are multiple factors driving polymorphic behaviour so potentially exponentially many different behaviour combinations.
In OOP that can be handled relatively well using polymorphism: either through composition+inheritance or a prototype-based approach.
In FP I'm a bit stuck between:
Writing or composing pure functions that effectively implement polymorphic behaviours by branching on the value of each data item - feels rather like assembling a huge conditional or even simulating a virtual method table!
Putting functions inside pure data structures in a prototype-like fashion - this seems like it works but doesn't it also violate the idea of defining pure functions separately from data?
What are the recommended functional approaches for this kind of situation? Are there other good alternatives?
Putting functions inside pure data structures in a prototype-like fashion - this seems like it works but doesn't it also violate the idea of defining pure functions separately from data?
If virtual method dispatch is the way you want to approach the problem, this is a perfectly reasonable approach. As for separating functions from data, that is a distinctly non-functional notion to begin with. I consider the fundamental principle of functional programming to be that functions ARE data. And as for your feeling that you're simulating a virtual function, I would argue that it's not a simulation at all. It IS a virtual function table, and that's perfectly OK.
Just because the language doesn't have OOP support built in doesn't mean it's not reasonable to apply the same design principles - it just means you'll have to write more of the machinery that other languages provide built-in, because you're fighting against the natural spirit of the language you're using. Modern typed functional languages do have very deep support for polymorphism, but it's a very different approach to polymorphism.
Polymorphism in OOP is a lot like "existential quantification" in logic - a polymorphic value has SOME run-time type but you don't know what it is. In many functional programming languages, polymorphism is more like "universal quantification" - a polymorphic value can be instantiated to ANY compatible type its user wants. They're two sides of the exact same coin (in particular, they swap places depending on whether you're looking at a function from the "inside" or the "outside"), but it turns out to be extremely hard when designing a language to "make the coin fair", especially in the presence of other language features such as subtyping or higher-kinded polymorphism (polymorphism over polymorphic types).
If it helps, you may want to think of polymorphism in functional languages as something very much like "generics" in C# or Java, because that's exactly the type of polymorphism that, e.g., ML and Haskell, favor.
Well, in Haskell you can always make a type-class to achieve a kind of polymorphism. Basically, it is defining functions that are processed for different types. Examples are the classes Eq and Show:
data Foo = Bar | Baz
instance Show Foo where
show Bar = 'bar'
show Baz = 'baz'
main = putStrLn $ show Bar
The function show :: (Show a) => a -> String is defined for every data type that instances the typeclass Show. The compiler finds the correct function for you, depending on the type.
This allows to define functions more generally, for example:
compare a b = a < b
will work with any type of the typeclass Ord. This is not exactly like OOP, but you even may inherit typeclasses like so:
class (Show a) => Combinator a where
combine :: a -> a -> String
It is up to the instance to define the actual function, you only define the type - similar to virtual functions.
This is not complete, and as far as I know, many FP languages do not feature type classes. OCaml does not, it pushes that over to its OOP part. And Scheme does not have any types. But in Haskell it is a powerful way to achieve a kind of polymorphism, within limits.
To go even further, newer extensions of the 2010 standard allow type families and suchlike.
Hope this helped you a bit.
Who said
defining pure functions separately from data
is best practice?
If you want polymorphic objects, you need objects. In a functional language, objects can be constructed by glueing together a set of "pure data" with a set of "pure functions" operating on that data. This works even without the concept of a class. In this sense, a class is nothing but a piece of code that constructs objects with the same set of associated "pure functions".
And polymorphic objects are constructed by replacing some of those functions of an object by different functions with the same signature.
If you want to learn more about how to implement objects in a functional language (like Scheme), have a look into this book:
Abelson / Sussman: "Structure and Interpration of Computer programs"
Mike, both your approaches are perfectly acceptable, and the pros and cons of each are discussed, as Doc Brown says, in Chapter 2 of SICP. The first suffers from having a big type table somewhere, which needs to be maintained. The second is just traditional single-dispatch polymorphism/virtual function tables.
The reason that scheme doesn't have a built-in system is that using the wrong object system for the problem leads to all sorts of trouble, so if you're the language designer, which to choose? Single despatch single inheritance won't deal well with 'multiple factors driving polymorphic behaviour so potentially exponentially many different behaviour combinations.'
To synopsize, there are many ways of constructing objects, and scheme, the language discussed in SICP, just gives you a basic toolkit from which you can construct the one you need.
In a real scheme program, you'd build your object system by hand and then hide the associated boilerplate with macros.
In clojure you actually have a prebuilt object/dispatch system built in with multimethods, and one of its advantages over the traditional approach is that it can dispatch on the types of all arguments. You can (apparently) also use the heirarchy system to give you inheritance-like features, although I've never used it, so you should take that cum grano salis.
But if you need something different from the object scheme chosen by the language designer, you can just make one (or several) that suits.
That's effectively what you're proposing above.
Build what you need, get it all working, hide the details with macros.
The argument between FP and OO is not about whether data abstraction is bad, it's about whether the data abstraction system is the place to stuff all the separate concerns of the program.
"I believe that a programming language should allow one to define new data types. I do not believe that a program should consist solely of definitions of new data types."
http://www.haskell.org/haskellwiki/OOP_vs_type_classes#Everything_is_an_object.3F nicely discusses some solutions.

Object-oriented programming in a purely functional programming context?

Are there any advantages to using object-oriented programming (OOP) in a functional programming (FP) context?
I have been using F# for some time now, and I noticed that the more my functions are stateless, the less I need to have them as methods of objects. In particular, there are advantages to relying on type inference to have them usable in as wide a number of situations as possible.
This does not preclude the need for namespaces of some form, which is orthogonal to being OOP. Nor is the use of data structures discouraged. In fact, real use of FP languages depend heavily on data structures. If you look at the F# stack implemented in F Sharp Programming/Advanced Data Structures, you will find that it is not object-oriented.
In my mind, OOP is heavily associated with having methods that act on the state of the object mostly to mutate the object. In a pure FP context that is not needed nor desired.
A practical reason may be to be able to interact with OOP code, in much the same way F# works with .NET. Other than that however, are there any reasons? And what is the experience in the Haskell world, where programming is more pure FP?
I will appreciate any references to papers or counterfactual real world examples on the issue.
The disconnect you see is not of FP vs. OOP. It's mostly about immutability and mathematical formalisms vs. mutability and informal approaches.
First, let's dispense with the mutability issue: you can have FP with mutability and OOP with immutability just fine. Even more-functional-than-thou Haskell lets you play with mutable data all you want, you just have to be explicit about what is mutable and the order in which things happen; and efficiency concerns aside, almost any mutable object could construct and return a new, "updated" instance instead of changing its own internal state.
The bigger issue here is mathematical formalisms, in particular heavy use of algebraic data types in a language little removed from lambda calculus. You've tagged this with Haskell and F#, but realize that's only half of the functional programming universe; the Lisp family has a very different, much more freewheeling character compared to ML-style languages. Most OO systems in wide use today are very informal in nature--formalisms do exist for OO but they're not called out explicitly the way FP formalisms are in ML-style languages.
Many of the apparent conflicts simply disappear if you remove the formalism mismatch. Want to build a flexible, dynamic, ad-hoc OO system on top of a Lisp? Go ahead, it'll work just fine. Want to add a formalized, immutable OO system to an ML-style language? No problem, just don't expect it to play nicely with .NET or Java.
Now, you may be wondering, what is an appropriate formalism for OOP? Well, here's the punch line: In many ways, it's more function-centric than ML-style FP! I'll refer back to one of my favorite papers for what seems to be the key distinction: structured data like algebraic data types in ML-style languages provide a concrete representation of the data and the ability to define operations on it; objects provide a black-box abstraction over behavior and the ability to easily replace components.
There's a duality here that goes deeper than just FP vs. OOP: It's closely related to what some programming language theorists call the Expression Problem: With concrete data, you can easily add new operations that work with it, but changing the data's structure is more difficult. With objects you can easily add new data (e.g., new subclasses) but adding new operations is difficult (think adding a new abstract method to a base class with many descendants).
The reason why I say that OOP is more function-centric is that functions themselves represent a form of behavioral abstraction. In fact, you can simulate OO-style structure in something like Haskell by using records holding a bunch of functions as objects, letting the record type be an "interface" or "abstract base class" of sorts, and having functions that create records replace class constructors. So in that sense, OO languages use higher-order functions far, far more often than, say, Haskell would.
For an example of something like this type of design actually put to very nice use in Haskell, read the source for the graphics-drawingcombinators package, in particular the way that it uses an opaque record type containing functions and combines things only in terms of their behavior.
EDIT: A few final things I forgot to mention above.
If OO indeed makes extensive use of higher-order functions, it might at first seem that it should fit very naturally into a functional language such as Haskell. Unfortunately this isn't quite the case. It is true that objects as I described them (cf. the paper mentioned in the LtU link) fit just fine. in fact, the result is a more pure OO style than most OO languages, because "private members" are represented by values hidden by the closure used to construct the "object" and are inaccessible to anything other than the one specific instance itself. You don't get much more private than that!
What doesn't work very well in Haskell is subtyping. And, although I think inheritance and subtyping are all too often misused in OO languages, some form of subtyping is quite useful for being able to combine objects in flexible ways. Haskell lacks an inherent notion of subtyping, and hand-rolled replacements tend to be exceedingly clumsy to work with.
As an aside, most OO languages with static type systems make a complete hash of subtyping as well by being too lax with substitutability and not providing proper support for variance in method signatures. In fact, I think the only full-blown OO language that hasn't screwed it up completely, at least that I know of, is Scala (F# seemed to make too many concessions to .NET, though at least I don't think it makes any new mistakes). I have limited experience with many such languages, though, so I could definitely be wrong here.
On a Haskell-specific note, its "type classes" often look tempting to OO programmers, to which I say: Don't go there. Trying to implement OOP that way will only end in tears. Think of type classes as a replacement for overloaded functions/operators, not OOP.
As for Haskell, classes are less useful there because some OO features are more easily achieved in other ways.
Encapsulation or "data hiding" is frequently done through function closures or existential types, rather than private members. For example, here is a data type of random number generator with encapsulated state. The RNG contains a method to generate values and a seed value. Because the type 'seed' is encapsulated, the only thing you can do with it is pass it to the method.
data RNG a where RNG :: (seed -> (a, seed)) -> seed -> RNG a
Dynamic method dispatch in the context of parametric polymorphism or "generic programming" is provided by type classes (which are not OO classes). A type class is like an OO class's virtual method table. However, there's no data hiding. Type classes do not "belong" to a data type the way that class methods do.
data Coordinate = C Int Int
instance Eq Coordinate where C a b == C d e = a == b && d == e
Dynamic method dispatch in the context of subtyping polymorphism or "subclassing" is almost a translation of the class pattern in Haskell using records and functions.
-- An "abstract base class" with two "virtual methods"
data Object =
Object
{ draw :: Image -> IO ()
, translate :: Coord -> Object
}
-- A "subclass constructor"
circle center radius = Object draw_circle translate_circle
where
-- the "subclass methods"
translate_circle center radius offset = circle (center + offset) radius
draw_circle center radius image = ...
I think that there are several ways of understanding what OOP means. For me, it is not about encapsulating mutable state, but more about organizing and structuring programs. This aspect of OOP can be used perfectly fine in conjunction with FP concepts.
I believe that mixing the two concepts in F# is a very useful approach - you can associate immutable state with operations working on that state. You'll get the nice features of 'dot' completion for identifiers, the ability to easy use F# code from C#, etc., but you can still make your code perfectly functional. For example, you can write something like:
type GameWorld(characters) =
let calculateSomething character =
// ...
member x.Tick() =
let newCharacters = characters |> Seq.map calculateSomething
GameWorld(newCharacters)
In the beginning, people don't usually declare types in F# - you can start just by writing functions and later evolve your code to use them (when you better understand the domain and know what is the best way to structure code). The above example:
Is still purely functional (the state is a list of characters and it is not mutated)
It is object-oriented - the only unusual thing is that all methods return a new instance of "the world"

Difference in compiler design for OOP languages

I'm doing research on how the design of a compiler for an OOP language differs from traditional imperative languages. I'd just like some topics to send me on my way, and if you wish, you can explain them.
For eg. I found that the type table is built differently.
Before the "compiler design" can be explored, I think the more fundamental question of "language design" needs to be addressed.
Should the language be statically typed? Dynamically typed? Early/late bound or a combination? Supporting generics? Is inference a goal? Should types be closed or open? How should sub-typing work? (Should implicit sub-typing be allowed at all?) Covariance? Contravariance? Single-inheritance? MI? SI with Traits? Explicit memberwise-selection? Prototypal (That is, should there even be a notion of "class" and "instance"?) Should types in nominative or based off of member signatures? Single-dispatch or multiple-dispatch? Are members invoked as first-class citizens or message-passing? Are types the same as classes? Is there a distinction between "value" and "reference" types? Etc, etc, etc... and this is just the tip of a very large iceberg.