Modelling object-oriented program in Coq - oop

I want to prove some facts about an imperative object-oriented program. How can I represent a heterogeneous object graph in Coq? My main problem is that edges are implicit: each node consists of an integer label modelling the object's address and a data structure modelling the object's state. Implicit edges are formed by fields inside the data structure that model object pointers and contain the address label of another node in the graph. To ensure that my graph is valid, adding a new node to the graph must require a proof that all pointer fields in the data structure being added refer to nodes that already exist in the graph. But how can I express 'all pointer fields in a data structure' in Coq?

It depends on how you represent a data structure, and what kinds of features the language you want to model has. Here's one possibility. Let's say that your language has two kinds of values: numbers and object references. We can write this type in Coq as:
Inductive value : Type :=
| VNum (n : nat)
| VRef (ref : nat).
A reference (or pointer) is just a natural number that can be used to uniquely identify objects on the heap. We can use functions to represent both objects and the heap as follows:
Definition object : Type := string -> option value.
Definition heap : Type := nat -> option object.
Paraphrasing in English, an object is a partial function from strings (which we use to model fields in the object) to values, and a heap is a partial function from nats (that is, object references) to objects. We can then express your property as:
Definition object_ok (o : object) (h : heap) : Prop :=
forall (s : string) (ref : nat),
o s = Some (VRef ref) ->
exists obj, h ref = Some obj.
Again, in English: if the field s of the object o is defined, and equal to a reference ref, then there exists some object obj stored at that address on the heap h.
The one problem with that representation is that Coq functions make it possible for heaps to have infinitely many objects, and objects to have infinitely many fields. You can circumvent this problem with an alternative representation that only allows for functions defined on finitely many inputs, such as lists of pairs, or (even better) a type of finite maps, such as this one.
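As a rough executable analogue of the finite-map idea, here is a minimal sketch in Haskell rather than Coq. It assumes Data.Map from the containers package as the finite map; the names Object, Heap, objectOk and addObject are chosen for illustration only. The check is a decidable Bool version of object_ok, not a proof.

```haskell
import qualified Data.Map as Map

-- Values are numbers or references, mirroring the Coq `value` type.
data Value = VNum Int | VRef Int
  deriving (Eq, Show)

-- Finite maps replace the partial functions, so objects have finitely
-- many fields and heaps hold finitely many objects.
type Object = Map.Map String Value
type Heap   = Map.Map Int Object

-- True when every reference field of the object points at an address
-- that is already present in the heap.
objectOk :: Object -> Heap -> Bool
objectOk o h = all fieldOk (Map.elems o)
  where
    fieldOk (VRef ref) = Map.member ref h
    fieldOk (VNum _)   = True

-- Add an object at the given address only if all its references resolve.
addObject :: Int -> Object -> Heap -> Maybe Heap
addObject addr o h
  | objectOk o h = Just (Map.insert addr o h)
  | otherwise    = Nothing
```

In Coq the same shape would carry a proof of object_ok as an argument instead of returning Maybe.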

Related

Can Functor instance be declared with additional type restriction for function

I'm working on porting GHC/Arr.hs into Frege.
Array is defined:
data Array i e = Array{u,l::i,n::Int,elems::(JArray e)}
There is function:
amap :: (Ix i, ArrayElem e) => (a -> b) -> Array i a -> Array i b
Now, I don't know how to define Functor instance for it, because
instance (Ix i) => Functor (Array i) where
fmap = amap
But the compiler complains that the inferred type is more constrained than expected, which seems true. Can I make Array a functor with a restriction to functions ArrayElem -> ArrayElem?
No, this is not possible.
If you base Array on JArray and want a functor instance, you must not use any functions that give rise to the ArrayElem (or any other additional) context.
Another way to say this is that you cannot base Array on type safe java arrays, but must deal with java arrays of type Object[]. Because, as you have without doubt noted, the ArrayElem type class is just a trick to be able to provide the correct java type on creation of a java array. This is, of course, important for interfacing with Java and for performance reasons.
Note that there is another problem with type safe java arrays. Let's say we want to make an array of Double (but the same argument holds for any other element type). AFAIK, Haskell mandates that array elements must be lazy. Hence we really cannot use the java type double[] (to which JArray Double would be the Frege counterpart) to model it, because if we did, every array element would have to be evaluated as soon as it is set.
For this reason, I suggest you use some general custom array element type, like
data AElem a = AE () a
mkAE = AE ()
unAE (AE _ x) = x
derive ArrayElement AElem
and change your definition:
data Array i e = Array{u,l::i,n::Int,elems::(JArray (AElem e))}
Now your functor instance can be written, because the ArrayElem constraint no longer arises: when you access the elems array, the compiler knows that you have AElem elements and can and will supply the correct instance.
In addition, construction of AElems and usage of AElems as actual array elements does not impose strictness on the actual value.
Needless to say, the user of the Array module should not (need to) know about those implementation details, that is, the AElem type.
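The wrapper idea can be illustrated with a small Haskell sketch. Haskell has no ArrayElem class or JArray, so this is only an analogy under stated assumptions: a plain list stands in for the Java array, and Arr, fromList, size and index are hypothetical names. It shows that wrapping elements in AElem lets fmap be written without any extra constraint and without forcing the elements.

```haskell
-- Box around an element; constructing it does not evaluate the payload.
data AElem a = AE () a

unAE :: AElem a -> a
unAE (AE _ x) = x

-- Stand-in for an array of boxed elements (a list replaces JArray here).
newtype Arr e = Arr [AElem e]

-- No constraint on the element type is needed, so Functor is definable.
instance Functor Arr where
  fmap f (Arr es) = Arr [AE () (f (unAE e)) | e <- es]

fromList :: [e] -> Arr e
fromList = Arr . map (AE ())

size :: Arr e -> Int
size (Arr es) = length es

index :: Arr e -> Int -> e
index (Arr es) i = unAE (es !! i)
```

Mapping over an array that contains an undefined element is harmless as long as that element is never demanded.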

Standard ML: Datatype vs. Structure

I'm reading through Paulson's ML For the Working Programmer and am a bit confused about the distinction between datatypes and structures.
On p. 142, he defines a type for binary trees as follows:
datatype 'a tree = Lf
| Br of 'a * 'a tree * 'a tree;
This seems to be a recursive definition where 'a denotes some fixed type. So any time I see 'a, it must refer to the same type throughout.
On p. 148, he discusses a structure for binary trees:
"...we have been following an imaginary ML session in which we typed in the tree functions one at a time. Now we ought to collect the most important of those functions into a structure, called Tree. We really must do so, because one of our functions (size) clashes with a built-in function. One reason for using structures is to prevent such name clashes.
We shall, however, leave the datatype declaration of tree outside of the structure. If it were inside, we should be forced to refer to the constructors by Tree.Lf and Tree.Br, which would make our patterns unreadable. Thus, in the sequel, imagine that we have made the following declarations:"
datatype 'a tree = Lf
| Br of 'a * 'a tree * 'a tree;
structure Tree =
struct
fun size Lf = 0
| size (Br( v, t1, t2)) = 1 + size t1 + size t2;
fun depth...
etc...
end;
I'm a little confused.
1) What is the relationship between a datatype and a structure?
2) What is the role of "struct" within the structure definition?
3) Later on, Paulson discusses a structure for dictionaries as binary search trees. He does the following:
structure Dict : DICTIONARY =
struct
type key = string;
type 'a t = (key * 'a) tree;
val empty = Lf;
<a bunch of functions for dictionaries>
This makes me think struct specifies the different primitive or compound types involved in the definition of a Dict.
That's a really fuzzy definition though. Anyone like to clarify?
Thanks for the help,
bclayman
A structure is a module. Everything between the struct and end keywords forms the body of this module. Similarly, you can view a signature as the description of an abstract module interface. Ascribing a signature to a structure (like the : DICTIONARY syntax does in your example) limits the exports of the module to what is specified in that signature (by default, everything would be accessible). That allows you to hide implementation details of a module.
However, ML modules are much richer than that. They can be arbitrarily nested. There are also functors, which are effectively functions from modules to modules ("parameterised modules", if you want). Altogether, the module language in ML forms a full functional language on its own, with structures as the basic entities, functors over them, and signatures describing the "types" of such modules. This little language is a layer on top of the so-called core language, where ordinary values and types live.
So, to answer your individual questions:
1) There is no specific relationship between the datatype and the structure. The latter simply uses the former.
2) struct-end is simply a keyword pair to delimit the structure body (languages in C tradition would probably use curly braces there).
3) As explained above, a structure is a basic module. It can contain (and export) arbitrary other language entities, including other modules. By grouping definitions together, and potentially hiding some of them through a signature ascription, you can express namespacing and encapsulation (in particular, abstract data types).
I should also note that Paulson's book is outdated regarding its description of modules, as it predates the current language version. In particular, it does not describe how to express abstract data types through modules, but instead introduces the obsolete abstype declaration which nobody has been using in almost 20 years. A more extensive and up-to-date introduction to modular programming in ML can be found in Harper's Programming in Standard ML.
In this example, the datatype 'a tree is describing a binary tree (https://en.wikipedia.org/wiki/Binary_tree) that is capable of storing values of any single type. The 'a in the definition is a type variable, which is instantiated to a concrete type wherever tree is used. This allows you to define the structure of a tree once and then use it with any element type later on.
The Tree structure is separate from the datatype definition. It is being used to group functions together that operate on the 'a tree datatype. It is being used right now as a way to modularize the code and, as it points out, to prevent namespace clashes.
struct is just a keyword that tells the compiler where the body of your structure starts, while the end keyword tells the compiler where it ends.
The dictionary structure is defining a dictionary (a key -> value data structure) that uses a tree as the internal data structure. Once again, the structure is a collection of functions that will be used to create and operate on dictionaries. The types within the dictionary structure compose the type of the internal data structure that makes up the dictionary. The following functions define the public interface that you're exposing to allow clients to work with dictionaries.
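To make the role of the type variable concrete, here is a minimal sketch of the same tree in Haskell (not Standard ML). The functions mirror size and depth from the question; intTree and strTree are hypothetical example values. Haskell has no struct ... end, so the grouping that the Tree structure provides is not shown here.

```haskell
-- Same shape as: datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree
data Tree a = Lf | Br a (Tree a) (Tree a)

-- Number of Br nodes in the tree.
size :: Tree a -> Int
size Lf           = 0
size (Br _ t1 t2) = 1 + size t1 + size t2

-- Length of the longest path from the root to a leaf.
depth :: Tree a -> Int
depth Lf           = 0
depth (Br _ t1 t2) = 1 + max (depth t1) (depth t2)

-- The type variable is instantiated differently at each use.
intTree :: Tree Int
intTree = Br 1 (Br 2 Lf Lf) Lf

strTree :: Tree String
strTree = Br "a" Lf Lf
```

Both size and depth work on intTree and strTree alike, because they never inspect the stored values.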

Difference between modules and existentials

It's folk knowledge that OCaml modules are "just" existential types. That there's some kind of parity between
module type X = sig type t val x : t end
and
type 'a spec = { x : 'a }
type x = X : 'a spec -> x
and this isn't untrue exactly.
But as I just evidenced, OCaml has both modules and existential types. My question is:
How do they differ?
Is there anything which can be implemented in one but not the other?
When would you use one over the other (in particular comparing first-class modules with existential types)?
Completing gsg's answer on your third point.
There are two ways to use modules:
As a structuring construct, when you declare toplevel modules. In that case you are not really manipulating existential variables. When encoding the module system in System F, you would effectively represent the abstract types by existential variables, but morally, it is closer to a fresh singleton type.
As a value, when using first-class modules. In that case you are clearly manipulating existential types.
The other representations of existential types are through GADTs and objects. (It is also possible to encode an existential as the negation of a universal with records, but that usage has been completely replaced by first-class modules.)
Choosing between those three options depends on the context.
If you want to provide a lot of functions for your type, you will prefer modules or objects. If only a few, you may find the syntax for modules or objects too heavyweight and prefer GADTs. GADTs can also reveal the structure of your type, for instance:
type _ ty =
| List : 'a ty -> 'a list ty
| Int : int ty
type exist = E : 'a ty * 'a -> exist
If you are in that kind of case, you do not need to propagate the functions working on that type, so you will end up with something a lot lighter with GADT existentials. With modules this would look like
module type Exist = sig
type t
val t : t ty
end
module Int_list : Exist = struct
type t = int list
let t = List Int
end
let int_list = (module Int_list:Exist)
And if you need sub-typing or late binding, go for objects. This can often be encoded with modules, but it tends to be tedious.
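The GADT-based existential can also be sketched in Haskell (not OCaml), assuming the GADTs extension. The names Ty, TInt, TList, Exist and render are illustrative. The point is that the type representation packed next to the value is enough to recover how to work with it, so no table of functions has to travel along.

```haskell
{-# LANGUAGE GADTs #-}

import Data.List (intercalate)

-- Type representation: the index says which type is described.
data Ty a where
  TInt  :: Ty Int
  TList :: Ty a -> Ty [a]

-- Existential package: some type a, its representation, and a value.
data Exist where
  E :: Ty a -> a -> Exist

-- Matching on the representation refines the type of the value.
render :: Exist -> String
render (E ty v) = go ty v
  where
    go :: Ty b -> b -> String
    go TInt      n  = show n
    go (TList t) xs = "[" ++ intercalate "," (map (go t) xs) ++ "]"
```

For example, render (E (TList TInt) [1, 2, 3]) walks the representation to print the list.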
It's specifically abstract types that have existential type. Modules without abstract types can be explained without existentials, I think.
Modules have features other than abstract types: they act as namespaces, they are structurally typed, they support operations like include and module type of, they allow private types, etc.
A notable difference is that functors allow ranging over types of any (fixed) arity, which is not possible with type variables because OCaml lacks higher kinded types:
module type M = sig
type 'a t
val x : 'a t
end
I'm not quite sure how to answer your last question. Modules and existentials are different enough in practice that the question of when to substitute one for the other hasn't come up.

Frege: can I derive "Show" for a recursive type?

I'm trying to implement the classical tree structure in frege, which works nicely as long as I don't use "derive":
data Tree a = Node a (Tree a) (Tree a)
| Empty
derive Show Tree
gives me
realworld/chapter3/E_Recursive_Types.fr:7: kind error,
type constructor `Tree` has kind *->*, expected was *
Is this not supported or do I have to declare it differently?
Welcome to the world of type kinds!
You must give the full type of the items you want to show. Tree is not a type (kind *), but something that needs a type parameter to become one (kind * -> *).
Try
derive Show (Tree a)
Note that this is shorthand for
derive Show (Show a => Tree a)
which reflects the fact that, to show a tree, you also need to know how to show the values in the tree (at least, the code generated by derive will need to know this; of course, one could write an instance manually that prints just the shape of the tree and so would not need it).
Generally, the kind needed in instances for every type class is fixed. The error message tells you that you need kind * for Show.
EDIT: eliminate another possible misconception
Note that this has nothing to do with your type being recursive. Let's take, for example, the definition of optional values:
data Maybe a = Nothing | Just a
This type is not recursive, and yet we still cannot say:
derive Show Maybe -- same kind error as above!!
But, given the following type class:
class ListSource c where -- things we can make a list from
toList :: c a -> [a]
we need to say:
instance ListSource Maybe where
toList (Just x) = [x]
toList Nothing = []
(instance and derive are equivalent for the sake of this discussion: both make instances, the difference being that derive generates the instance functions automatically for certain type classes.)
It is, admittedly, not obvious why it is this way in one case and differently in the other. The key is, in every case, the type of the class operation we want to use. For example, in class Show we have:
class Show s where
show :: s -> String
Now, we see that the so-called class type variable s (which represents any future instantiated type expression) appears on its own on the left of the function arrow. This, of course, indicates that s must be a plain type (kind *), because we pass a value to show and every value has by definition a type of kind *. We can have values of types Int or Maybe Int or Tree String, but no value ever has a type Maybe or Tree.
On the other hand, in the definition of ListSource, the class type variable c is applied to some other type variable a in the type of toList, which also appears as list element type. From the latter, we can conclude, that a has kind * (because list elements are values). We know, that the type to the left and to the right of a function arrow must have kind * also, since functions take and return values. Therefore, c a has kind *. Thus, c alone is something that, when applied to a type of kind * yields a type of kind *. This is written * -> *.
This means, in plain English, that when we want to make an instance for ListSource we need the type constructor of some "container" type that is parameterized with another type. Tree and Maybe would be possible here, but not Int.
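The same kind distinction can be sketched in Haskell (not Frege), where the syntax differs slightly: deriving Show on the data declaration produces an instance for Tree a with a Show a constraint, while ListSource takes the bare type constructor. The Tree instance of ListSource is an added illustration, using an in-order traversal.

```haskell
-- Deriving yields: instance Show a => Show (Tree a), a type of kind *.
data Tree a = Node a (Tree a) (Tree a)
            | Empty
  deriving Show

-- The class variable c is applied to a, so c must have kind * -> *.
class ListSource c where
  toList :: c a -> [a]

instance ListSource Maybe where
  toList (Just x) = [x]
  toList Nothing  = []

-- Tree alone (kind * -> *) is what the instance head needs here.
instance ListSource Tree where
  toList Empty        = []
  toList (Node x l r) = toList l ++ [x] ++ toList r
```

Writing instance ListSource Int would be rejected with a kind error, for the mirror-image reason that derive Show Tree is.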

How to implement this OOP case in Haskell?

In the project I have several different types, defined in different modules, and each of them has related functions (the functions have the same name and very similar meaning, so the following makes sense). Now I want to create a list in which it will be possible to have instances of all these types (simultaneously). The only possibility I can think of is something like this:
data Common = A{...} | B{...} | ...
but it implies keeping the definition in a single place, and not in different modules (for A, B, ...). Is there a better way to do this?
UPD
I'm rather new to haskell and write some programs related to my studying. In this case I have different FormalLanguage definition methods: FiniteAutomata, Grammars and so on. Each of them has common functions (isAccepted, representation, ...), so it seemed logical to have a list where elements can be of any of these types.
You are bringing an OOP mindset to Haskell by assuming the correct solution is to store distinct types in a list. I'll begin by examining that assumption.
Usually we want to store values of distinct types in one list because they support a common interface. Why not just factor out the common interface and store THAT in the list?
Unfortunately, your question does not describe what that common interface is, so I will just introduce a few common examples as demonstrations.
The first example would be a bunch of values, x, y, and z, that all support the show function, which has the signature:
(Show a) => a -> String
Instead of storing the type we want to show later on, we could instead just call show directly on the values and store the resulting strings in the list:
list = [show x, show y, show z] :: [String]
There's no penalty for calling show prematurely because Haskell is a lazy language and won't actually evaluate the shows until we actually need the string.
Or perhaps the type supports multiple methods, such as:
class Contrived m where
f1 :: m -> String -> Int
f2 :: m -> Double
We can transform classes of the above form into equivalent dictionaries that contain the result of partially applying the methods to our values:
data ContrivedDict = ContrivedDict {
f1' :: String -> Int,
f2' :: Double }
... and we can use this dictionary to package any value into the common interface we expect it to support:
buildDict :: (Contrived m) => m -> ContrivedDict
buildDict m = ContrivedDict { f1' = f1 m, f2' = f2 m }
We can then store this common interface itself in the list:
list = [buildDict x, buildDict y, buildDict z] :: [ContrivedDict]
Again, instead of storing the distinctly-typed values, we've factored out their common elements for storage in the list.
However, this trick won't always work. The pathological example is any binary operator that expects two operands of equal type, such as the (+) operator from the Num class, which has the following type:
(Num a) => a -> a -> a
As far as I know, there is no good dictionary-based solution for partially applying a binary operation and storing it in such a way that it guarantees it is applied to a second operand of the same type. In this scenario the existential type class is probably the only valid approach. However, I recommend you stick to the dictionary-based approach when possible as it permits more powerful tricks and transformations than the type-class-based approach.
For more on this technique, I recommend you read Luke Palmer's article: Haskell Antipattern: Existential Typeclass.
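Putting the dictionary pieces together, a minimal complete sketch might look like this. The class, the record and buildDict come from the answer above; the types Offset and Scale, their instances, and the names dicts and results are hypothetical and exist only to have two distinct types to package.

```haskell
class Contrived m where
  f1 :: m -> String -> Int
  f2 :: m -> Double

-- The methods, already partially applied to some value.
data ContrivedDict = ContrivedDict
  { f1' :: String -> Int
  , f2' :: Double
  }

buildDict :: Contrived m => m -> ContrivedDict
buildDict m = ContrivedDict { f1' = f1 m, f2' = f2 m }

-- Two unrelated (hypothetical) types supporting the interface.
newtype Offset = Offset Int
newtype Scale  = Scale Double

instance Contrived Offset where
  f1 (Offset n) s = n + length s
  f2 (Offset n)   = fromIntegral n

instance Contrived Scale where
  f1 (Scale k) s = round (k * fromIntegral (length s))
  f2 (Scale k)   = k

-- A homogeneous list of interfaces built from heterogeneous values.
dicts :: [ContrivedDict]
dicts = [buildDict (Offset 2), buildDict (Scale 0.5)]

results :: [Int]
results = map (\d -> f1' d "abcd") dicts
```

The list element type is just ContrivedDict, so no language extension is needed.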
There are a few possibilities:
Possibility 1:
data Common = A AT | B BT | C CT
with AT, BT and CT described in their respective modules
Possibility 2:
{-# LANGUAGE ExistentialQuantification #-}
class CommonClass a where
f1 :: a -> Int
data Common = forall a . CommonClass a => Common a
which is almost the same as an OOP superclass, but you cannot do "downcasts". You can then declare implementations for members of common classes in all the modules.
Possibility 3 suggested by @Gabriel Gonzalez:
data Common = Common {
f1 :: Int
}
So your modules implement common interface by using closures to abstract over the 'private' part.
However, Haskell design is usually radically different from OOP design. While it's possible to implement every OOP trick in Haskell, it will likely be non-idiomatic, so as @dflemstr said, more information about your problem is welcome.
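For comparison, Possibility 2 can be fleshed out as the following sketch. The class and the Common wrapper are taken from the answer; the types AT and BT, their instances, and the helper applyF1 are hypothetical stand-ins for types that would live in separate modules.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class CommonClass a where
  f1 :: a -> Int

-- Wraps any value whose type has a CommonClass instance.
data Common = forall a . CommonClass a => Common a

-- Hypothetical types, each of which could be defined in its own module.
newtype AT = AT Int
newtype BT = BT String

instance CommonClass AT where
  f1 (AT n) = n

instance CommonClass BT where
  f1 (BT s) = length s

-- One list holding values of both types.
items :: [Common]
items = [Common (AT 7), Common (BT "abc")]

-- Unwrapping only gives access to the class methods (no "downcast").
applyF1 :: Common -> Int
applyF1 (Common x) = f1 x
```

After unwrapping, the only thing known about x is that f1 applies to it, which is why the record of closures in Possibility 3 carries the same information.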