I want to use OCaml to generate sets of data and make comparisons between them. I have seen the documentation for module types like Set.OrderedType, Set.Make, etc., but I can't figure out how to initialize a set or otherwise use them.
Sets are defined using a functorial interface. For any given type, you have to create a Set module for that type using the Set.Make functor. An unfortunate oversight of the standard library is that it doesn't define Set instances for the built-in types. In most simple cases, it's sufficient to use the polymorphic Pervasives.compare (called Stdlib.compare since OCaml 4.07; the Pervasives module itself was removed in OCaml 5.0). Here's a definition that works for int:
module IntSet = Set.Make(
struct
let compare = compare (* the polymorphic compare from Pervasives/Stdlib *)
type t = int
end )
The module IntSet will implement the Set.S interface. Now you can operate on sets using the IntSet module:
let s = IntSet.empty ;;
let t = IntSet.add 1 s ;;
let u = IntSet.add 2 s ;;
let tu = IntSet.union t u ;;
Note that you don't have to explicitly define the input structure for Set.Make as an OrderedType; type inference will do the work for you. Alternatively, you could use the following definition (the with type t = int constraint is needed; without it the element type becomes abstract and you can no longer add ints to the set):
module IntOrder : Set.OrderedType with type t = int = struct
type t = int
let compare = compare
end
module IntSet = Set.Make( IntOrder )
This has the advantage that you can re-use the same module to instantiate a Map:
module IntMap = Map.Make( IntOrder )
You lose some genericity in using functors, because the type of the elements is fixed. For example, you won't be able to define a function that takes a Set of some arbitrary type and performs some operation on it. (Luckily, the Set module itself declares many useful operations on Sets.)
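One way to recover some of that genericity is to wrap the function in a functor over Set.S, so it can be instantiated for any set module. The following is only a sketch: the names SetUtils and overlaps are made up for illustration, and it assumes OCaml 4.08+ for the standard Int module.

```ocaml
(* Hypothetical helper functor: works for any module produced by Set.Make. *)
module SetUtils (S : Set.S) = struct
  (* true when the two sets share at least one element *)
  let overlaps a b = not (S.is_empty (S.inter a b))
end

module IntSet = Set.Make (Int)
module IntSetUtils = SetUtils (IntSet)

let shares_element =
  IntSetUtils.overlaps (IntSet.of_list [1; 2]) (IntSet.of_list [2; 3])

let no_shared_element =
  IntSetUtils.overlaps (IntSet.of_list [1; 2]) (IntSet.of_list [3; 4])
```

The price is one functor application per set module, which is the same trade-off as Set.Make itself.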
In addition to Chris's answer, it may be useful to say that some standard library modules already adhere to the OrderedType signature. For example, you can simply do:
module StringSet = Set.Make(String) ;; (* sets of strings *)
module Int64Set = Set.Make(Int64) ;; (* sets of int64s *)
module StringSetSet = Set.Make(StringSet) ;; (* sets of sets of strings *)
And so on.
Here's a simple usage example for StringSet; remember that sets are functional data structures, so adding a new element to a set returns a new set:
let set = List.fold_right StringSet.add ["foo";"bar";"baz"] StringSet.empty ;;
StringSet.mem "bar" set ;; (* returns true *)
StringSet.mem "zzz" set ;; (* returns false *)
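Since the question was also about comparing sets, here is a small sketch of the comparison operations that every Set.S module provides (inter, diff, equal, subset). The example values are arbitrary.

```ocaml
module StringSet = Set.Make (String)

let a = StringSet.of_list ["foo"; "bar"; "baz"]
let b = StringSet.of_list ["bar"; "qux"]

(* elements present in both sets *)
let common = StringSet.inter a b

(* elements of a that are not in b *)
let only_in_a = StringSet.diff a b

(* structural comparisons between sets *)
let same = StringSet.equal a b
let bar_subset = StringSet.subset (StringSet.singleton "bar") a
```

StringSet.elements returns the members in increasing order, which is handy for printing or inspecting the results.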
In OCaml, to bring another module in scope you can use open. But what about code like this:
module A = struct
include B.C
module D = B.E
end
Does this create an entirely new module called A that has nothing to do with the modules created by B? Or are the types in B equivalent to those in this new structure, so that a type A.t can be used interchangeably with B.C.t, for example?
Especially, comparing to Rust I believe this is very different from writing something like
pub mod a {
pub use b::c::*;
pub use b::e as d;
}
Yes, module A = struct include B.C end creates an entirely new module and exports all definitions from B.C. All abstract types and data types that are imported from B.C are explicitly related to that module.
In other words, suppose you have
module Inner = struct
type imp = Foo
type t = int
end
so when we import Inner we can access the Inner definitions,
module A = struct
include Inner
let x : imp = Foo
let y : t = 1
end
and the Foo constructor in A belongs to the same type as the Foo constructor in the Inner module so that the following typechecks,
A.x = Inner.Foo
In other words, include is not a mere copy-paste, but something like this,
module A = struct
(* include Inner expands to *)
type imp = Inner.imp = Foo
type t = Inner.t = int
end
This operation of preserving type equalities is formally called strengthening and is always applied when OCaml infers a module type. In other words, the type system never forgets the type sharing constraints, and the only way to remove them is to explicitly specify a module type that doesn't expose the sharing constraints (or use the module type of construct, see below).
For example, if we define a module type
module type S = sig
type imp = Foo
type t = int
end
then
module A = struct
include (Inner : S)
end
will generate a new type imp, so A.Foo = Inner.Foo will no longer type check. The same could be achieved with the module type of construct, which explicitly disables module type strengthening,
module A = struct
include (Inner : module type of Inner)
end
will again produce an A.Foo that is distinct from Inner.Foo. Note that type t will still be compatible in all of these implementations, as it is a manifest type and A.t is equal to Inner.t not via a sharing constraint but because both are equal to int.
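The difference can be checked with a small self-contained sketch. The module names A and B and the values x, n are only for illustration; the rejected comparison is left as a comment because it would not compile.

```ocaml
module Inner = struct
  type imp = Foo
  type t = int
end

module type S = sig
  type imp = Foo
  type t = int
end

(* plain include keeps the sharing constraint: A.imp = Inner.imp *)
module A = struct
  include Inner
  let x : imp = Foo
end

(* sealing with S forgets the constraint: B.imp is a fresh type *)
module B = struct
  include (Inner : S)
  let n : t = 1
end

(* accepted, because A.x has type Inner.imp *)
let same_constructor = (A.x = Inner.Foo)

(* accepted, because t is manifestly int in both modules *)
let still_an_int : Inner.t = B.n + 1

(* let rejected = (B.Foo = Inner.Foo)   would not type check *)
```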
Now, you might wonder what the difference is between,
module A = Inner
and
module A = struct include Inner end
The answer is simple. Semantically they are equivalent. Moreover, the former is not a module alias as you might think. Both are module definitions. And both will define a new module A with exactly the same module type.
A module alias is a feature that exists on the (module) type level, i.e., in the signatures, e.g.,
module Main : sig
module A = Inner (* now this is the module alias *)
end = struct
module A = Inner
end
So what the module alias is saying, on the module level, is that A not only has the same type as Inner but is exactly the Inner module. This opens up a few opportunities for the compiler and toolchain. For example, the compiler may eschew module copying as well as enable module packing.
But all this has nothing to do with the observed semantics and especially with the typing. If we forget about the explicit equality (which, again, is used mostly for more optimal module packing, e.g., in dune), then the following definition of the module A
module Main = struct
module A = Inner
end
is exactly the same as the above that was using the module aliasing. Anything that was typed with the previous definition will be typed with the new definition (modulo module type aliases). It is as strong. And the following is as strong,
module Main = struct
module A = struct include Inner end
end
and even the following,
module Main : sig
module A : sig
type imp = Inner.imp = Foo
type t = Inner.t = int
end
end = struct
module A = Inner
end
In F#, I'd simply do:
> let x = Set.empty;;
val x : Set<'a> when 'a : comparison
> Set.add (2,3) x;;
val it : Set<int * int> = set [(2, 3)]
I understand that in OCaml, when using Base, I have to supply a module with comparison functions, e.g., if my element type was string
let x = Set.empty (module String);;
val x : (string, String.comparator_witness) Set.t = <abstr>
Set.add x "foo";;
- : (string, String.comparator_witness) Set.t = <abstr>
But I don't know how to construct a module that has comparison functions for the type int * int. How do I construct/obtain such a module?
To create an ordered data structure, like Map, Set, etc., you have to provide a comparator. In Base, a comparator is a first-class module (a module packed into a value) that provides a comparison function and a type index that witnesses this function. Wait, what? More on that later; let us first define a comparator. If you already have a module that has the type
module type Comparator_parameter = sig
type t (* the carrier type *)
(* the comparison function *)
val compare : t -> t -> int
(* for introspection and debugging, use `sexp_of_opaque` if not needed *)
val sexp_of_t : t -> Sexp.t
end
then you can just pass it to the Base.Comparator.Make functor and build the comparator,
module Lexicographical_order = struct
include Pair
include Base.Comparator.Make(Pair)
end
where the Pair module provides the compare function,
module Pair = struct
type t = int * int [@@deriving compare, sexp_of]
end
Now, we can use the comparator to create ordered structures, e.g.,
let empty = Set.empty (module Lexicographical_order)
If you do not want to create a separate module for the order (for example, because you can't come up with a good name for it), then you can use anonymous modules, like this
let empty' = Set.empty (module struct
include Pair
include Base.Comparator.Make(Pair)
end)
Note that the Pair module passed to the Base.Comparator.Make functor has to be bound at the global scope; otherwise, the typechecker will complain. This is all about the witness value. So what is this witness about, and what does it witness?
The semantics of any ordered data structure, like Map or Set, depends on the order function. It is an error to compare two sets that were built with different orders; e.g., if you have two sets built from the same numbers, but one with the ascending order and the other with the descending order, they will be treated as different sets.
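To make the point concrete, here is a sketch that uses the standard library's Set.Make instead of Base (an assumption made only to keep the example self-contained; Int.compare needs OCaml 4.08+). The module names Asc and Desc are hypothetical.

```ocaml
(* the same element type, ordered in two opposite ways *)
module Asc = Set.Make (struct
  type t = int
  let compare = Int.compare
end)

module Desc = Set.Make (struct
  type t = int
  let compare a b = Int.compare b a
end)

let asc = Asc.of_list [1; 2; 3]
let desc = Desc.of_list [1; 2; 3]

(* the same numbers, but each set enumerates them in its own order *)
let asc_elements = Asc.elements asc
let desc_elements = Desc.elements desc

(* Asc.equal asc desc   would not type check: Asc.t and Desc.t differ *)
```

With the functor approach each order gets its own set type; Base achieves the same protection with a single polymorphic Set.t by putting the witness into the type.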
Ideally, such errors should be prevented by the type checker. For that we need to encode the order, used to build the set, in the set's type. And this is what Base is doing, let's look into the empty' type,
val empty' : (int * int, Comparator.Make(Pair).comparator_witness) Set.t
and the empty type
val empty : (Lexicographical_order.t, Lexicographical_order.comparator_witness) Set.t
Surprisingly, the compiler is able to see through the name differences (because modules have structural typing) and understand that Lexicographical_order.comparator_witness and Comparator.Make(Pair).comparator_witness are witnessing the same order, so we can even compare empty and empty',
# Set.equal empty empty';;
- : bool = true
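To solidify our knowledge, let's build a set of pairs in the reversed order (Pair_reveresed_compare is assumed to be a module like Pair whose compare function is reversed; its definition is not shown here),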
module Reversed_lexicographical_order = struct
include Pair
include Base.Comparator.Make(Pair_reveresed_compare)
end
let empty_reveresed =
Set.empty (module Reversed_lexicographical_order)
(* the same, but with the anonymous comparator *)
let empty_reveresed' = Set.empty (module struct
include Pair
include Base.Comparator.Make(Pair_reveresed_compare)
end)
As before, we can compare different variants of reversed sets,
# Set.equal empty_reversed empty_reveresed';;
- : bool = true
But comparing sets with different orders is prohibited by the type checker,
# Set.equal empty empty_reveresed;;
Characters 16-31:
Set.equal empty empty_reveresed;;
^^^^^^^^^^^^^^^
Error: This expression has type
(Reversed_lexicographical_order.t,
Reversed_lexicographical_order.comparator_witness) Set.t
but an expression was expected of type
(Lexicographical_order.t, Lexicographical_order.comparator_witness) Set.t
Type
Reversed_lexicographical_order.comparator_witness =
Comparator.Make(Pair_reveresed_compare).comparator_witness
is not compatible with type
Lexicographical_order.comparator_witness =
Comparator.Make(Pair).comparator_witness
This is what comparator witnesses are for: they prevent very nasty errors. And yes, it requires a little bit more typing than in F#, but it is totally worthwhile, as it provides more typing from the type checker, which is now able to detect real problems.
A couple of final notes. The word "comparator" is an evolving concept in the Jane Street libraries, and previously it used to mean a different thing. The interfaces are also changing; for instance, the example that @glennsl provides is a little bit outdated and uses the Comparable.Make module instead of the new and more versatile Base.Comparator.Make.
Also, sometimes the compiler will not be able to see the equalities between comparators when types are abstracted; in that case, you will need to provide sharing constraints in your mli file. You can take the Bitvec_order library as an example. It showcases how comparators could be used to define various orders of the same data structure and how sharing constraints could be used. The library documentation also explains the terminology and gives its history.
And finally, if you're wondering how to enable the deriving preprocessors, then
for dune, add the (preprocess (pps ppx_jane)) stanza to your library/executable spec;
for ocamlbuild, add the -pkg ppx_jane option;
for the toplevel (e.g., ocaml or utop), use #require "ppx_jane";; (if require is not available, then do #use "topfind";; and then repeat).
There are examples in the documentation for Map showing exactly this.
If you use their PPXs you can just do:
module IntPair = struct
module T = struct
type t = int * int [@@deriving sexp_of, compare]
end
include T
include Comparable.Make(T)
end
otherwise the full implementation is:
module IntPair = struct
module T = struct
type t = int * int
let compare x y = Tuple2.compare ~cmp1:Int.compare ~cmp2:Int.compare x y
let sexp_of_t = Tuple2.sexp_of_t Int.sexp_of_t Int.sexp_of_t
end
include T
include Comparable.Make(T)
end
Then you can create an empty set using this module:
let int_pair_set = Set.empty (module IntPair)
Motivation
For the life of me, I cannot figure out how to use higher order functors in
SML/NJ to any practical end.
According to the
SML/NJ docs on the implementation's special features,
it should be possible to specify one functor as an argument to another by use of
the funsig keyword. Thus, given a signature
signature SIG = sig ... end
we should be able to specify a functor that will produce a module satisfying
SIG, when applied to a structure satisfying some signature SIG'. E.g.,
funsig Fn (S:SIG') = SIG
With Fn declared in this way, we should then be able to define another
functor that takes this functor as an argument. I.e., we can define a module
that is parameterized over another parameterized module, and presumably use the
latter within the former; thus:
functor Fn' (functor Fn:SIG) =
struct
...
structure S' = Fn (S:SIG')
...
end
It all looks good in theory, but I can't figure out how to actually make use of
this pattern.
Example Problems
Here are two instances where I've tried to use this pattern, only to find
it impracticable:
First attempt
For my first attempt, just playing around, I tried to make a functor that would
take a functor implementing an ordered set, and produce a module for dealing
with sets of integers (not really useful, but it would let you parameterize sets
of a given type over different set implementations). I can define the
following structures, and they will compile (using Standard ML of New Jersey
v110.7):
structure IntOrdKey : ORD_KEY
= struct
type ord_key = int
val compare = Int.compare
end
funsig SET_FN (KEY:ORD_KEY) = ORD_SET
functor IntSetFn (functor SetFn:SET_FN) =
struct
structure Set = SetFn (IntOrdKey)
end
But when I actually try to apply IntSetFn to a functor that should satisfy the
SET_FN funsig, it just doesn't parse:
- structure IntSet = IntSetFn (functor ListSetFn);
= ;
= ;;
stdIn:18.1-24.2 Error: syntax error: deleting RPAREN SEMICOLON SEMICOLON
- structure IntSet = IntSetFn (functor BinarySetFn) ;
= ;
= ;
stdIn:19.1-26.2 Error: syntax error: deleting RPAREN SEMICOLON SEMICOLON
Second attempt
My second attempt fails in two ways.
I have defined a structure of nested modules implementing polymorphic and
monomorphic stacks (the source file, for the curious). To
implement a monomorphic stack, you do
- structure IntStack = Collect.Stack.Mono (type elem = int);
structure IntStack : MONO_STACK?
- IntStack.push(1, IntStack.empty);
val it = - : IntStack.t
and so forth. It seems to work fine so far. Now, I want to define a module that
parameterizes over this functor. So I have defined a funsig for the
Collect.Stack.Mono functor (which can be seen in my repo). Then, following the
pattern indicated above, I tried to define the following test module:
(* load my little utility library *)
CM.autoload("../../../utils/sources.cm");
functor T (functor StackFn:MONO_STACK) =
struct
structure S = StackFn (type elem = int)
val x = S.push (1, S.empty)
end
But this won't compile! I get a type error:
Error: operator and operand don't agree [overload conflict]
operator domain: S.elem * S.t
operand: [int ty] * S.t
in expression:
S.push (1,S.empty)
uncaught exception Error
raised at: ../compiler/TopLevel/interact/evalloop.sml:66.19-66.27
../compiler/TopLevel/interact/evalloop.sml:44.55
../compiler/TopLevel/interact/evalloop.sml:292.17-292.20
Yet, inside functor T, I appear to be using the exact same instantiation
pattern that works perfectly at the top level. What am I missing?
Unfortunately, that's not the end of my mishaps. Now, I remove the line
causing the type error, leaving,
functor T (functor StackFn:MONO_STACK) =
struct
structure S = StackFn (type elem = int)
end
This compiles fine:
[scanning ../../../utils/sources.cm]
val it = true : bool
[autoloading]
[autoloading done]
functor T(<param>: sig functor StackFn : <fctsig> end) :
sig
structure S : <sig>
end
val it = () : unit
But I cannot actually instantiate the module! Apparently the path access syntax
is unsupported for higher order functors?
- structure Test = T (functor Collect.Stack.Mono);
stdIn:43.36-43.43 Error: syntax error: deleting DOT ID DOT
I am at a loss.
Questions
I have three related questions:
Is there a basic principle of higher-order functors in SML/NJ that I'm
missing, or is it just an incompletely, awkwardly implemented feature of the
language?
If the latter, where can I turn for more elegant and practicable higher order
functors? (Hopefully an SML, but I'll dip into OCaml if necessary.)
Is there perhaps a different approach I should be taking to achieve these kinds
of effects that avoids higher order functors altogether?
Many thanks in advance for any answers, hints, or followup questions!
Regarding your first attempt, the right syntax to apply your IntSetFn functor is:
structure IntSet = IntSetFn (functor SetFn = ListSetFn)
The same applies to your application of the Test functor in the second attempt:
structure Test = T (functor StackFn = Collect.Stack.Mono)
That should fix the syntax errors.
The type error you get when trying to use your stack structure S inside functor T has to do with the way you defined the MONO_STACK funsig:
funsig MONO_STACK (E:ELEM) = MONO_STACK
This just says that it returns a MONO_STACK structure, with a fully abstract elem type. It does not say that its elem type is gonna be the same as E.elem. According to that, I would be able to pass in a functor like
functor F (E : ELEM) = struct type elem = unit ... end
to your functor T. Hence, inside T, the type system is not allowed to assume that type S.elem = int, and consequently you get a type error.
To fix this, you need to refine the MONO_STACK funsig as follows:
funsig MONO_STACK (E:ELEM) = MONO_STACK where type elem = E.elem
That should eliminate the type error.
[Edit]
As for your questions:
Higher-order functors are a little awkward syntactically in SML/NJ because it tries to stay 100% compatible with plain SML, which separates the namespace of functors from that for structures. If that wasn't the case then there wouldn't be the need for funsigs as a separate namespace either (and other syntactic baroqueness), and the language of signatures could simply be extended to include functor types.
Moscow ML is another SML dialect with a higher-order module extension that resolves the compatibility issue somewhat more elegantly (and is more expressive). There was also the (now mostly dead) Alice ML, yet another SML dialect with higher-order functors, which simply dropped the awkward namespace separation. OCaml of course did not have this constraint in the first place, so its higher-order modules are also more regular syntactically.
The approach seems fine.
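For comparison, here is a rough OCaml rendering of the first attempt, as a sketch only. The names SET_FN and IntSetFn mirror the SML code above, the inner module is called S here, and the standard Int module assumes OCaml 4.08+. Note that the with type elt = K.t constraint plays the same role as the where type refinement suggested for the MONO_STACK funsig.

```ocaml
(* a functor type: takes an ordered key module, returns a set module *)
module type SET_FN =
  functor (K : Set.OrderedType) -> Set.S with type elt = K.t

(* a higher-order functor: parameterized over a set implementation *)
module IntSetFn (SetFn : SET_FN) = struct
  module S = SetFn (Int)
end

(* functors are passed like any other module argument *)
module IntSet = IntSetFn (Set.Make)

let has_two = IntSet.S.mem 2 (IntSet.S.of_list [1; 2; 3])
```

Because functor types are ordinary module types in OCaml, no separate funsig namespace or functor keyword is needed at the application site.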
I want to write a function that, given a non-negative integer n, returns the power set of {1,...,n}. So I want to use the Set.S module as found here. But I can't seem to import it. When I run the following code:
open Set.S
let rec power_set n =
if n = 0 then add empty empty else union (iter (add n s) power_set (n-1)) (power_set (n-1));;
let print_set s = SS.iter print_endline s;;
print_set (power_set 2)
I get the error:
File "countTopologies.ml", line 1, characters 5-10:
Error: Unbound module Set.S
Maybe I just don't have the Set.S module installed on my computer? (I've only done the bare bones needed to install OCaml). If this is the case, how would I get it?
The Set.S is a module type, not a module. You can open only modules. In fact, the module Set contains three elements:
the module type OrderedType that denotes the type of modules that implement ordered types;
the module type S that denotes the type of modules that implement Set data structures;
the functor Make that takes a module of type OrderedType and returns a module of type S.
To get a set module you need to create it using the Set.Make functor. The functor has one parameter - the module for the set elements. In modern OCaml (4.08+) you can create a set module for integers as easily as,
module Ints = Set.Make(Int)
and then you can use it like this,
let numbers = Ints.of_list [1;2;3]
let () = assert (Ints.mem 2 numbers)
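Coming back to the original goal, a power set is a set of sets, so one possible approach is to apply Set.Make twice. The following is a sketch under the assumption of OCaml 4.08+ (for the Int module); the name IntSets and the handling of negative arguments are choices made here, not part of the question.

```ocaml
module Ints = Set.Make (Int)

(* a set module already has a type t and a compare, so it can be an element *)
module IntSets = Set.Make (Ints)

(* power_set n returns all subsets of {1, ..., n} *)
let rec power_set n =
  if n < 0 then invalid_arg "power_set: negative argument"
  else if n = 0 then IntSets.singleton Ints.empty
  else
    let smaller = power_set (n - 1) in
    (* every subset of {1..n-1}, plus the same subsets with n added *)
    IntSets.union smaller (IntSets.map (Ints.add n) smaller)

let subsets_of_three = power_set 3
```

Each recursion step doubles the number of subsets, so power_set n contains 2^n sets.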
For older versions of OCaml, which don't provide the Int module, or for non-standard (custom) types, you need to define your own module that implements the OrderedType interface, e.g.,
module Int = struct
type t = int
(* use Pervasives compare *)
let compare = compare
end
module Ints = Set.Make(Int)
You can also use non-standard libraries, like Jane Street's Core library, which provide sets out of the box. The Core library has an Int module that already comes with sets, maps, and hashtables, so they can be accessed without any functors (older releases are opened with open Core.Std, as below; newer releases use plain open Core):
open Core.Std
let nil = Int.Set.empty
Or, in the modern (2018-2019) versions of the Jane Street Core or Base libraries, you can use polymorphic sets/maps, which require you to specify the module for keys only when a new set or map is created, e.g., like this
open Base (* or Core, or Core_kernel *)
let nil = Set.empty (module Int)
let one = Set.add nil 1
let two = Set.singleton (module Int) 2
You have to Make a set module from the Set functor.
module SI = Set.Make(struct type t = int let compare = compare end)
Then you can have a set of ints:
# let myset = SI.add 3 SI.empty;;
val myset : SI.t = <abstr>
# SI.elements myset;;
- : SI.elt list = [3]
I'm quite stuck with the following functor problem in OCaml. I paste some of the code just to let you understand. Basically
I defined these two modules in pctl.ml:
module type ProbPA = sig
include Hashtbl.HashedType
val next: t -> (t * float) list
val print: t -> float -> unit
end
module type M = sig
type s
val set_error: float -> unit
val check: s -> formula -> bool
val check_path: s -> path_formula -> float
val check_suite: s -> suite -> unit
end
and the following functor:
module Make(P: ProbPA): (M with type s = P.t) = struct
type s = P.t
(* implementation *)
end
Then to actually use these modules I defined a new module directly in a file called prism.ml:
type state = value array
type t = state
type value =
| VBOOL of bool
| VINT of int
| VFLOAT of float
| VUNSET
(* all the functions required *)
From a third source (formulas.ml) I used the functor with Prism module:
module PrismPctl = Pctl.Make(Prism)
open PrismPctl
And finally from main.ml
open Formulas.PrismPctl
(* code to prepare the object *)
PrismPctl.check_suite s.sys_state suite (* error here *)
and compiling gives the following error
Error: This expression has type Prism.state = Prism.value array
but an expression was expected of type Formulas.PrismPctl.s
From what I can understand, there is some sort of bad aliasing of the names: they are the same (since value array is the type defined as t, and M with type s = P.t is used in the functor), but the type checker doesn't consider them the same.
I really don't understand where the problem is. Can anyone help me?
Thanks in advance
(You post non-compilable code. That's a bad idea because it may make it harder for people to help you, and because reducing your problem down to a simple example is sometimes enough to solve it. But I think I see your difficulty anyway.)
Inside formulas.ml, OCaml can see that PrismPctl.s = Pctl.Make(Prism).s = Prism.t; the first equality is from the definition of PrismPctl, and the second equality is from the signature of Pctl.Make (specifically the with type s = P.t bit).
If you don't write an mli file for Formulas, your code should compile. So the problem must be that the .mli file you wrote doesn't mention the right equality. You don't show your .mli files (you should, they're part of the problem), but presumably you wrote
module PrismPctl : Pctl.M
That's not enough: when the compiler compiles main.ml, it won't know anything about PrismPctl that's not specified in formulas.mli. You need to specify either
module PrismPctl : Pctl.M with type s = Prism.t
or, assuming you included with type s = P.t in the signature of Make in pctl.mli
module PrismPctl : Pctl.M with type s = Pctl.Make(Prism).s
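The situation can be reproduced in a single file by letting a signature annotation play the role of formulas.mli. This is a reduced sketch, not the original code: ProbPA and M are cut down to one function each, and to_string and describe are hypothetical names.

```ocaml
module type ProbPA = sig
  type t
  val to_string : t -> string
end

module type M = sig
  type s
  val describe : s -> string
end

module Make (P : ProbPA) : M with type s = P.t = struct
  type s = P.t
  let describe x = "state " ^ P.to_string x
end

module Prism = struct
  type t = int array
  let to_string a =
    String.concat "," (Array.to_list (Array.map string_of_int a))
end

(* the signature stands in for formulas.mli; with only ": M" here,
   the call below would be rejected because s would be abstract *)
module Formulas : sig
  module PrismPctl : M with type s = Prism.t
end = struct
  module PrismPctl = Make (Prism)
end

let result = Formulas.PrismPctl.describe [| 1; 2 |]
```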
This is a problem I ran into as well when learning more about these. When you create the functor you expose the signature of the functor, in this case M. It contains an abstract type s, parameterized by the functor, and anything more specific is not exposed to the outside. Thus, accessing any record element of s (as in sys_state) will result in a type error, as you've encountered.
The rest looks alright. It is definitely hard to get into using functors properly, but remember that you can only manipulate instances of the type parameterized by the functor through the interface/signature being exposed by the functor.