In OCaml, what does aliasing a module do exactly?

In OCaml, to bring another module in scope you can use open. But what about code like this:
module A = struct
include B.C
module D = B.E
end
Does this create an entirely new module called A that has nothing to do with the modules created by B? Or are the types in B equivalent to those in this new structure, so that a type A.t can be used interchangeably with a type B.C.t, for example?
Especially, comparing to Rust I believe this is very different from writing something like
pub mod a {
pub use b::c::*;
pub use b::e as d;
}

Yes, module A = struct include B.C end creates an entirely new module and exports all definitions from B.C. All abstract types and data types that are imported from B.C are explicitly related to that module.
In other words, suppose you have
module Inner = struct
type imp = Foo
type t = int
end
then when we include Inner we can access its definitions,
module A = struct
include Inner
let x : imp = Foo
let y : t = 1
end
and the Foo constructor in A belongs to the same type as the Foo constructor in the Inner module so that the following typechecks,
A.x = Inner.Foo
In other words, include is not a mere copy-paste, but something like this,
module A = struct
(* include Inner expands to *)
type imp = Inner.imp = Foo
type t = Inner.t (* which is int *)
end
This operation of preserving type equalities is formally called strengthening and is always applied when OCaml infers a module type. In other words, the type system never forgets the type sharing constraints, and the only way to remove them is to explicitly specify a module type that doesn't expose the sharing constraints (or to use the module type of construct, see below).
For example, if we define a module type
module type S = sig
type imp = Foo
type t = int
end
then
module A = struct
include (Inner : S)
end
will generate a new type imp, so A.Foo = Inner.Foo will no longer type check. The same can be achieved with the module type of construct, which explicitly disables module type strengthening,
module A = struct
include (Inner : module type of Inner)
end
will again produce an A.Foo that is distinct from Inner.Foo. Note that type t will still be compatible in all of these implementations, as it is a manifest type: A.t is equal to Inner.t not via a sharing constraint but because both are equal to int.
Now, you might ask what the difference is between
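The behaviour described so far can be checked with a small self-contained sketch. The names Sealed, same_type and n are hypothetical and chosen only for this illustration; the line that is expected to be rejected is left in a comment so that the file still compiles.

```ocaml
module Inner = struct
  type imp = Foo
  type t = int
end

(* Plain include: the type equalities with Inner are preserved. *)
module A = struct
  include Inner
  let x : imp = Foo
end

(* Typechecks because A.imp and Inner.imp are the same type. *)
let same_type : bool = (A.x = Inner.Foo)

(* A signature that declares imp again without tying it to Inner. *)
module type S = sig
  type imp = Foo
  type t = int
end

(* Sealing with S forgets the equation on imp, so Sealed.imp is a new type. *)
module Sealed = struct
  include (Inner : S)
end

(* t is manifest (equal to int) on both sides, so it stays compatible. *)
let n : Inner.t = (42 : Sealed.t)

(* Expected to be rejected if uncommented, since Sealed.imp and
   Inner.imp are distinct types:
   let _ = (Sealed.Foo = Inner.Foo) *)
```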
module A = Inner
and
module A = struct include Inner end
The answer is simple. Semantically they are equivalent. Moreover, the former is not a module alias as you might think. Both are module definitions. And both will define a new module A with exactly the same module type.
A module alias is a feature that exists on the (module) type level, i.e., in the signatures, e.g.,
module Main : sig
module A = Inner (* now this is the module alias *)
end = struct
module A = Inner
end
So what the module alias says, on the module level, is that A not only has the same type as Inner but is exactly the Inner module. This opens a few opportunities to the compiler and toolchain. For example, the compiler may eschew module copying as well as enable module packing.
But all this has nothing to do with the observed semantics, and especially not with the typing. If we forget about the explicit equality (which, again, is used mostly for more optimal module packing, e.g., in dune), then the following definition of the module A
module Main = struct
module A = Inner
end
is exactly the same as the above that was using the module aliasing. Anything that was typed with the previous definition will be typed with the new definition (modulo module type aliases). It is as strong. And the following is as strong,
module Main = struct
module A = struct include Inner end
end
and even the following,
module Main : sig
module A : sig
type imp = Inner.imp = Foo
type t = Inner.t (* which is int *)
end
end = struct
module A = Inner
end
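As a rough check of the claim that these definitions are equally strong for typing purposes, the following sketch defines A in the three ways discussed and feeds constructors from each into a function written against Inner. The names Main1, Main2, Main3, is_foo and results are hypothetical.

```ocaml
module Inner = struct
  type imp = Foo
  type t = int
end

(* Three ways of defining A, following the text. *)
module Main1 = struct
  module A = Inner
end

module Main2 = struct
  module A = struct include Inner end
end

module Main3 : sig
  module A : sig
    type imp = Inner.imp = Foo
    type t = Inner.t
  end
end = struct
  module A = Inner
end

(* A function written against Inner only. *)
let is_foo (v : Inner.imp) : bool =
  match v with
  | Inner.Foo -> true

(* Each A.Foo has type Inner.imp, so all three applications typecheck. *)
let results : bool list =
  [ is_foo Main1.A.Foo; is_foo Main2.A.Foo; is_foo Main3.A.Foo ]
```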

Related

External and internal interfaces & information hiding in OCaml

When creating a library from multiple modules, I cannot find a nice way to do proper information hiding to the user of the library (external interface) while being able to access everything I need on the internal interface.
To be more specific, I have two modules (files a.ml[i] and b.ml[i]). In A, I define some type t whose internals I want to hide from the user (external interface).
module A : sig
type t
end
module A = struct
type t = float
end
In module B, I then want to use the secret type of A.t.
module B : sig
val create_a : float -> A.t
end
module B = struct
let create_a x = x
end
This of course does not compile, because the compilation unit of B does not know the type of A.t.
Solutions I know, but don't like:
Move the function create_a to module A
Copy the definition of A.t to B and cheat the type checker with some external cheat : 'a -> 'b = "%identity"
Is there some other way to know the type of A.t in B without leaking this information to the library's interface?
As always an extra layer of indirection can solve this problem. Define a module Lib that will specify an external interface, e.g.,
module Lib : sig
module A : sig
type t
(* public interface *)
end
module B : sig
type t
(* public interface *)
end
end = struct
module A = A
module B = B
end
If you don't want to repeat yourself and write module signatures twice, then you can define them once in a module sigs.ml:
module Sigs = struct
module type A = sig
type t
(* public interface *)
end
(* alternatively, you can move it into sigs_priv.ml *)
module type A_private = sig
include A
val create_a : float -> t
end
...
end
Finally, make sure that you're not installing the interfaces (the .cmi files) of the internal modules during your installation step, so that users can't bypass your abstraction. If you're using oasis, this is simple: make all your modules internal except the module Lib, i.e., list them in the InternalModules field.
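Here is a minimal single-file sketch of the extra layer of indirection, with nested modules standing in for a.ml, b.ml and lib.ml. The function to_string and the values v and s are hypothetical additions, included only so that the abstract type can be observed from outside.

```ocaml
(* Internal modules: not sealed, so B can see that A.t is float. *)
module A = struct
  type t = float
  let to_string (v : t) : string = string_of_float v
end

module B = struct
  let create_a (x : float) : A.t = x
end

(* External interface: A.t becomes abstract for users of Lib. *)
module Lib : sig
  module A : sig
    type t
    val to_string : t -> string
  end
  module B : sig
    val create_a : float -> A.t
  end
end = struct
  module A = A
  module B = B
end

(* Client code can only obtain a Lib.A.t through Lib.B.create_a. *)
let v : Lib.A.t = Lib.B.create_a 1.5
let s : string = Lib.A.to_string v
```

Through Lib, an expression such as (1.5 : Lib.A.t) should be rejected, because the signature of Lib no longer reveals that t is float.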

Non-abstract types redundancy in Signature/Functor pattern

With the Signature/Functor pattern, I refer to the style of Map.S / Map.Make in the OCaml standard library. This pattern is highly successful when you want to parameterize a large piece of code over some type without making it fully polymorphic. Basically, you introduce a parameterized module by providing a signature (usually called S) and a constructor (Make).
However, when you take a closer look, there is a lot of redundancy in the declaration:
First, both the signature and the functor have to be announced in the .mli file
Second, the signature has to be repeated completely in the .ml file (is there actually any legal way to differ from the .mli file here?)
Finally, the functor itself has to repeat all definitions again to actually implement the module type
Summa summarum, I get 3 definition sites for non-abstract types (e.g. when I want to allow pattern matching). This is completely ridiculous, and thus I assume there is some way around. So my question is two-fold:
Is there a way to repeat a module type from an .mli file in an .ml file, without having to write it manually? E.g. something like ppx_import for module signatures?
Is there a way to include a module type in a module inside an .ml file? E.g. when the module type has only one abstract type definition, define that type and just copy the non-abstract ones?
You can already use ppx_import for module signatures. You can even use it in a .ml to query the corresponding .mli.
If a module is composed only of module signatures, you can define the .mli alone, without any .ml. This way you can define a module, let's say Foo_sigs, containing the signature and use it everywhere else.
Repeating type and module type definitions can be avoided by moving them to an external .ml file. Let's see the following example:
module M : sig
(* m.mli *)
module type S = sig
type t
val x : t
end
module type Result = sig
type t
val xs : t list
end
module Make(A : S) : Result with type t = A.t
end = struct
(* m.ml *)
module type S = sig
type t
val x : t
end
module type Result = sig
type t
val xs : t list
end
module Make(A : S) = struct
type t = A.t
let xs = [A.x;A.x]
end
end
Instead of writing two files m.mli and m.ml, I used a module M with an explicit signature: this is equivalent to having the two files, and you can try it in the OCaml toplevel by copy-and-paste.
In M, things are duplicated in sig .. end and struct .. end. This is cumbersome if the module types become bigger.
You can share these dupes by moving them to another .ml file. For example, like the following n_intf.ml:
module N_intf = struct
(* n_intf.ml *)
module type S = sig
type t
val x : t
end
module type Result = sig
type t
val xs : t list
end
end
module N : sig
(* n.mli *)
open N_intf
module Make(A : S) : Result with type t = A.t
end = struct
(* n.ml *)
open N_intf
module Make(A : S) = struct
type t = A.t
let xs = [A.x;A.x]
end
end
You can also use *_intf.mli instead of *_intf.ml, but I recommend using *_intf.ml, since:
Module packing does not take .mli-only modules into account, so you have to copy *_intf.cmi at installation.
Code generation from type definitions, such as ppx_deriving, needs things defined in .ml. In this example, that is not the case since there is no type definition.
In that specific case, you can just skip the .mli part:
Your abstraction is specified by the .ml
Reading it makes it quite clear (as people know the pattern from the stdlib)
Everything that you'd put in the .mli is already in the .ml
If you work in a group that's requiring you to actually give a mli, just generate it automatically by using the ocamlc -i trick.
ocamlc -i m.ml >m.mli # automatically generate mli from ml
I know it doesn't exactly answer your question, but hey, it solves your problem.
I know that always putting a mli is considered to be best practice, but it's not mandatory, and that may be for some very good reasons.
As for your second question, I'm not sure I understood it well but I think this answers it:
module type ToCopy = sig type t val f : t -> unit end
module type Copy1 = sig include ToCopy with type t = int end
module type Copy2 = ToCopy with type t = int;;
Adding to camlspoter's answer and since the question mentions pattern matching, maybe you want to "re-export" the signatures and types with constructors declared in N_intf so they are accessible through N instead. In that case, you can replace the open's with include and module type of, i.e.:
module N_intf = struct
type t = One | Two
(* n_intf.ml *)
module type S = sig
type t
val x : t
end
module type Result = sig
type t
val xs : t list
end
end
module N : sig
(* n.mli *)
include module type of N_intf
module Make(A : S) : Result with type t = A.t
end = struct
(* n.ml *)
include N_intf
module Make(A : S) = struct
type t = A.t
let xs = [A.x;A.x]
end
end
Then you'll get the following signatures:
module N_intf :
sig
type t = One | Two
module type S = sig type t val x : t end
module type Result = sig type t val xs : t list end
end
module N :
sig
type t = One | Two
module type S = sig type t val x : t end
module type Result = sig type t val xs : t list end
module Make : functor (A : S) -> sig type t = A.t val xs : t list end
end
Now the constructors One and Two can be qualified by N instead of N_intf, so you can ignore N_intf in the rest of the program.
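To illustrate the last point, here is a sketch of client code that mentions only N. The definitions of N_intf and N are repeated in compact form so that the example is self-contained; to_int and Ints are hypothetical names.

```ocaml
module N_intf = struct
  type t = One | Two
  module type S = sig type t val x : t end
  module type Result = sig type t val xs : t list end
end

module N : sig
  include module type of N_intf
  module Make (A : S) : Result with type t = A.t
end = struct
  include N_intf
  module Make (A : S) = struct
    type t = A.t
    let xs = [ A.x; A.x ]
  end
end

(* Pattern matching on the constructors, qualified by N only. *)
let to_int (v : N.t) : int =
  match v with
  | N.One -> 1
  | N.Two -> 2

(* Applying the functor through N; the result type t is known to be int. *)
module Ints = N.Make (struct type t = int let x = 7 end)
```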

Multiple arguments in functor, OCaml

I have the following (fairly abstract) piece of OCaml code, in which the last line gives an error "Syntax error: ')' expected" which is extremely vague for me
module type AT =
sig
type t
end;;
module type BT =
sig
type t
type a
end;;
module A : AT =
struct
type t = int
end;;
module B : BT =
struct
type a
type t = a list
end;;
module type ABT =
sig
type t
module InsideA : AT
module InsideB : BT
end;;
module ABT_Functor (AArg:AT) (BArg:BT with type a = AArg.t) : ABT =
struct
module InsideA = AArg
module InsideB = BArg
type t = Sth of InsideA.t * InsideB.t
end;;
module ABTA = ABT_Functor (A);;
module ABTAB = ABT_Functor (A) (B:BT with type a = A.t);;
However, when I change the last line to
module ABTAB = ABT_Functor (A) (B);;
I get a signature mismatch error, saying
Modules do not match:
sig type t = B.t type a = B.a end
is not included in
sig type t type a = A.t end
Type declarations do not match:
type a = B.a
is not included in
type a = A.t
But I don't really understand that error.
So, I hope it's quite clear what I want to achieve - I'd like to provide structures A and B to ABT_Functor functor, to obtain a structure ABTAB. How should I do that?
The general issue is that there are no constraints between module types AT and BT from the start, but you explicitly require one for your functor, so you need to provide one explicitly when using it if you force the module arguments to their minimal signatures.
In your precise case, the module A and B have been coerced respectively to AT and BT, without any additional information, thus making their inner types abstract. When passing them to the functor, even with additional type constraints, you cannot recover the implicitly existing relationship between A and B as it has been erased.
module A: AT with type t = int = struct type t = int end
module B: BT with type a = int = struct type a = int type t = int list end
(* with these definitions, you may use A and B without further ado as arguments to your functor *)
module ABTAB = ABT_Functor(A)(B) (* it works *)
Note that if you had not constrained A and B to begin with, it would have worked straight away.
module A = struct type t = int end
module B = struct type a = int type t = int list end
(* no constraints above *)
module ABTAB = ABT_Functor(A)(B)
Hm, your original version does not type-check either (the last line does not even parse), and for the same reason.
The reason is simple: in your definition of B you are implementing a as follows:
type a
Such a definition produces a distinct abstract type, i.e. a new type that is different from any other type. As a definition (not to be confused with a specification of a type in a signature), such a type is useless for pretty much anything but phantom types, because you cannot actually create any values of this type.
Nor can you change the definition of the type after the fact. This:
module Bint : sig type a = int end = B
is already ill-typed. Type B.a was defined as different from int. It only equals itself.
The problem you see with your functor application follows from there.

How can an OCaml module export a field defined in a dependent module?

I have a decomposition where module A defines a structure type, and exports a field of this type which is defined as a value in module B:
a.ml:
type t = {
x : int
}
let b = B.a
b.ml:
open A (* to avoid fully qualifying fields of a *)
let a : t = {
x = 1;
}
Circular dependence is avoided, since B only depends on type declarations (not values) in A.
a.mli:
type t = {
x : int
}
val b : t
As far as I know, this should be kosher. But the compiler errors out with this:
File "a.ml", line 1, characters 0-1:
Error: The implementation a.ml does not match the interface a.cmi:
Values do not match: val b : A.t is not included in val b : t
This is all particularly obtuse, of course, because it is unclear which val b is interpreted as having type t and which has type A.t (and to which A--the interface definition or the module definition--this refers).
I'm assuming there is some arcane rule (along the lines of the "structure fields must be referenced by fully module-qualified name when the module is not opened" semantics which bite every OCaml neophyte at some point), but I am so far at a loss.
Modules under the microscope are more subtle than they appear
(If your eyes glaze over at some point, skip to the second section.)
Let's see what would happen if you put everything in the same file. This should be possible since separate compilation units do not increase the power of the type system. (Note: use separate directories for this and for any test with files a.* and b.*, otherwise the compiler will see the compilation units A and B, which may be confusing.)
module A = (struct
type t = { x : int }
let b = B.a
end : sig
type t = { x : int }
val b : t
end)
module B = (struct
let a : A.t = { A.x = 1 }
end : sig
val a : A.t
end)
Oh, well, this can't work. It's obvious that B is not defined here. We need to be more precise about the dependency chain: define the interface of A first, then the interface of B, then the implementations of B and A.
module type Asig = sig
type t = { x : int }
type u = int
val b : t
end
module B = (struct
let a : Asig.t = { Asig.x = 1 }
end : sig
val a : Asig.t
end)
module A = (struct
type t = { x : int }
let b = B.a
end : Asig)
Well, no.
File "d.ml", line 7, characters 12-18:
Error: Unbound type constructor Asig.t
You see, Asig is a signature. A signature is a specification of a module, and no more; there is no calculus of signatures in Ocaml. You cannot refer to fields of a signature. You can only refer to fields of a module. When you write A.t, this refers to the type field named t of the module A.
In Ocaml, it is fairly rare for this subtlety to arise. But you tried poking at a corner of the language, and this is what's lurking there.
So what's going on then when there are two compilation units? A closer model is to see A as a functor which takes a module B as an argument. The required signature for B is the one described in the interface file b.mli. Similarly, B is a function which takes a module A whose signature is given in a.mli as an argument. Oh, wait, it's a bit more involved: A appears in the signature of B, so the interface of B is really defining a functor that takes an A and produces a B, so to speak.
module type Asig = sig
type t = { x : int }
type u = int
val b : t
end
module type Bsig = functor(A : Asig) -> sig
val a : A.t
end
module B = (functor(A : Asig) -> (struct
let a : A.t = { A.x = 1 }
end) : Bsig)
module A = functor(B : Bsig) -> (struct
type t = { x : int }
let b = B.a
end : Asig)
And here, when defining A, we run into a problem: we don't have an A yet, to pass as an argument to B. (Barring recursive modules, of course, but here we're trying to see why we can't get by without them.)
Defining a generative type is a side effect
The fundamental sticking point is that type t = {x : int} is a generative type definition. If this fragment appears twice in a program, two different types are defined. (Ocaml takes steps and forbids you to define two types with the same name in the same module, except at the toplevel.)
In fact, as we've seen above, type t = {x : int} in a module implementation is a generative type definition. It means “define a new type, called t, which is a record type with the fields …”. That same syntax can appear in a module interface, but there it has a different meaning: there, it means “the module defines a type t which is a record type …”.
Since defining a generative type twice creates two distinct types, the particular generative type that is defined by A cannot be fully described by the specification of the module A (its signature). Hence any part of the program that uses this generative type is really using the implementation of A and not just its specification.
When you get down to it, defining a generative type is a form of side effect. This side effect happens at compile time or at program initialization time (the distinction between these two only appears when you start looking at functors, which I shall not do here). So it is important to keep track of when this side effect happens: it happens when the module A is defined (compiled or loaded).
So, to express this more concretely: the type definition type t = {x : int} in the module A is compiled into “let t be type #1729, a fresh type which is a record type with a field …”. (A fresh type means one that is different from any type that has ever been defined before.). The definition of B defines a to have the type #1729.
Since the module B depends on the module A, A must be loaded before B. But the implementation of A clearly uses the implementation of B. The two are mutually recursive. Ocaml's error message is a little confusing, but you are indeed outstepping the bounds of the language.
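The generativity of record definitions can be observed in a small sketch. The modules M1, M2 and M3 and the values a and b are hypothetical; the rejected line is kept in a comment so that the file compiles.

```ocaml
(* Two textually identical record definitions create two distinct types. *)
module M1 = struct
  type t = { x : int }
end

module M2 = struct
  type t = { x : int }
end

(* M3 re-exports M1.t with an explicit equation, so no new type is created. *)
module M3 = struct
  type t = M1.t = { x : int }
end

let a : M1.t = { M1.x = 1 }

(* Accepted: M3.t and M1.t are the same type. *)
let b : M3.t = a

(* Expected to be rejected if uncommented, since M2.t is a
   different type from M1.t:
   let c : M2.t = a *)
```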
(and to which A--the interface definition or the module definition--this refers).
A refers to the whole module A. With the normal build procedure it would refer to the implementation in a.ml constrained by the signature in a.mli. But if you are playing tricks moving cmi's around and such - you are on your own :)
As far as I know, this should be kosher.
I personally qualify this issue as circular dependency and would stay strongly against structuring the code in such a way. IMHO it causes more problems and head-scratching, than solving real issues. E.g. moving shared type definitions to type.ml and be done with it is what comes first to mind. What is your original problem that leads to such structuring?
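One possible shape of the suggested restructuring is sketched below, with nested modules standing in for separate files. The module name Types is an assumption made for this illustration; B depends only on Types, and A depends on Types and B, so the dependency cycle disappears.

```ocaml
(* Stands in for the shared type definitions file. *)
module Types = struct
  type t = { x : int }
end

(* Stands in for b.ml: depends only on Types. *)
module B = struct
  open Types
  let a : t = { x = 1 }
end

(* Stands in for a.ml and a.mli: re-exports the record type
   with an equation, so A.t and Types.t are the same type. *)
module A : sig
  type t = Types.t = { x : int }
  val b : t
end = struct
  type t = Types.t = { x : int }
  let b = B.a
end
```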

Difference between module <name> = struct .. end and module type <name> = struct.. end?

module <name> =
struct
..
end;;
module type <name> =
struct (* should have been sig *)
..
end;;
The first declares a module and the second declares a module type (aka a signature). A module type contains type and val declarations, whereas a module can contain definitions (e.g., let bindings). You can use a signature to restrict the type of a module, much as you might for a function. For example,
module type T = sig
val f : int -> int
end
module M : T = struct
let f x = x + 1
let g x = 2 * x
end
Now, we have
# M.f 0 ;;
- : int = 1
# M.g 0 ;;
Error: Unbound value M.g
M.g is unbound because it's hidden by the signature T.
Another common way to use module types is as arguments to and return values of functors. For example, the Map.Make functor in the standard library takes a module with signature Map.OrderedType and creates a module with signature Map.S.
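As a short illustration of the functor case, the sketch below applies Map.Make to a hand-written key module. IntKey, IntMap, m and keys are hypothetical names; Map.Make, Map.OrderedType and the map operations used come from the standard library.

```ocaml
(* A module matching Map.OrderedType: a type t and a total order on it. *)
module IntKey = struct
  type t = int
  let compare (a : int) (b : int) : int = compare a b
end

(* The result has signature Map.S with type key = int. *)
module IntMap = Map.Make (IntKey)

let m : string IntMap.t =
  IntMap.empty
  |> IntMap.add 2 "two"
  |> IntMap.add 1 "one"

(* bindings returns the entries in increasing key order. *)
let keys : int list = List.map fst (IntMap.bindings m)
```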
P.S. Note that there's a mistake in the question. A module type is declared using
module type <name> = sig
...
end
A structure (written struct … end) is a bunch of definitions. Any object in the language can be defined in a module: core values (let x = 2 + 2), types (type t = int), modules (module Empty = struct end), signatures (module type EMPTY = sig end), etc. Modules are a generalization of structures: a structure is a module, and so is a functor (think of it as a function that takes a module as argument and returns a new module). Modules are like core values, but live one level above: a module can contain anything, whereas a core value can only contain other core values¹.
A signature (written sig … end) is a bunch of specifications (some languages use the term declaration). Any object in the language can be specified in a signature: core values (val x : int), types (type t = int), modules (module Empty : sig end), signatures (module type EMPTY = sig end), etc. Module types generalize signatures: a module type specifies a module, and a module type that happens to specify a structure is called a signature. Module types are to modules what ordinary types are to core values.
Compilation units (.ml files) are structures. Interfaces (.mli files) are signatures.
So module Foo = struct … end defines a module called Foo, which happens to be a structure. This is analogous to let foo = (1, "a") which defines a value called foo which happens to be a pair. And module type FOO = sig … end (note: sig, not struct) defines a module type called FOO, which happens to be a signature. This is analogous to type foo = int * string which defines a type called foo which happens to be a product type.
¹ This is in fact no longer true since OCaml 3.12 introduced first-class modules, but it's close enough for an introductory presentation.
A module type describes a module. The difference is the same as the one between .ml and .mli files.