SML/NJ: How to use HashTable? - structure

I really want to create a HashTable in SML, it seems there already is a structure for this in SML/NJ.
The question is, how do I use it? I haven't fully understood how to use structures in SML, and some of the very basic examples in the book I'm reading give me errors I don't even know how to correct, so using the HashTable structure might be easy, but I wouldn't know. If someone could explain this too, that would be wonderful!
I'm thinking it's something like this:
val ht : string * int HashTable.hash_table = HashTable.mkTable();
???

The signature of the mkTable value is:
val mkTable : (('a -> word) * (('a * 'a) -> bool)) -> (int * exn)
      -> ('a,'b) hash_table
  (* Given a hashing function and an equality predicate, create a new table;
   * the int is a size hint and the exception is to be raised by find.
   *)
Therefore, you would have to do something like:
val ht : (string, int) HashTable.hash_table =
HashTable.mkTable (HashString.hashString, op=) (42, Fail "not found")

I assume the idea is to create a table mapping strings to integers. Then you want to write its type as (string, int) hash_table (the type hash_table is a type with two parameters, which are written like that in ML).
But you also need a hash function hash : string -> word and an equality function eq : string * string -> bool over strings to provide to mkTable. For the latter, you can simply use op=; for the former, you can use hashString from the HashString module.
So,
val ht : (string, int) HashTable.hash_table = HashTable.mkTable(HashString.hashString, op=)(17, Domain)
should work.
I should note, however, that hash tables tend to be vastly overused, and more often than not they are the wrong data structure. This is especially true in functional programming, since they are a stateful data structure. Usually you are better off (and potentially even more efficient) using some tree-based map, e.g., the RedBlackMapFn from the SML/NJ library.
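To make the contrast between a stateful hash table and a persistent tree-based map concrete, here is a minimal sketch in OCaml (a close relative of SML), not in SML/NJ itself. It assumes only OCaml's standard Hashtbl and Map modules; the names table, String_map, m0 and m1 are made up for illustration.

```ocaml
module String_map = Map.Make (String)

(* Hash table: [replace] mutates the one shared table in place. *)
let table : (string, int) Hashtbl.t = Hashtbl.create 17
let () = Hashtbl.replace table "apples" 3

(* Tree-based map: [add] returns a new map; the old one is unchanged. *)
let m0 : int String_map.t = String_map.empty
let m1 = String_map.add "apples" 3 m0

let () =
  Printf.printf "table: %d, m0 empty: %b, m1: %d\n"
    (Hashtbl.find table "apples")
    (String_map.is_empty m0)
    (String_map.find "apples" m1)
```

Because m0 is still empty after the add, older versions of the map stay valid, which is what makes the tree-based structure fit a functional style.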

Related

Partial application of Printf.ksprintf

I'm trying to write a version of Printf.printf that always appends a newline character after writing its formatted output. My first attempt was
# let say fmt = Printf.ksprintf print_endline fmt;;
val say : ('a, unit, string, unit) format4 -> 'a = <fun>
The type signature looks right and say works as expected. I noticed that fmt is listed twice, and thought that partial application could eliminate it. So I tried this instead:
# let say = Printf.ksprintf print_endline;;
val say : ('_weak1, unit, string, unit) format4 -> '_weak1 = <fun>
The function definition looks cleaner, but the type signature looks wrong and say no longer works as expected. For example, say doesn't type check if the format string needs a variable number of arguments: I get an error that say "is applied to too many arguments".
I can use the let say fmt = … implementation, but why doesn't partial application work?
OCaml's type checker does not generalize the result of a partial application. That is, when you partially apply a function, the resulting function is no longer fully polymorphic; its type variables become weak. That's why you see '_weak1 in the second type signature.
When you include the fmt argument, you help the type-checker recognize that polymorphism is still present.
This process is called "eta conversion." Removing your fmt argument is "eta reduction" and adding it back in is called "eta expansion." You may encounter that terminology when working with other functional programming languages.
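A small sketch of eta expansion in practice, assuming only the standard Printf module. The name render is hypothetical; it returns the formatted line instead of printing it so that the result is easy to inspect.

```ocaml
(* Eta-expanded: [fmt] is an explicit parameter, so the right-hand side is a
   function (a syntactic value) and its type is fully generalized. *)
let render fmt = Printf.ksprintf (fun s -> s ^ "\n") fmt

(* The same definition can now be used with formats of different arity. *)
let one_arg = render "%d items" 3
let two_args = render "%s=%d" "x" 1

let () = print_string one_arg; print_string two_args
```

Dropping the fmt parameter from render would make its type weak, and the second use would then be rejected once the first use had fixed the format type.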
This is the value restriction at work: https://ocaml.org/manual/polymorphism.html#s:weak-polymorphism . In brief, only syntactic values can be safely generalized in a let-binding, because the language has mutable state.
In particular,
let f = fun x -> g y x
is a syntactic value that can be generalized, whereas
let f = g y
is a computation that cannot (always) be generalized.
An example illustrates the issue quite well. Consider:
let fake_pair x =
  let store = ref None in
  fun y ->
    match !store with
    | None ->
      store := Some y;
      x, y
    | Some s ->
      x, s
then the type of fake_pair is 'a -> 'b -> 'a * 'b.
However, once partially applied
let p = fake_pair 0
we have created the mutable store, and it is important that all subsequent calls to p share the same type (because they must match the stored value). Thus the type of p is '_weak1 -> int * '_weak1, where '_weak1 is a weak type variable, i.e., a temporary placeholder for a concrete type.
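To see why the shared store forces a single type, here is a short sketch that runs fake_pair from the answer above; the names first and second are only for illustration. If p were fully polymorphic, a later call such as p "a" would have to return the stored integer as a string.

```ocaml
let fake_pair x =
  let store = ref None in
  fun y ->
    match !store with
    | None ->
      store := Some y;
      (x, y)
    | Some s -> (x, s)

(* The partial application creates one store shared by every call to [p]. *)
let p = fake_pair 0

let first = p 1   (* store is empty: remembers 1 and returns (0, 1) *)
let second = p 2  (* store already holds 1: returns (0, 1), not (0, 2) *)
```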

In OCaml using Base, how do you construct a set with elements of type `int * int`?

In F#, I'd simply do:
> let x = Set.empty;;
val x : Set<'a> when 'a : comparison
> Set.add (2,3) x;;
val it : Set<int * int> = set [(2, 3)]
I understand that in OCaml, when using Base, I have to supply a module with comparison functions, e.g., if my element type was string
let x = Set.empty (module String);;
val x : (string, String.comparator_witness) Set.t = <abstr>
Set.add x "foo";;
- : (string, String.comparator_witness) Set.t = <abstr>
But I don't know how to construct a module that has comparison functions for the type int * int. How do I construct/obtain such a module?
To create an ordered data structure, like Map, Set, etc., you have to provide a comparator. In Base, a comparator is a first-class module (a module packed into a value) that provides a comparison function and a type index that witnesses this function. Wait, what? More on that later; let us first define a comparator. If you already have a module that has the type
module type Comparator_parameter = sig
  type t (* the carrier type *)
  (* the comparison function *)
  val compare : t -> t -> int
  (* for introspection and debugging, use `sexp_of_opaque` if not needed *)
  val sexp_of_t : t -> Sexp.t
end
then you can just pass it to the Base.Comparator.Make functor and build the comparator
module Lexicographical_order = struct
  include Pair
  include Base.Comparator.Make(Pair)
end
where the Pair module provides the compare function,
module Pair = struct
  type t = int * int [@@deriving compare, sexp_of]
end
Now, we can use the comparator to create ordered structures, e.g.,
let empty = Set.empty (module Lexicographical_order)
If you do not want to create a separate module for the order (for example, because you can't come up with a good name for it), then you can use anonymous modules, like this
let empty' = Set.empty (module struct
    include Pair
    include Base.Comparator.Make(Pair)
  end)
Note that the Pair module passed to the Base.Comparator.Make functor has to be bound at the global scope; otherwise, the type checker will complain. This is all about the witness value. So what is this witness, and what does it witness?
The semantics of any ordered data structure, like Map or Set, depends on the order function. It is an error to compare two sets that were built with different orders; e.g., if you have two sets built from the same numbers, but one with the ascending order and the other with the descending order, they will be treated as different sets.
Ideally, such errors should be prevented by the type checker. For that, we need to encode the order used to build the set in the set's type. This is what Base does; let's look at the type of empty',
val empty' : (int * int, Comparator.Make(Pair).comparator_witness) Set.t
and the empty type
val empty : (Lexicographical_order.t, Lexicographical_order.comparator_witness) Set.t
Surprisingly, the compiler is able to see through the name differences (because modules have structural typing) and understand that Lexicographical_order.comparator_witness and Comparator.Make(Pair).comparator_witness are witnessing the same order, so we can even compare empty and empty',
# Set.equal empty empty';;
- : bool = true
To solidify our knowledge, let's build a set of pairs in the reversed order,
module Reversed_lexicographical_order = struct
  include Pair
  include Base.Comparator.Make(Pair_reveresed_compare)
end
let empty_reveresed =
  Set.empty (module Reversed_lexicographical_order)
(* the same, but with the anonymous comparator *)
let empty_reveresed' = Set.empty (module struct
    include Pair
    include Base.Comparator.Make(Pair_reveresed_compare)
  end)
As before, we can compare different variants of reversed sets,
# Set.equal empty_reveresed empty_reveresed';;
- : bool = true
But comparing sets with different orders is prohibited by the type checker,
# Set.equal empty empty_reveresed;;
Characters 16-31:
Set.equal empty empty_reveresed;;
^^^^^^^^^^^^^^^
Error: This expression has type
(Reversed_lexicographical_order.t,
Reversed_lexicographical_order.comparator_witness) Set.t
but an expression was expected of type
(Lexicographical_order.t, Lexicographical_order.comparator_witness) Set.t
Type
Reversed_lexicographical_order.comparator_witness =
Comparator.Make(Pair_reveresed_compare).comparator_witness
is not compatible with type
Lexicographical_order.comparator_witness =
Comparator.Make(Pair).comparator_witness
This is what comparator witnesses are for: they prevent very nasty errors. And yes, it requires a little more typing than in F#, but it is totally worthwhile, as it gets you more typing from the type checker, which is now able to detect real problems.
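The same idea can be sketched with OCaml's standard library alone, where Set.Make plays the role of the witness: each functor application yields its own abstract set type. This is an analogy rather than Base's mechanism, and Pair_reversed is a hypothetical stand-in for the Pair_reveresed_compare module, which the text uses but never defines.

```ocaml
module Pair = struct
  type t = int * int

  (* Lexicographic order on pairs. *)
  let compare (a1, b1) (a2, b2) =
    match Int.compare a1 a2 with
    | 0 -> Int.compare b1 b2
    | c -> c
end

(* Hypothetical reversed order: the same comparison with arguments flipped. *)
module Pair_reversed = struct
  type t = int * int
  let compare x y = Pair.compare y x
end

module Asc = Set.Make (Pair)
module Desc = Set.Make (Pair_reversed)

let points = [ (2, 3); (1, 9); (2, 1) ]
let asc = Asc.elements (Asc.of_list points)
let desc = Desc.elements (Desc.of_list points)

(* [Asc.equal (Asc.of_list points) (Desc.of_list points)] does not type-check:
   [Asc.t] and [Desc.t] are distinct abstract types. *)
```

Here asc lists the pairs in ascending order and desc lists the same pairs in descending order, so mixing the two set types would be a genuine bug, and the type checker rules it out.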
A couple of final notes. The word "comparator" is an evolving concept in the Jane Street libraries, and it used to mean a different thing. The interfaces are also changing; for instance, the example that @glennsl provides is a little outdated and uses the Comparable.Make module instead of the newer and more versatile Base.Comparator.Make.
Also, sometimes the compiler will not be able to see the equalities between comparators when types are abstracted; in that case, you will need to provide sharing constraints in your mli file. You can take the Bitvec_order library as an example. It showcases how comparators can be used to define various orders of the same data structure and how sharing constraints can be used. The library documentation also explains the terminology and gives its history.
And finally, if you're wondering how to enable the deriving preprocessors:
for dune, add the (preprocess (pps ppx_jane)) stanza to your library/executable spec;
for ocamlbuild, add the -pkg ppx_jane option;
for the toplevel (e.g., ocaml or utop), use #require "ppx_jane";; (if require is not available, then do #use "topfind";;, and then repeat).
There are examples in the documentation for Map showing exactly this.
If you use their PPXs you can just do:
module IntPair = struct
  module T = struct
    type t = int * int [@@deriving sexp_of, compare]
  end
  include T
  include Comparable.Make(T)
end
otherwise the full implementation is:
module IntPair = struct
  module T = struct
    type t = int * int
    (* lexicographic order: compare the first components, then the second *)
    let compare (a1, b1) (a2, b2) =
      match Int.compare a1 a2 with
      | 0 -> Int.compare b1 b2
      | c -> c
    let sexp_of_t = Tuple2.sexp_of_t Int.sexp_of_t Int.sexp_of_t
  end
  include T
  include Comparable.Make(T)
end
Then you can create an empty set using this module:
let int_pair_set = Set.empty (module IntPair)

List.sum in Core, don't understand containers

I'm trying to understand List.sum from Jane Street's Core. I got it to work on a simple list of integers, but I don't understand the concepts behind Core's containers, and I find the API documentation too terse to understand. Here's some code that works:
#require "core";;
open Core;;
List.sum (module Int) [1;2;3] ~f:ident;;
- : int = 6
#show List.sum;;
val sum :
(module Base__.Container_intf.Summable with type t = 'sum) ->
'a list -> f:('a -> 'sum) -> 'sum
Why do I have to use module Int and the identity function? [1;2;3] already has the type int list. Is there any good information about the design ideas behind Core?
The module provides the means of summing the values in question. The f provides a transformation function from the type of elements in the list to the type of elements you want to sum.
If all you want to do is sum the integers in a list, then the summation function desired is in the Int module (thus we need module Int) and the transformation function is just ident (because we needn't transform the values at all).
However, what if you wanted to obtain a sum of integers, but starting with a list of strings representing integers? Then we would have
utop # List.sum (module Int) ["1";"2";"3";"4"];;
- : f:(string -> int) -> int = <fun>
i.e., if we want to sum using the module Int over a list of strings, then we'll first need a function that will convert each value of type string to a value of type int. Thus:
utop # List.sum (module Int) ["1";"2";"3";"4"] ~f:Int.of_string;;
- : int = 10
This is pretty verbose, but it gives us a lot of flexibility! Imagine trying to sum using a different commutative operation, perhaps over a particular field in a record.
However, this is not the idiomatic way to sum a list of integers in OCaml. List.sum is a specific function which the List module "inherits" by virtue of satisfying a container interface used in the library design of Base (which provides the basic functionality of Core). The reason this function is relatively complex to use is that it is the result of a highly generalized design over algebraic structures (in this case, over collections of elements which can be transformed into elements which have a commutative operation defined over them).
For mundane integer summation, OCamlers just use a simple fold:
utop # List.fold [1;2;3;4] ~init:0 ~f:(+);;
- : int = 10
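To make the role of the module argument concrete, here is a rough sketch of how a function shaped like List.sum could be written with the standard library only. This is not Base's actual implementation: the Summable signature below is a simplified assumption, and sum, Int_sum and Float_sum are hypothetical names.

```ocaml
(* Simplified assumption: a summable type has a zero and an addition. *)
module type Summable = sig
  type t
  val zero : t
  val ( + ) : t -> t -> t
end

(* The module says how to add; [f] maps each element to the summed type. *)
let sum (type a) (module M : Summable with type t = a) xs ~f =
  List.fold_left (fun acc x -> M.(acc + f x)) M.zero xs

module Int_sum = struct
  type t = int
  let zero = 0
  let ( + ) = ( + )
end

module Float_sum = struct
  type t = float
  let zero = 0.0
  let ( + ) = ( +. )
end

let total = sum (module Int_sum) [ "1"; "2"; "3"; "4" ] ~f:int_of_string

let total_length =
  sum (module Float_sum) [ "ab"; "cde" ]
    ~f:(fun s -> float_of_int (String.length s))
```

The list's element type never has to match the summed type, which is why both the module and f are needed even in the simple integer case.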
One good place to look for some insight into the design decisions behind Core is https://dev.realworldocaml.org/ . Another good resource is the Jane Street tech blog. You might also consult the Base repo (https://github.com/janestreet/base) or post a question asking for more specific details on the design philosophy in https://discuss.ocaml.org/
Jane Street's libraries have been notoriously opaque to newcomers, but they are getting a lot better, and the community will be happy to help you learn.
Though the documentation is terse, it is very expressive. In particular, it tends to rely on the types to carry much of the weight, which means the code is largely self-documenting. It takes some practice to learn to read the types well, but this is well worth the effort, imo, and carries its own rewards!

In Elm, how to use comparable type in a tagged unions types?

I can define a tagged union type like this:
type Msg
    = Sort (Product -> Float)
But I cannot define it like this:
type Msg
    = Sort (Product -> comparable)
The error says:
Type Msg must declare its use of type variable comparable...
But comparable is a pre-defined type variable, right?
How do I fix this?
This question feels a little like an XY Problem. I'd like to offer a different way of thinking about passing sorting functions around in your message (with the caveat that I'm not familiar with your codebase, only the examples you've given in your question).
Adding a type parameter to Msg does seem a bit messy, so let's take a step back. Sorting involves comparing two values of the same type in a certain way and returning whether the first value is less than, equal to, or greater than the second. Elm already has an Order type used for comparing things, which has the constructors LT, EQ, and GT (for Less Than, EQual, and Greater Than).
Let's refactor your Msg into the following:
type Msg
= Sort (Product -> Product -> Order)
Now we don't have to add a type parameter to Msg. But how, then, do we specify which field of Product to sort by? We can use currying for that. Here's how:
Let's define another function called comparing, which takes a function as its first argument and two other arguments of the same type, and returns an Order value:
comparing : (a -> comparable) -> a -> a -> Order
comparing f x y =
    compare (f x) (f y)
Notice the first argument is a function that looks similar to what your example was trying to attempt in the (Product -> comparable) argument of the Sort constructor. That's no coincidence. Now, by using currying, we can partially apply the comparing function with a record field getter, like .name or .price. To amend your example, the onClick handler could look like this:
onClick (Sort (comparing .name))
If you go this route, there will be more refactoring. Now that you have this comparison function, how do you use it in your update function? Let's assume your Model has a field called products which is of type List Product. In that case, we can just use the List.sortWith function to sort our list. Your update case for the Sort Msg would look something like this:
case msg of
    Sort comparer ->
        { model | products = List.sortWith comparer model.products } ! []
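For readers who know OCaml better than Elm, the same design can be sketched in OCaml. Everything here is illustrative: the product record and its fields are assumed, OCaml comparison functions return an int rather than an Order, and update is reduced to the sorting step.

```ocaml
(* Assumed record; the question never shows the real Product type. *)
type product = { name : string; price : float }

(* Counterpart of the Elm [comparing]: turn a key extractor into a comparison. *)
let comparing f x y = compare (f x) (f y)

(* The message carries a comparison function, so it needs no type parameter. *)
type msg = Sort of (product -> product -> int)

let update msg products =
  match msg with
  | Sort comparer -> List.sort comparer products

let products =
  [ { name = "pear"; price = 2.5 }; { name = "apple"; price = 4.0 } ]

let by_name = update (Sort (comparing (fun p -> p.name))) products
let by_price = update (Sort (comparing (fun p -> p.price))) products
```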
A few closing thoughts and other notes:
This business about a comparing function comes straight from Haskell where it fulfills the same need.
Rather than defining the Sort constructor as above, I would probably abstract it out a little more since it is such a common idiom. You could define an alias for a generalized function like this, then redefine Msg as shown here:
type alias Comparer a =
    a -> a -> Order
type Msg
    = Sort (Comparer Product)
And to take it one step further just to illustrate how this all connects, the following two type annotations for comparing are identical:
-- this is the original example from up above
comparing : (a -> comparable) -> a -> a -> Order
-- this example substitutes the `Comparer a` alias, which may help further
-- your understanding of how it all ties together
comparing : (a -> comparable) -> Comparer a
The error you're getting says that comparable is an unbound type variable. You need to either fully specify it on the right-hand side (e.g. Product -> Int) or declare it as a type parameter on the left-hand side. Something like this:
type Msg a = Sort (Product -> a)
The question you ask about comparable is answered here: What does comparable mean in Elm?

Is there a nice way to use `->` directly as a function in Idris?

One can return a type in a function in Idris, for example
t : Type -> Type -> Type
t a b = a -> b
But the situation came up (when experimenting with writing some parsers) that I wanted to use -> to fold a list of types, ie
typeFold : List Type -> Type
typeFold = foldr1 (->)
So that typeFold [String, Int] would give String -> Int : Type. This doesn't compile though:
error: no implicit arguments allowed
here, expected: ")",
dependent type signature,
expression, name
typeFold = foldr1 (->)
^
But this works fine:
t : Type -> Type -> Type
t a b = a -> b
typeFold : List Type -> Type
typeFold = foldr1 t
Is there a better way to work with ->, and if not is it worth raising as a feature request?
The problem with using -> in this way is that it's not a type constructor but a binder, where the name bound for the domain is in scope in the range, so -> itself doesn't have a type directly. Your definition of t for example wouldn't capture a dependent type like (x : Nat) -> P x.
While it is a bit fiddly, what you're doing is the right way to do this. I'm not convinced we should make special syntax for (->) as a type constructor - partly because it really isn't one, and partly because it feels like it would lead to more confusion when it doesn't work with dependent types.
The Data.Morphisms module provides something like this, except you have to do all the wrapping/unwrapping around the Morphism "newtype".