I need to persist an abstract syntax tree represented using F# discriminated unions to a human readable compact format, such as the format already used in the F# language to construct discriminated unions, that I can read back in to discriminated union instances later. I'm a bit surprised the F# library doesn't support this as it surely must be done in an efficient way already in the compiler and F# interactive.
Are there any free/open source implementations out there for doing this in a reasonable (but not necessarily extremely) efficient manner?
Note: I do not want an XML based serialization.
EDIT: Neither of the answers below really meet your criteria, but I'm posting in case others looking for union-serialization find them useful. I don't know offhand of any library way to re-parse the output of sprintf "%A" on a union - keep in mind that the compiler and FSI have a much different task, knowing scopes and dealing with namespaces and qualified names and shadowing and whatnot, and even ignoring that, parsing the data carried by the unions (ints, strings, arbitrary objects, etc.) is potentially a whole task in itself.
Here is one strategy for union serialization, as part of a small sample program. (KnownTypeAttribute can take a method name, and you can use some reflection to get the types.) This is a very easy way to add a tiny bit of code to a union to get serialization.
open Microsoft.FSharp.Reflection
open System.Reflection
open System.Runtime.Serialization
open System.Xml
[<KnownType("KnownTypes")>]
type Union21WithKnownTypes =
    | Case1 of int * int
    | Case2 of string
    static member KnownTypes() =
        typeof<Union21WithKnownTypes>.GetNestedTypes(
            BindingFlags.Public
            ||| BindingFlags.NonPublic) |> Array.filter FSharpType.IsUnion
let dcs = new DataContractSerializer(typeof<Union21WithKnownTypes[]>)
let arr = [| Case1(1,1); Case2("2") |]
printfn "orig data: %A" arr
let sb = new System.Text.StringBuilder()
let xw = XmlWriter.Create(sb)
dcs.WriteObject(xw, arr)
xw.Close()
let s = sb.ToString()
printfn ""
printfn "encoded as: %s" s
printfn ""
let xr = XmlReader.Create(new System.IO.StringReader(s))
let o = dcs.ReadObject(xr)
printfn "final data: %A" o
Here's the JSON version:
open Microsoft.FSharp.Reflection
open System.Reflection
open System.Runtime.Serialization
open System.Runtime.Serialization.Json
open System.Xml
[<KnownType("KnownTypes")>]
type Union21WithKnownTypes =
    | Case1 of int * int
    | Case2 of string
    static member KnownTypes() =
        typeof<Union21WithKnownTypes>.GetNestedTypes(
            BindingFlags.Public
            ||| BindingFlags.NonPublic) |> Array.filter FSharpType.IsUnion
let dcs = new DataContractJsonSerializer(typeof<Union21WithKnownTypes[]>)
let arr = [| Case1(1,1); Case2("2") |]
printfn "orig data: %A" arr
let stream = new System.IO.MemoryStream()
dcs.WriteObject(stream, arr)
stream.Seek(0L, System.IO.SeekOrigin.Begin) |> ignore
let bytes = Array.create (int stream.Length) 0uy
stream.Read(bytes, 0, int stream.Length) |> ignore
let s = System.Text.Encoding.ASCII.GetString(bytes)
printfn ""
printfn "encoded as: %s" s
printfn ""
stream.Seek(0L, System.IO.SeekOrigin.Begin) |> ignore
let o = dcs.ReadObject(stream)
printfn "final data: %A" o
There are many examples of how to read from and write to files, but many posts seem out of date, are too complicated, or are not 'safe' (1, 2) (they throw/raise exceptions). Coming from Rust, I'd like to explicitly handle all errors with something monadic like result.
Below is an attempt that is 'safe-er' because an open and read/write will not throw/raise. But not sure whether the close can fail. Is there a more concise and potentially safer way to do this?
(* opam install core batteries *)
open Stdio
open Batteries
open BatResult.Infix
let read_safe (file_path: string): (string, exn) BatPervasives.result =
(try let chan = In_channel.create file_path in Ok(chan)
with (e: exn) -> Error(e))
>>= fun chan ->
let res_strings =
try
let b = In_channel.input_lines chan in
Ok(b)
with (e: exn) -> Error(e) in
In_channel.close chan;
BatResult.map (fun strings -> String.concat "\n" strings) res_strings
let write_safe (file_path: string) (text: string) : (unit, exn) BatPervasives.result =
(try
(let chan = Out_channel.create file_path in Ok(chan))
with (e: exn) -> Error(e))
>>= fun chan ->
let res =
(try let b = Out_channel.output_string chan text in Ok(b)
with (e: exn) -> Error(e)) in
Out_channel.close chan;
res
let () =
let out =
read_safe "test-in.txt"
>>= fun str -> write_safe "test-out.txt" str in
BatResult.iter_error (fun e -> print_endline (Base.Exn.to_string e)) out
The Stdio library, which is part of Jane Street's industrial-strength standard library, already provides such functions, which are, of course, safe. For example, In_channel.read_all reads the contents of a file into a string and the corresponding Out_channel.write_all writes a string to a file, so we can implement a cp utility as follows:
(* file cp.ml *)
open Base
open Stdio
let () = match Sys.get_argv () with
| [|_cp; src; dst |] ->
Out_channel.write_all dst
~data:(In_channel.read_all src)
| _ -> invalid_arg "Usage: cp src dst"
To build and run the code, put it in the cp.ml file (ideally in a fresh new directory), and run
dune init exe cp --libs=base,stdio
This command bootstraps your project using dune. You can then run your program with
dune exec ./cp.exe cp.ml cp.copy.ml
Here is the link to the OCaml Documentation Hub that will make it easier for you to find interesting libraries in OCaml.
Also, if you want to turn a function that raises an exception to a function that returns an error instead, you can use Result.try_with, e.g.,
let safe_read file = Result.try_with @@ fun () ->
  In_channel.read_all file
You can read and write files in OCaml without needing alternative standard libraries. Everything you need is already built into Stdlib which ships with OCaml.
Here's an example of reading a file while ensuring the file descriptor gets closed safely in case of an exception: https://stackoverflow.com/a/67607879/20371 . From there you can write a similar function to write a file using the corresponding functions open_out, out_channel_length, and output.
These read and write file contents as OCaml's bytes type, i.e. mutable bytestrings. However, they may throw exceptions. This is fine. In OCaml exceptions are cheap and easy to handle. Nevertheless, sometimes people don't like them for whatever reason. So it's a bit of a convention nowadays to suffix functions which throw exceptions with _exn. So suppose you define the above-mentioned two functions as such:
val get_contents_exn : string -> bytes
val set_contents_exn : string -> bytes -> unit
Now it's easy for you (or anyone) to wrap them and return a result value, like Rust. But, since we have polymorphic variants in OCaml, we take advantage of that to compose together functions which can return result values, as described here: https://keleshev.com/composable-error-handling-in-ocaml
So you can wrap them like this:
let get_contents filename =
try Ok (get_contents_exn filename) with exn -> Error (`Exn exn)
let set_contents filename contents =
try Ok (set_contents_exn filename contents) with exn -> Error (`Exn exn)
Now these have the types:
val get_contents : string -> (bytes, [> `Exn of exn]) result
val set_contents : string -> bytes -> (unit, [> `Exn of exn]) result
And they can be composed together with each other and other functions which return result values with a polymorphic variant error channel.
One point I am trying to make here is to offer your users both, so they can choose whichever way–exceptions or results–makes sense for them.
Here's the full safe solution based on @ivg's answer, using only the Base and Stdio libraries.
open Base
open Base.Result
open Stdio
let read_safe (file_path: string) =
  Result.try_with @@ fun () ->
  In_channel.read_all file_path
let write_safe (file_path: string) (text: string) =
  Result.try_with @@ fun () ->
  Out_channel.write_all ~data:text file_path
let () =
let out =
read_safe "test-in.txt"
>>= fun str ->
write_safe "test-out.txt" str in
iter_error out ~f:(fun e -> print_endline (Base.Exn.to_string e))
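For comparison, since the question mentions coming from Rust, here is a minimal sketch of the same read-then-write pipeline in Rust. The names read_safe, write_safe and copy_text are hypothetical and simply mirror the OCaml helpers above; and_then plays the role of >>=.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper mirroring the OCaml read_safe: errors are returned, not raised.
fn read_safe(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

// Hypothetical helper mirroring the OCaml write_safe.
fn write_safe(path: &Path, text: &str) -> io::Result<()> {
    fs::write(path, text)
}

// and_then sequences the two steps and short-circuits on the first error,
// which is what >>= does for the result type in the OCaml version.
fn copy_text(src: &Path, dst: &Path) -> io::Result<()> {
    read_safe(src).and_then(|text| write_safe(dst, &text))
}

fn main() {
    if let Err(e) = copy_text(Path::new("test-in.txt"), Path::new("test-out.txt")) {
        eprintln!("{}", e);
    }
}
```

As in the OCaml version, the file handles are opened and closed inside the library calls, so there is no separate close step to get wrong.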
I created a generator to produce pairs of int lists of the same length, in order to test a property of zip and unzip.
Running the test I get once in a while the error
Error: System.ArgumentException: list2 is 1 element shorter than list1
but it shouldn't happen because of my generator.
I got three times the test 100% passed and then the error above. Why?
It seems my generator is not working properly.
let samelength (x, y) =
    List.length x = List.length y
let arbMyGen2 = Arb.filter samelength Arb.from<int list * int list>
type MyGenZ =
    static member genZip() =
        {
            new Arbitrary<int list * int list>() with
                override x.Generator = arbMyGen2 |> Arb.toGen
                override x.Shrinker t = Seq.empty
        }
let _ = Arb.register<MyGenZ>()
let pro_zip (xs: int list, ys: int list) =
    (xs, ys) = List.unzip(List.zip xs ys)
    |> Prop.collect (List.length xs = List.length ys)
do Check.Quick pro_zip
Your code, as written, works for me. So I'm not sure what exactly is wrong, but I can give you a few helpful (hopefully!) hints.
In the first instance, try not using the registration mechanism, but instead using Prop.forAll, as follows:
let pro_zip =
    Prop.forAll arbMyGen2 (fun (xs,ys) ->
        (xs, ys) = List.unzip(List.zip xs ys)
        |> Prop.collect (List.length xs))
do Check.Quick pro_zip
Note I've also changed your Prop.collect call to collect the length of the list(s), which gives somewhat more interesting output. In fact your property already checks that the lists are the same length (albeit implicitly) so the test will fail with a counterexample if they are not.
Arb.filter transforms an existing Arbitrary (i.e. a generator and a shrinker) into a new Arbitrary. In other words, arbMyGen2 has a shrinking function that works (i.e. it only returns smaller pairs of lists that are of equal length), while in genZip() you throw the shrinker away. It would be fine to simply write
type MyGenZ =
    static member genZip() = arbMyGen2
instead.
I'm still trying to figure out how to split code when using mirage and its myriad of first class modules.
I've put everything I need in a big ugly Context module, to avoid having to pass ten modules to all my functions, one is pain enough.
I have a function to receive commands over tcp :
let recvCmds (type a) (module Ctx : Context with type chan = a) nodeid chan = ...
After hours of trial and error, I figured out that I needed to add (type a) and the "explicit" type chan = a to make it work. Looks ugly, but it compiles.
But if I want to make that function recursive :
let rec recvCmds (type a) (module Ctx : Context with type chan = a) nodeid chan =
Ctx.readMsg chan >>= fun res ->
... more stuff ...
|> OtherModule.getStorageForId (module Ctx)
... more stuff ...
recvCmds (module Ctx) nodeid chan
I pass the module twice, the first time no problem but
I get an error on the recursion line :
The signature for this packaged module couldn't be inferred.
and if I try to specify the signature I get
This expression has type a but an expression was expected of type 'a
The type constructor a would escape its scope
And it seems like I can't use the whole (type chan = a) thing.
If someone could explain what is going on, and ideally a way to work around it, it'd be great.
I could just use a while of course, but I'd rather finally understand these damn modules. Thanks !
The practical answer is that recursive functions should universally quantify their locally abstract types with let rec f: type a. .... = fun ... .
More precisely, your example can be simplified to
module type T = sig type t end
let rec f (type a) (m: (module T with type t = a)) = f m
which yields the same error as yours:
Error: This expression has type (module T with type t = a)
but an expression was expected of type 'a
The type constructor a would escape its scope
This error can be fixed with an explicit forall quantification: this can be done with
the short-hand notation (for universally quantified locally abstract type):
let rec f: type a. (module T with type t = a) -> 'never = fun m -> f m
The reason behind this behavior is that locally abstract types should not escape
the scope of the function that introduced them. For instance, this code
let ext_store = ref None
let store x = ext_store := Some x
let f (type a) (x:a) = store x
should visibly fail because it tries to store a value of type a, which is a non-sensical type outside of the body of f.
As a consequence, values with a locally abstract type can only be used by polymorphic functions. For instance, this example
let id x = x
let f (type a) (x:a) : a = id x
is fine because id x works for any x.
The problem with a function like
let rec f (type a) (m: (module T with type t = a)) = f m
is then that the type of f is not yet generalized inside its body, because type generalization in ML happens at the let definition. The fix is therefore to explicitly tell the compiler that f is polymorphic in its argument:
let rec f: 'a. (module T with type t = 'a) -> 'never =
fun (type a) (m:(module T with type t = a)) -> f m
Here, 'a. ... is a universal quantification that should read forall 'a. ....
The first line tells the compiler that the function f is polymorphic in its first argument, whereas the second line explicitly introduces the locally abstract type a to refine the packed module type. Splitting these two declarations is quite verbose, thus the shorthand notation combines both:
let rec f: type a. (module T with type t = a) -> 'never = fun m -> f m
I get type errors when chaining different types of Iterator.
let s = Some(10);
let v = (1..5).chain(s.iter())
.collect::<Vec<_>>();
Output:
<anon>:23:20: 23:35 error: type mismatch resolving `<core::option::Iter<'_, _> as core::iter::IntoIterator>::Item == _`:
expected &-ptr,
found integral variable [E0271]
<anon>:23 let v = (1..5).chain(s.iter())
^~~~~~~~~~~~~~~
<anon>:23:20: 23:35 help: see the detailed explanation for E0271
<anon>:24:14: 24:33 error: no method named `collect` found for type `core::iter::Chain<core::ops::Range<_>, core::option::Iter<'_, _>>` in the current scope
<anon>:24 .collect::<Vec<_>>();
^~~~~~~~~~~~~~~~~~~
<anon>:24:14: 24:33 note: the method `collect` exists but the following trait bounds were not satisfied: `core::iter::Chain<core::ops::Range<_>, core::option::Iter<'_, _>> : core::iter::Iterator`
error: aborting due to 2 previous errors
But it works fine when zipping:
let s = Some(10);
let v = (1..5).zip(s.iter())
.collect::<Vec<_>>();
Output:
[(1, 10)]
Why is Rust able to infer the correct types for zip but not for chain and how can I fix it? n.b. I want to be able to do this for any iterator, so I don't want a solution that just works for Range and Option.
First, note that the iterators yield different types. I've added an explicit u8 to the numbers to make the types more obvious:
fn main() {
let s = Some(10u8);
let mut r = 1..5u8; // must be mutable because next() takes &mut self
let () = s.iter().next(); // Option<&u8>
let () = r.next(); // Option<u8>
}
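The snippet above is meant to fail to compile: binding to `()` makes the compiler report the real type in its error message. As a runnable alternative, here is a small sketch that prints each iterator's item type using std::any::type_name. The helper item_type_name is hypothetical, and the exact strings type_name returns are not guaranteed by the standard library, so treat the output as diagnostic only.

```rust
use std::any::type_name;

// Hypothetical helper: report the Item type of any iterator as a string.
fn item_type_name<I: Iterator>(_iter: &I) -> &'static str {
    type_name::<I::Item>()
}

fn main() {
    let s = Some(10u8);
    let r = 1..5u8;

    // Option::iter borrows, so its items are references to u8.
    println!("s.iter() yields {}", item_type_name(&s.iter()));
    // A range of u8 yields u8 values directly.
    println!("r yields {}", item_type_name(&r));
}
```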
When you chain two iterators, both iterators must yield the same type. This makes sense as the iterator cannot "switch" what type it outputs when it gets to the end of one and begins on the second:
fn chain<U>(self, other: U) -> Chain<Self, U::IntoIter>
where U: IntoIterator<Item=Self::Item>
// ^~~~~~~~~~~~~~~ This means the types must match
So why does zip work? Because it doesn't have that restriction:
fn zip<U>(self, other: U) -> Zip<Self, U::IntoIter>
where U: IntoIterator
// ^~~~ Nothing here!
This is because zip returns a tuple with one value from each iterator; a new type, distinct from either source iterator's type. One iterator could be an integral type and the other could return your own custom type for all zip cares.
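To illustrate, here is a minimal sketch that zips two iterators with unrelated item types. The function name pair_up is made up for this example.

```rust
// Hypothetical example: zip places no constraint relating the two item types.
fn pair_up() -> Vec<(i32, &'static str)> {
    let numbers = 1..5; // yields i32
    let names = vec!["one", "two"]; // yields &'static str when consumed
    // The result item type is the tuple (i32, &str); zip stops as soon as
    // the shorter side is exhausted.
    numbers.zip(names).collect()
}

fn main() {
    println!("{:?}", pair_up());
}
```

This is also why the zip in the question produces a single pair: the Option iterator yields only one element, so zipping ends after the first tuple.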
Why is Rust able to infer the correct types for zip but not for chain
There is no type inference happening here; that's a different thing. This is just plain-old type mismatching.
and how can I fix it?
In this case, your inner iterator yields a reference to an integer, a Clone-able type, so you can use cloned to make a new iterator that clones each value and then both iterators would have the same type:
fn main() {
let s = Some(10);
let v: Vec<_> = (1..5).chain(s.iter().cloned()).collect();
}
If you are done with the option, you can also use a consuming iterator with into_iter:
fn main() {
let s = Some(10);
let v: Vec<_> = (1..5).chain(s.into_iter()).collect();
}
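Since the question asks for something that works for any iterator, not just Range and Option, the same idea can be sketched as a generic helper. The name chain_owned_with_refs is hypothetical. It assumes one side yields owned values of type T, the other yields &T, and T implements Clone.

```rust
// Hypothetical generic helper: chain an iterator of owned values with an
// iterator of references by cloning the referenced items first, so both
// sides have the same Item type as chain requires.
fn chain_owned_with_refs<'a, T, A, B>(owned: A, borrowed: B) -> Vec<T>
where
    T: Clone + 'a,
    A: IntoIterator<Item = T>,
    B: IntoIterator<Item = &'a T>,
{
    owned
        .into_iter()
        .chain(borrowed.into_iter().cloned())
        .collect()
}

fn main() {
    let s = Some(10);
    let v = chain_owned_with_refs(1..5, s.iter());
    println!("{:?}", v);

    // Works for other containers too, e.g. a Vec and a slice of Strings.
    let tail = [String::from("c")];
    let w = chain_owned_with_refs(vec![String::from("a"), String::from("b")], tail.iter());
    println!("{:?}", w);
}
```

For Copy types such as integers, copied() could be used in place of cloned() with a Copy bound; cloned() is used here only because it covers more types.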
I find the following C# extension method very useful:
public static bool In<T>(this T x, params T[] xs)
{
return xs.Contains(x);
}
allowing for C# calls such as
var s = "something else";
var rslt = s.In("this","that","other") ? "Yay" : "Boo";
and
var i = 1;
var rslt = i.In(1,2,3) ? "Yay" : "Boo";
I have been trying to come up with an F# (near-)equivalent, allowing e.g.:
let s = "something else"
let rslt = if s.In("this","that","other") then "Yay" else "Boo"
It seems like I would need something like:
type 'T with
static member this.In([ParamArray] xs : 'T )
{
return xs.Contains(x);
}
but that is not legal F# syntax. I can't see how to declare an extension method on a generic class in F#. Is it possible? Or is there a better way to achieve similar results? (I imagine I could just link in the C# project and call it from F#, but that would be cheating! :-)
The best I could come up with was:
let inline In (x : 'a, [<ParamArray>] xs : 'a[]) = Array.Exists( xs, (fun y -> x = y) )
which I expected to allow for calls like (which are not really acceptable anyway imho):
if In(ch, '?', '/') then "Yay" else "Boo"
but in fact required:
if In(ch, [| '?'; '/' |]) then "Yay" else "Boo"
implying that the ParamArray attribute is being ignored (for reasons I've yet to fathom).
Fwiw, the latest version of F# (3.1) contains exactly what I was after (yay!):
open System.Runtime.CompilerServices
[<Extension>]
type ExtraCSharpStyleExtensionMethodsInFSharp () =
    [<Extension>]
    static member inline In(x: 'T, xs: seq<'T>) = xs |> Seq.exists (fun o -> o = x)
    [<Extension>]
    static member inline Contains(xs: seq<'T>, x: 'T) = xs |> Seq.exists (fun o -> o = x)
    [<Extension>]
    static member inline NotIn(x: 'T, xs: seq<'T>) = xs |> Seq.forall (fun o -> o <> x)
providing usages as
if s.In(["this";"that";"other"]) then ...
if (["this";"that";"other"]).Contains(s) then ...
etc.