How to enable hints and warnings in the online REPL - frege

I figured that I can do it on the command line REPL like so:
java -jar frege-repl-1.0.3-SNAPSHOT.jar -hints -warnings
But how can I do the same in http://try.frege-lang.org

Hints and warnings are already enabled by default. For example,
frege> f x = f x
function f :: α -> β
3: application of f will diverge.
Perhaps we can make it better by explicitly saying it as warning or hint (instead of colors distinguishing them) something like:
[Warning] 3: application of f will diverge.
and providing an option to turn them on/off.
Update:
There was indeed an issue (Thanks Ingo for pointing that out!) with showing warnings that are generated in a later phase during the compilation. This issue has been fixed and the following examples now correctly display warnings in the REPL:
frege> h x = 0; h false = 42
function h :: Bool -> Int
4: equation or case alternative cannot be reached.
frege> f false = 6
function f :: Bool -> Int
5: function pattern is refutable, consider
adding a case for true

Related

Purpose of anonymous modules in Agda

Going at the root of Agda standard library, and issuing the following command:
grep -r "module _" . | wc -l
Yields the following result:
843
Whenever I encounter such anonymous modules (I assume that's what they are called), I quite cannot figure out what their purpose is, despite of their apparent ubiquity, nor how to use them because, by definition, I can't access their content using their name, although I assume this should be possible, otherwise their would be no point in even allowing them to be defined.
The wiki page:
https://agda.readthedocs.io/en/v2.6.1/language/module-system.html#anonymous-modules
has a section called "anonymous modules" which is in fact empty.
Could somebody explain what the purpose of anonymous modules is ?
If possible, any example to emphasize the relevance of the definition of such modules, as well as how to use their content would be very much appreciated.
Here are the possible ideas I've come up with, but none of them seems completely satisfying:
They are a way to regroup thematically identical definitions inside an Agda file.
Their name is somehow infered by Agda when using the functions they provide.
Their content is only meant to be visible / used inside their englobing module (a bit like a private block).
Anonymous modules can be used to simplify a group of definitions which share some arguments. Example:
open import Data.Empty
open import Data.Nat
<⇒¬≥ : ∀ {n m} → n < m → n ≥ m → ⊥
<⇒¬≥ = {!!}
<⇒> : ∀ {n m} → n < m → m > n
<⇒> = {!!}
module _ {n m} (p : n < m) where
<⇒¬≥′ : n ≥ m → ⊥
<⇒¬≥′ = {!!}
<⇒>′ : m > n
<⇒>′ = {!!}
Afaik this is the only use of anonymous modules. When the module _ scope is closed, you can't refer to the module anymore, but you can refer to its definitions as if they hadn't been defined in a module at all (but with extra arguments instead).

Why syntax error in this very simple print command

I am trying to run following very simple code:
open Str
print (Str.first_chars "testing" 0)
However, it is giving following error:
$ ocaml testing2.ml
File "testing2.ml", line 2, characters 0-5:
Error: Syntax error
There are no further details in the error message.
Same error with print_endline also; or even if no print command is there. Hence, the error is in part: Str.first_chars "testing" 0
Documentation about above function from here is as follows:
val first_chars : string -> int -> string
first_chars s n returns the first n characters of s. This is the same
function as Str.string_before.
Adding ; or ;; at end of second statement does not make any difference.
What is the correct syntax for above code.
Edit:
With following code as suggested by #EvgeniiLepikhin:
open Str
let () =
print_endline (Str.first_chars "testing" 0)
Error is:
File "testing2.ml", line 1:
Error: Reference to undefined global `Str'
And with this code:
open Str;;
print_endline (Str.first_chars "testing" 0)
Error is:
File "testing2.ml", line 1:
Error: Reference to undefined global `Str'
With just print command (instead of print_endline) in above code, the error is:
File "testing2.ml", line 2, characters 0-5:
Error: Unbound value print
Note, my Ocaml version is:
$ ocaml -version
The OCaml toplevel, version 4.02.3
I think Str should be built-in, since opam is not finding it:
$ opam install Str
[ERROR] No package named Str found.
I also tried following code as suggested in comments by #glennsl:
#use "topfind"
#require "str"
print (Str.first_chars "testing" 0)
But this also give same simple syntax error.
An OCaml program is a list of definitions, which are evaluated in order. You can define values, modules, classes, exceptions, as well as types, module types, class types. But let's focus on values so far.
In OCaml, there are no statements, commands, or instructions. It is a functional programming language, where everything is an expression, and when an expression is evaluated it produces a value. The value could be bound to a variable so that it could be referenced later.
The print_endline function takes a value of type string, outputs it to the standard output channel and returns a value of type unit. Type unit has only one value called unit, which could be constructed using the () expression. For example, print_endline "hello, world" is an expression that produces this value. We can't just throw an expression in a file and hope that it will be compiled, as an expression is not a definition. The definition syntax is simple,
let <pattern> = <expr>
where is either a variable or a data constructor, which will match with the structure of the value that is produced by <expr> and possibly bind variable, that are occurring in the pattern, e.g., the following are definitions
let x = 7 * 8
let 4 = 2 * 2
let [x; y; z] = [1; 2; 3]
let (hello, world) = "hello", "world"
let () = print_endline "hello, world"
You may notice, that the result of the print_endline "hello, world" expression is not bound to any variable, but instead is matched with the unit value (), which could be seen (and indeed looks like) an empty tuple. You can write also
let x = print_endline "hello, world"
or even
let _ = print_endline "hello, world"
But it is always better to be explicit on the left-hand side of a definition in what you're expecting.
So, now the well-formed program of ours should look like this
open Str
let () =
print_endline (Str.first_chars "testing" 0)
We will use ocamlbuild to compile and run our program. The str module is not a part of the standard library so we have to tell ocamlbuild that we're going to use it. We need to create a new folder and put our program into a file named example.ml, then we can compile it using the following command
ocamlbuild -pkg str example.native --
The ocamlbuild tool will infer from the suffix native what is your goal (in this case it is to build a native code application). The -- means run the built application as soon as it is compiled. The above program will print nothing, of course, here is an example of a program that will print some greeting message, before printing the first zero characters of the testing string,
open Str
let () =
print_endline "The first 0 chars of 'testing' are:";
print_endline (Str.first_chars "testing" 0)
and here is how it works
$ ocamlbuild -package str example.native --
Finished, 4 targets (4 cached) in 00:00:00.
The first 0 chars of 'testing' are:
Also, instead of compiling your program and running the resulting application, you can interpret your the example.ml file directly, using the ocaml toplevel tool, which provides an interactive interpreter. You still need to load the str library into the toplevel, as it is not a part of the standard library which is pre-linked in it, here is the correct invocation
ocaml str.cma example.ml
You should add ;; after "open Str":
open Str;;
print (Str.first_chars "testing" 0)
Another option is to declare code block:
open Str
let () =
print (Str.first_chars "testing" 0)

Redefining operator precedence in a module for exported predicate

I would like to write a module that exports a predicate where the user should be able to access a predicate p/1 as a prefix operator. I have defined the following module:
:- module(lala, [p/1]).
:- op(500, fy, [p]).
p(comment).
p(ca).
p(va).
and load it now via:
?- use_module(lala).
true.
Unfortunately, a query fails:
?- p X.
ERROR: Syntax error: Operator expected
ERROR: p
ERROR: ** here **
ERROR: X .
After setting the operator precedence properly, everything works:
?- op(500, fy, [p]).
true.
?- p X.
X = comment ;
X = ca ;
X = va.
I used SWI Prolog for my output but the same problem occurs in YAP as well (GNU Prolog does not support modules). Is there a way the user does not need to set the precedence themselves?
You can export the operator with the module/2 directive.
For example:
:- module(lala, [p/1,
op(500, fy, p)]).
Since the operator is then also available in the module, you can write for example:
p comment.
p ça.
p va.
where p is used as a prefix operator.

How to prevent common sub-expression elimination (CSE) with GHC

Given the program:
import Debug.Trace
main = print $ trace "hit" 1 + trace "hit" 1
If I compile with ghc -O (7.0.1 or higher) I get the output:
hit
2
i.e. GHC has used common sub-expression elimination (CSE) to rewrite my program as:
main = print $ let x = trace "hit" 1 in x + x
If I compile with -fno-cse then I see hit appearing twice.
Is it possible to avoid CSE by modifying the program? Is there any sub-expression e for which I can guarantee e + e will not be CSE'd? I know about lazy, but can't find anything designed to inhibit CSE.
The background of this question is the cmdargs library, where CSE breaks the library (due to impurity in the library). One solution is to ask users of the library to specify -fno-cse, but I'd prefer to modify the library.
How about removing the source of the trouble -- the implicit effect -- by using a sequencing monad that introduces that effect? E.g. the strict identity monad with tracing:
data Eval a = Done a
| Trace String a
instance Monad Eval where
return x = Done x
Done x >>= k = k x
Trace s a >>= k = trace s (k a)
runEval :: Eval a -> a
runEval (Done x) = x
track = Trace
now we can write stuff with a guaranteed ordering of the trace calls:
main = print $ runEval $ do
t1 <- track "hit" 1
t2 <- track "hit" 1
return (t1 + t2)
while still being pure code, and GHC won't try to get to clever, even with -O2:
$ ./A
hit
hit
2
So we introduce just the computation effect (tracing) sufficient to teach GHC the semantics we want.
This is extremely robust to compile optimizations. So much so that GHC optimizes the math to 2 at compile time, yet still retains the ordering of the trace statements.
As evidence of how robust this approach is, here's the core with -O2 and aggressive inlining:
main2 =
case Debug.Trace.trace string trace2 of
Done x -> case x of
I# i# -> $wshowSignedInt 0 i# []
Trace _ _ -> err
trace2 = Debug.Trace.trace string d
d :: Eval Int
d = Done n
n :: Int
n = I# 2
string :: [Char]
string = unpackCString# "hit"
So GHC has done everything it could to optimize the code -- including computing the math statically -- while still retaining the correct tracing.
References: the useful Eval monad for sequencing was introduced by Simon Marlow.
Reading the source code to GHC, the only expressions that aren't eligible for CSE are those which fail the exprIsBig test. Currently that means the Expr values Note, Let and Case, and expressions which contain those.
Therefore, an answer to the above question would be:
unit = reverse "" `seq` ()
main = print $ trace "hit" (case unit of () -> 1) +
trace "hit" (case unit of () -> 1)
Here we create a value unit which resolves to (), but which GHC can't determine the value for (by using a recursive function GHC can't optimise away - reverse is just a simple one to hand). This means GHC can't CSE the trace function and it's 2 arguments, and we get hit printed twice. This works with both GHC 6.12.4 and 7.0.3 at -O2.
I think you can specify the -fno-cse option in the source file, i.e. by putting a pragma
{-# OPTIONS_GHC -fno-cse #-}
on top.
Another method to avoid common subexpression elimination or let floating in general is to introduce dummy arguments. For example, you can try
let x () = trace "hi" 1 in x () + x ()
This particular example won't necessarily work; ideally, you should specify a data dependency via dummy arguments. For instance, the following is likely to work:
let
x dummy = trace "hi" $ dummy `seq` 1
x1 = x ()
x2 = x x1
in x1 + x2
The result of x now "depends" on the argument dummy and there is no longer a common subexpression.
I'm a bit unsure about Don's sequencing monad (posting this as answer because the site doesn't let me add comments). Modifying the example a bit:
main :: IO ()
main = print $ runEval $ do
t1 <- track "hit 1" (trace "really hit 1" 1)
t2 <- track "hit 2" 2
return (t1 + t2)
This gives us the following output:
hit 1
hit 2
really hit 1
That is, the first trace fires when the t1 <- ... statement is executed, not when t1 is actually evaluated in return (t1 + t2). If we define the monadic bind operator as
Done x >>= k = k x
Trace s a >>= k = k (trace s a)
instead, the output will reflect the actual evaluation order:
hit 1
really hit 1
hit 2
That is, the traces will fire when the (t1 + t2) statement is executed, which is (IMO) what we really want. For example, if we change (t1 + t2) to (t2 + t1), this solution produces the following output:
hit 2
really hit 2
hit 1
The output of the original version remains unchanged, and we don't see when our terms are really evaluated:
hit 1
hit 2
really hit 2
Like the original solution, this also works with -O3 (tested on GHC 7.0.3).

How to list directories faster?

I have a few situations where I need to list files recursively, but my implementations have been slow. I have a directory structure with 92784 files. find lists the files in less than 0.5 seconds, but my Haskell implementation is a lot slower.
My first implementation took a bit over 9 seconds to complete, next version a bit over 5 seconds and I'm currently down to a bit less than two seconds.
listFilesR :: FilePath -> IO [FilePath]
listFilesR path = let
isDODD "." = False
isDODD ".." = False
isDODD _ = True
in do
allfiles <- getDirectoryContents path
dirs <- forM allfiles $ \d ->
if isDODD d then
do let p = path </> d
isDir <- doesDirectoryExist p
if isDir then listFilesR p else return [d]
else return []
return $ concat dirs
The test takes about 100 megabytes of memory (+RTS -s), and the program spends around 40% in GC.
I was thinking of doing the listing in a WriterT monad with Sequence as the monoid to prevent the concats and list creation. Is it likely this helps? What else should I do?
Edit: I have edited the function to use readDirStream, and it helps keeping the memory down. There's still some allocation happening, but productivity rate is >95% now and it runs in less than a second.
This is the current version:
list path = do
de <- openDirStream path
readDirStream de >>= go de
closeDirStream de
where
go d [] = return ()
go d "." = readDirStream d >>= go d
go d ".." = readDirStream d >>= go d
go d x = let newpath = path </> x
in do
e <- doesDirectoryExist newpath
if e
then
list newpath >> readDirStream d >>= go d
else putStrLn newpath >> readDirStream d >>= go d
I think that System.Directory.getDirectoryContents constructs a whole list and therefore uses much memory. How about using System.Posix.Directory? System.Posix.Directory.readDirStream returns an entry one by one.
Also, FileManip library might be useful although I have never used it.
Profiling your code shows that most of the CPU time goes in getDirectoryContents, doesDirectoryExist and </>. This means that only changing the data structure won't help very much. If you want to match the performance of find you should use lower level functions for accessing the filesystem, probably the ones which Tsuyoshi pointed out.
One problem is that it has to construct the entire list of directory contents, before the program can do anything with them. Lazy IO is usually frowned upon, but using unsafeInterleaveIO here cut memory use significantly.
listFilesR :: FilePath -> IO [FilePath]
listFilesR path =
let
isDODD "." = False
isDODD ".." = False
isDODD _ = True
in unsafeInterleaveIO $ do
allfiles <- getDirectoryContents path
dirs <- forM allfiles $ \d ->
if isDODD d then
do let p = path </> d
isDir <- doesDirectoryExist p
if isDir then listFilesR p else return [d]
else return []
return $ concat dirs
Would it be an option to use some sort of cache system combined with the read? I was thinking of an async indexing service/thread that kept this cache up-to-date in the background, perhaps you could do the cache as a simple SQL-DB which would then give you some nice performance when doing queries against it?
Can you elaborate anything on your "project/idea" so we can come up with something alternative?
I wouldn't go for a "full index" myself as I mostly build webbased services and "resposnetime" is criticial to me, on the other hand - if its an initial way of starting up a new server I am sure the customers wouldnt mind waiting that first time. I would just store the result in the DB for later lookups.