Field update doesn't work when changing type parameters - record

I find that the record ⦇ field := value ⦈ notation does not work as expected when I use it to change record's type parameter in the process. To clarify what I mean, consider the following example:
theory Scratch imports
Main
begin
record 'a john =
apple :: 'a
banana :: int
definition f :: "nat john ⇒ bool john"
where
"f j ≡ j ⦇ apple := True ⦈"
definition f_fixed :: "nat john ⇒ bool john"
where
"f_fixed j ≡ ⦇ apple = True, banana = banana j ⦈"
end
The definition for f is not accepted due to a type error. It seems that I cannot use a field update to change the type of the underlying record. I'm mildly surprised at this, because f_fixed looks pretty similar, but works absolutely fine. Am I doing something wrong, or is this a known restriction?

This is a known restriction.
The syntax j⦇ apple := True ⦈ is shorthand for john.apple_update (λ_. True) j, where the function apply_update has type:
('a ⇒ 'a) ⇒ 'a john ⇒ 'a john
(The first parameter is a function that gets passed the old value of apple and returns a new value. Because you are unconditionally updating the field to True, your function is (λ_. True).)
The type of this function prevents the type 'a from changing during the update. What you presumably are looking for is an update function with the following type:
('a ⇒ 'b) ⇒ 'a john ⇒ 'b john
One possible complication of emitting such functions would be in the following scenario:
record 'a john =
apple :: 'a
banana :: 'a
What should the type of apple_update be? You can't change the type of apple without simultaneously changing the type of banana.
Again, there are no technical reasons why such functions couldn't be emitted by the record package, it is just a matter of determining what the correct functions to emit would be in all the various corner cases.

Related

Where is the Idris == operator useful?

As a beginner in type-driven programming, I'm curious about the use of the == operator. Examples demonstrate that it's not sufficient to prove equality between two values of a certain type, and special equality checking types are introduced for the particular data types. In that case, where is == useful at all?
(==) (as the single constituent function of the Eq interface) is a function from a type T to Bool, and is good for equational reasoning. Whereas x = y (where x : T and y : T) AKA "intensional equality" is itself a type and therefore a proposition. You can and often will want to bounce back and forth between the two different ways of expressing equality for a particular type.
x == y = True is also a proposition, and is often an intermediate step between reasoning about (==) and reasoning about =.
The exact relationship between the two types of equality is rather complex, and you can read https://github.com/pdorrell/learning-idris/blob/9d3454a77f6e21cd476bd17c0bfd2a8a41f382b7/finished/EqFromEquality.idr for my own attempt to understand some aspects of it. (One thing to note is that even though an inductively defined type will have decideable intensional equality, you still have to go through a few hoops to prove that, and a few more hoops to define a corresponding implementation of Eq.)
One particular handy code snippet is this:
-- for rel x y, provide both the computed value, and the proposition that it is equal to the value (as a dependent pair)
has_value_dpair : (rel : t -> t -> Bool) -> (x : t) -> (y : t) -> (value: Bool ** rel x y = value)
has_value_dpair rel x y = (rel x y ** Refl)
You can use it with the with construct when you have a value returned from rel x y and you want to reason about the proposition rel x y = True or rel x y = False (and rel is some function that might represent a notion of equality between x and y).
(In this answer I assume the case where (==) corresponds to =, but you are entirely free to define a (==) function that doesn't correspond to =, eg when defining a Setoid. So that's another reason to use (==) instead of =.)
You still need good old equality because sometimes you can't prove things. Sometimes you don't even need to prove. Consider next example:
countEquals : Eq a => a -> List a -> Nat
countEquals x = length . filter (== x)
You might want to just count number of equal elements to show some statistics to user. Another example: tests. Yes, even with strong type system and dependent types you might want to perform good old unit tests. So you want to check for expectations and this is rather convenient to do with (==) operator.
I'm not going to write full list of cases where you might need (==). Equality operator is not enough for proving but you don't always need proofs.

How to "read" Elm's -> operator

I'm really loving Elm, up to the point where I encounter a function I've never seen before and want to understand its inputs and outputs.
Take the declaration of foldl for example:
foldl : (a -> b -> b) -> b -> List a -> b
I look at this and can't help feeling as if there's a set of parentheses that I'm missing, or some other subtlety about the associativity of this operator (for which I can't find any explicit documentation). Perhaps it's just a matter of using the language more until I just get a "feel" for it, but I'd like to think there's a way to "read" this definition in English.
Looking at the example from the docs…
foldl (::) [] [1,2,3] == [3,2,1]
I expect the function signature to read something like this:
Given a function that takes an a and a b and returns a b, an additional b, and a List, foldl returns a b.
Is that correct?
What advice can you give to someone like me who desperately wants inputs to be comma-delineated and inputs/outputs to be separated more clearly?
Short answer
The missing parentheses you're looking for are due to the fact that -> is right-associative: the type (a -> b -> b) -> b -> List a -> b is equivalent to (a -> b -> b) -> (b -> (List a -> b)). Informally, in a chain of ->s, read everything before the last -> as an argument and only the rightmost thing as a result.
Long answer
The key insight you may be missing is currying -- the idea that if you have a function that takes two arguments, you can represent it with a function that takes the first argument and returns a function that takes the second argument and then returns the result.
For instance, suppose you have a function add that takes two integers and adds them together. In Elm, you could write a function that takes both elements as a tuple and adds them:
add : (Int, Int) -> Int
add (x, y) = x+y
and you could call it as
add (1, 2) -- evaluates to 3
But suppose you didn't have tuples. You might think that there would be no way to write this function, but in fact using currying you could write it as:
add : Int -> (Int -> Int)
add x =
let addx : Int -> Int
addx y = x+y
in
addx
That is, you write a function that takes x and returns another function that takes y and adds it to the original x. You could call it with
((add 1) 2) -- evaluates to 3
You can now think of add in two ways: either as a function that takes an x and a y and adds them, or as a "factory" function that takes x values and produces new, specialized addx functions that take just one argument and add it to x.
The "factory" way of thinking about things comes in handy every once in a while. For instance, if you have a list of numbers called numbers and you want to add 3 to each number, you can just call List.map (add 3) numbers; if you'd written the tuple version instead you'd have to write something like List.map (\y -> add (3,y)) numbers which is a bit more awkward.
Elm comes from a tradition of programming languages that really like this way of thinking about functions and encourage it where possible, so Elm's syntax for functions is designed to make it easy. To that end, -> is right-associative: a -> b -> c is equivalent to a -> (b -> c). This means if you don't parenthesize, what you're defining is a function that takes an a and returns a b -> c, which again we can think of either as a function that takes an a and a b and returns a c, or equivalently a function that takes an a and returns a b -> c.
There's another syntactic nicety that helps call these functions: function application is left-associative. That way, the ugly ((add 1) 2) from above can be written as add 1 2. With that syntax tweak, you don't have to think about currying at all unless you want to partially apply a function -- just call it with all the arguments and the syntax will work out.

How to specify a number range as a type in Idris?

I've been experimenting with Idris and it seems like it should be simple to specify some sort of type for representing all numbers between two different numbers, e.g. NumRange 5 10 is the type of all numbers between 5 and 10. I'd like to include doubles/floats, but a type for doing the same with integers would be equally useful. How would I go about doing this?
In practice, you may do better to simply check the bounds as needed, but you can certainly write a data type to enforce such a property.
One straightforward way to do it is like this:
data Range : Ord a => a -> a -> Type where
MkRange : Ord a => (x,y,z : a) -> (x >= y && (x <= z) = True) -> Range y z
I've written it generically over the Ord typeclass, though you may need to specialize it. The range requirement is expressed as an equation, so you simply supply a Refl when constructing it, and the property will then be checked. For example: MkRange 3 0 10 Refl : Range 0 10. One disadvantage of something like this is the inconvenience of having to extract the contained value. And of course if you want to construct an instance programmatically you'll need to supply the proofs that the bounds are indeed satisfied, or else do it in some context that allows for failure, like Maybe.
We can write a more elegant example for Nats without much trouble, since for them we already have a library data type to represent comparison proofs. In particular LTE, representing less-than-or-equal-to.
data InRange : Nat -> Nat -> Type where
IsInRange : (x : Nat) -> LTE n x -> LTE x m -> InRange n m
Now this data type nicely encapsulates a proof that n ≤ x ≤ m. It would be overkill for many casual applications, but it certainly shows how you might use dependent types for this purpose.

How to deal with this error?

I'm dealing with very long lists, and large trees.
Sometimes I would find this error:
surgery a;;
Characters 8-9:
surgery a;;
^
Error: This expression has type int t/1044
but an expression was expected of type 'a t/1810
# type 'a t = | Leaf of ('a -> 'a -> int)
| Node of 'a * 'a t * 'a t * ('a -> 'a -> int)
I'm not sure about what type is that kind of error, but I guess it's some kind of an overflow. The type matches correctly but there are large numbers after the backslash that follows the type. In this case 1044 and 1810.
This time I have run some code before surgery a. If I kill the current top-level and start over, surgery a would run.
My questions are:
1. What is this error exactly?
2. When and how does it occur?
3. Why rerunning it from a new top-level would make it work?
4. How should I deal with it?
This is a type error, not a runtime error. It does not "cost" anything and is not in any way related to the size of the structures you have in memory.
It happens if you're not careful in the toplevel, and mix two different types with the same name. Compare:
type t = int;;
let f (x : t) = ();;
type u = bool;;
let g (y : u) = f y;;
^
Error: This expression has type u = bool
but an expression was expected of type t = int
with
type t = int;;
let f (x : t) = ();;
type t = bool;;
let g (y : t) = f y;;
^
Error: This expression has type t/1047 = bool
but an expression was expected of type t/1044 = int
This is the exact same typing error happening in both cases: you mixed different types. But in the second case, both have the same name t. The type-system tries to be helpful and tells you about the unique integers it internally assign to names, to make sure there are really unique throughout the program.
This kind of error cannot happen outside the toplevel (when compiling a program the usual way), as it is not possible to define two types with the same name at the exact same path.
How to fix it: if you redefine a type with a new definition that is not equivalent to the previous one, you must be careful to also redefine the operations on this previous type previously recorded in the toplevel. Indeed, they are still typed as expecting the old type, and using them with the new type will result in such errors.

First and follow of the non-terminals in two grammars

Given the following grammar:
S -> L=L
s -> L
L -> *L
L -> id
What are the first and follow for the non-terminals?
If the grammar is changed into:
S -> L=R
S -> R
L -> *R
L -> id
R -> L
What will be the first and follow ?
When I took a compiler course in college I didn't understand FIRST and FOLLOWS at all. I implemented the algorithms described in the Dragon book, but I had no clue what was going on. I think I do now.
I assume you have some book that gives a formal definition of these two sets, and the book is completely incomprehensible. I'll try to give an informal description of them, and hopefully that will help you make sense of what's in your book.
The FIRST set is the set of terminals you could possibly see as the first part of the expansion of a non-terminal. The FOLLOWS set is the set of terminals you could possibly see following the expansion of a non-terminal.
In your first grammar, there are only three kinds of terminals: =, *, and id. (You might also consider $, the end-of-input symbol, to be a terminal.) The only non-terminals are S (a statement) and L (an Lvalue -- a "thing" you can assign to).
Think of FIRST(S) as the set of non-terminals that could possibly start a statement. Intuitively, you know you do not start a statement with =. So you wouldn't expect that to show up in FIRST(S).
So how does a statement start? There are two production rules that define what an S looks like, and they both start with L. So to figure out what's in FIRST(S), you really have to look at what's in FIRST(L). There are two production rules that define what an Lvalue looks like: it either starts with a * or with an id. So FIRST(S) = FIRST(L) = { *, id }.
FOLLOWS(S) is easy. Nothing follows S because it is the start symbol. So the only thing in FOLLOWS(S) is $, the end-of-input symbol.
FOLLOWS(L) is a little trickier. You have to look at every production rule where L appears, and see what comes after it. In the first rule, you see that = may follow L. So = is in FOLLOWS(L). But you also notice in that rule that there is another L at the end of the production rule. So another thing that could follow L is anything that could follow that production. We already figured out that the only thing that can follow the S production is the end-of-input. So FOLLOWS(L) = { =, $ }. (If you look at the other production rules, L always appears at the end of them, so you just get $ from those.)
Take a look at this Easy Explanation, and for now ignore all the stuff about ϵ, because you don't have any productions which contain the empty-string. Under "Rules for First Sets", rules #1, #3, and #4.1 should make sense. Under "Rules for Follows Sets", rules #1, #2, and #3 should make sense.
Things get more complicated when you have ϵ in your production rules. Suppose you have something like this:
D -> S C T id = V // Declaration is [Static] [Const] Type id = Value
S -> static | ϵ // The 'static' keyword is optional
C -> const | ϵ // The 'const' keyword is optional
T -> int | float // The Type is mandatory and is either 'int' or 'float'
V -> ... // The Value gets complicated, not important here.
Now if you want to compute FIRST(D) you can't just look at FIRST(S), because S may be "empty". You know intuitively that FIRST(D) is { static, const, int, float }. That intuition is codified in rule #4.2. Think of SCT in this example as Y1Y2Y3 in the "Easy Explanation" rules.
If you want to compute FOLLOWS(S), you can't just look at FIRST(C), because that may be empty, so you also have to look at FIRST(T). So FOLLOWS(S) = { const, int, float }. You get that by applying "Rules for follow sets" #2 and #4 (more or less).
I hope that helps and that you can figure out FIRST and FOLLOWS for the second grammar on your own.
If it helps, R represents an Rvalue -- a "thing" you can't assign to, such as a constant or a literal. An Lvalue can also act as an Rvalue (but not the other way around).
a = 2; // a is an lvalue, 2 is an rvalue
a = b; // a is an lvalue, b is an lvalue, but in this context it's an rvalue
2 = a; // invalid because 2 cannot be an lvalue
2 = 3; // invalid, same reason.
*4 = b; // Valid! You would almost never write code like this, but it is
// grammatically correct: dereferencing an Rvalue gives you an Lvalue.