Example of an inhabitant of `Type 1` that is neither `Type` nor an inhabitant of `Type`

What is an example of an inhabitant of Type 1 that is neither Type nor an inhabitant of Type? I wasn't able to come up with anything while exploring in the Idris REPL.
To be more precise, I'm looking for some x other than Type that yields the following:
Idris> :t x
x : Type 1

I'm not a type theory specialist, but here is my understanding. In the Idris tutorial there is a section 12.8 Cumulativity. It says that there is an internal hierarchy of type universes:
Type : Type 1 : Type 2 : Type 3 : ...
Moreover, x : Type n implies x : Type m for any m > n. There is also an example demonstrating how this prevents possible cycles in type inference.
But I think that this hierarchy is only for internal use and there is no possibility to create a value of Type (n+1) which is not in Type n.
See also articles in nLab about universes in type theory and about type of types.
Maybe this issue in the idris-dev repository can be useful too. Edwin Brady refers there to the Design and Implementation paper (see section 3.2.2).

I'm not an Idris expert, but I'd expect that
Type -> Type
is also in Type 1.
I'd also expect
Nat -> Type
and if you're very lucky (not so sure about this one)
List Type
to be that large.
The idea is that you can do all type-building operations at every level. Each time you use types from one level as values, the types of those structures live one level up.

Where is the Idris == operator useful?

As a beginner in type-driven programming, I'm curious about the use of the == operator. Examples demonstrate that it's not sufficient to prove equality between two values of a certain type, and special equality checking types are introduced for the particular data types. In that case, where is == useful at all?
(==) (as the single constituent function of the Eq interface) is a function of type T -> T -> Bool, and is good for equational reasoning. By contrast, x = y (where x : T and y : T), AKA "intensional equality", is itself a type and therefore a proposition. You can and often will want to bounce back and forth between the two different ways of expressing equality for a particular type.
x == y = True is also a proposition, and is often an intermediate step between reasoning about (==) and reasoning about =.
The exact relationship between the two types of equality is rather complex, and you can read https://github.com/pdorrell/learning-idris/blob/9d3454a77f6e21cd476bd17c0bfd2a8a41f382b7/finished/EqFromEquality.idr for my own attempt to understand some aspects of it. (One thing to note is that even though an inductively defined type will have decidable intensional equality, you still have to go through a few hoops to prove that, and a few more hoops to define a corresponding implementation of Eq.)
One particularly handy code snippet is this:
-- for rel x y, provide both the computed value, and the proposition that it is equal to the value (as a dependent pair)
has_value_dpair : (rel : t -> t -> Bool) -> (x : t) -> (y : t) -> (value: Bool ** rel x y = value)
has_value_dpair rel x y = (rel x y ** Refl)
You can use it with the with construct when you have a value returned from rel x y and you want to reason about the proposition rel x y = True or rel x y = False (and rel is some function that might represent a notion of equality between x and y).
(In this answer I assume the case where (==) corresponds to =, but you are entirely free to define a (==) function that doesn't correspond to =, eg when defining a Setoid. So that's another reason to use (==) instead of =.)
You still need good old equality because sometimes you can't prove things. Sometimes you don't even need a proof. Consider the following example:
countEquals : Eq a => a -> List a -> Nat
countEquals x = length . filter (== x)
You might just want to count the number of equal elements to show some statistics to the user. Another example: tests. Yes, even with a strong type system and dependent types you might want to perform good old unit tests. So you want to check expectations, and this is rather convenient to do with the (==) operator.
I'm not going to write a full list of cases where you might need (==). The equality operator is not enough for proving things, but you don't always need proofs.

Why the word 'dereferencing' in terms of dereferencing pointers?

This question will draw information from the draft N1570, so C11 basically.
Colloquially, to dereference a pointer means to apply the unary * operator to a pointer. There is only one place where the word "dereferencing" exists in the draft document (no instance of "dereference"), and it is in a footnote:
102) [...]
Among the invalid values for dereferencing a pointer by the unary *
operator are a null pointer, an address inappropriately aligned for
the type of object pointed to, and the address of an object after the
end of its lifetime
As far as I can see, the unary * operator is actually called the "indirection operator", as evidenced by §6.5.3.2:
6.5.3.2 Address and indirection operators
4 The unary * operator denotes indirection. [...]
Similarly, it is explicitly called the indirection operator in Annex §J.2:
— The value of an object is accessed by an array-subscript [],
member-access . or −>, address &, or indirection * operator or a
pointer cast in creating an address constant (6.6).
So is it correct to talk about "dereferencing pointers" in C, or is this being excessively pedantic? Where does the terminology come from? (I can kind of give a pass on [] being called "dereferencing" due to §6.5.2.1.)
K&R v1
If one looks at The C Programming Language, first edition (1978), the term “indirection” is used.
Examples
2.12 Precedence and Order of Evaluation
[…]
Chapter 5 discusses * (indirection) and & (address of).
7.2 Unary operators
[…]
The unary * operator means indirection: the expression must be a pointer, and the
result is an lvalue referring to the object to which the expression points.
It is also listed in INDEX as e.g.
* indirection operator 89, 187
A longer excerpt from section 5.1
5.1 Pointers and Addresses
      Since a pointer contains the address of an object, it is possible to access the object “indirectly” through the pointer.
Suppose that x is a variable, say an int, and that px is a
pointer, created in some as yet unspecified way. The unary operator &
gives the address of an object, so the statement
px = &x;
assigns the address of x to the variable px; px is now said to
“point to” x. The & operator can be applied only to variables
and array elements; constructs like &(x+1) and &3 are illegal. It
is also illegal to take the address of a register variable.
    The unary operator * treats its operand as the address of the ultimate target, and accesses that address to fetch the contents. Thus
if y is also an int,
y = *px;
assigns to y the contents of whatever px points to. So the
sequence
px = &x;
y = *px;
assigns the same value to y as does
y = x;
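The sequence quoted above can be tried out with a minimal sketch like the following. The function names copy_via_pointer and store_via_pointer are made up for illustration and do not come from K&R; the second function relies on the quoted statement that the result of * is an lvalue.

```c
/* Reads x through a pointer, mirroring "px = &x; y = *px;". */
int copy_via_pointer(int x)
{
    int *px = &x;  /* px now "points to" the local copy of x */
    int y = *px;   /* indirection: fetch the int that px points to */
    return y;      /* same value as after "y = x;" */
}

/* Writes through a pointer: *px is an lvalue, so it can be assigned to. */
int store_via_pointer(int x, int new_value)
{
    int *px = &x;
    *px = new_value;  /* modifies x itself, not a copy of it */
    return x;
}
```

For any int argument, copy_via_pointer returns that argument unchanged, and store_via_pointer returns new_value.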
K&R v2
In the second edition, the term “dereferencing” comes in.
5.1 Pointers and Addresses
The unary operator * is the indirection or dereferencing operator; when applied to a pointer, it accesses the object the pointer points to. Suppose that x and y are integers and ip is a pointer to int. This artificial sequence shows how to declare a pointer and how to use & and *:
[…]
Prior usage
The term is, however, much older, as can be seen in e.g.
A survey of some issues concerning abstract data types, 1974, e.g. pp. 24-25. There it is stated in connection with ALGOL 68, PASCAL, and SIMULA 67.
The mechanism by which pointers are transformed into values by a language is
known as 'dereferencing', a form of coercion (discussed later). Consider the statement
p := q;
Depending upon the types of p and q, there are several possible interpretations.
Let '#' be a dereferencing operator (i.e. if p points to j , then #p is the same as j) and
'#' be a referencing operation (i.e. if p points to j , then p is the same as #j). The
following table indicates the possible actions a language might take to perform the
assignment:
                  |                 type of p
                  |
                  |   t          ref t        ref ref t    . . .
------------------+----------------------------------------------
          t       |   p←q        p←#q         p←##q
                  |              #p←q         #p←#q
                  |                           ##p←q
type              |
of        ref t   |   p←#q       p←q          p←#q
q                 |              #p←#q        #p←q
                  |                           ##p←#q
                  |
        ref ref t |   p←##q      p←#q         p←q
            .     |              #p←##q       #p←#q
            .     |                           ##p←##q
            .     |
[…]
Coining
There are several other examples of its usage. Exactly where and when it was coined I am not able to find though (at least not yet). (The 1974 paper is at least interesting.)
For the fun of it it can also often be useful to look at mailing lists such as net.unix-wizards. An example from Peter Lamb at Melbourne Uni (11/28/83):
Dereferencing NULL pointers is yet another example of idiots who
write 'portable' code, assuming however, that THEIR machine is the
only one on which it will ever run: the same sorts of people who designed
cpio with binary headers.
Even on a VAX, dereferencing NULL will get you garbage: sure, *(char *)NULL
and *(short *)NULL return you 0, but *(int *)NULL will give you
1024528128 !!!!.
[…]
Ed1. Addition
Although it does not mention “dereferencing”, an interesting read is Ritchie's The Development of the C Language.
Here too the term “indirection” is used consistently, and the connections between the languages are described in some detail. The use of the term is thus interesting in view of e.g. papers like the 1974 one mentioned above.
As an example of indirection as a concept, and of the syntax, read e.g. pp. 12 ff.
    An accident of syntax contributed to the perceived complexity of the language. The indirection operator, spelled * in C, is syntactically a unary prefix operator, just as in BCPL and B. This works well in simple expressions, but in more complex cases, parentheses are required to direct the parsing.
[…]
There are two effects occurring. Most important, C has a relatively rich set of ways of describing types (compared, say, with Pascal). Declarations in languages as expressive as C – Algol 68, for example – describe objects equally hard to understand, simply because the objects themselves are complex. A second effect owes to details of the syntax. Declarations in C must be read in an ‘inside-out’ style that many find difficult to grasp [Anderson 80].
In this connection it is probably also worth mentioning ANSI C89 and passages like:
3.1.2.5 Types
A pointer to void may not be dereferenced, although such a pointer may be converted to a normal pointer type which may be dereferenced.
Draft ANSI C Standard (ANSI X3J11/88-090), (Courtesy of Wikipedia)
Rationale for American National Standard for Information Systems – Programming Language – C
Among the invalid values for dereferencing a pointer by the unary * operator are
a null pointer, an address inappropriately aligned for the type of
object pointed to, or the address of an object that has automatic
storage duration when execution of the block in which the object is
declared and of all enclosed blocks has terminated.
(I have to re-read some of these documents now.)
Because in the good old days of K&R C, the language only passed parameters by value. So pointers were used to simulate passing parameters by reference. And people (incorrectly) spoke of taking a reference to a variable when constructing a pointer to that variable.
Dereferencing a pointer was then the opposite operation.
Now C++ has true references that are distinct from pointers, but the word dereference is still used (even if it is not really correct).
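To make the by-value point concrete, here is a small sketch. The names swap_by_value and swap_ints are hypothetical and chosen only for this illustration. The first function receives copies, so the caller sees no change; the second receives pointers and has to apply * to reach the caller's variables.

```c
/* Receives copies of the arguments: the exchange is invisible to the caller. */
void swap_by_value(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

/* Receives pointers: each *a and *b accesses the caller's own variable,
   which is how "pass by reference" is simulated in C. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

Given int x = 1, y = 2, the call swap_by_value(x, y) leaves both unchanged, while swap_ints(&x, &y) exchanges them.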
I do not know the exact etymology, but one can consider a pointer value (in the generic sense, not the C/C++-specific meaning) as "referencing" another object in memory; that is, p refers to x. When we use p to obtain the value stored in x, we are bypassing that reference, or de-referencing p.
Kernighan and Ritchie, The C Programming Language, 2nd ed., 5.1:
The unary operator * is the indirection or dereferencing operator; [...] ''pointer to void'' is used to hold any type of pointer but cannot be dereferenced itself.
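The remark about pointers to void can be sketched as follows, assuming the pointer really does refer to an int. The helper name read_int is hypothetical and is not taken from K&R or the standard.

```c
/* Reads an int through a generic pointer.
   Assumption: p actually points to an int object. */
int read_int(const void *p)
{
    /* Writing *p here would not compile: void is an incomplete type,
       so a pointer to void cannot be dereferenced directly. */
    const int *ip = p;  /* void * converts implicitly to an object pointer in C */
    return *ip;         /* the converted pointer may be dereferenced */
}
```

A call such as read_int(&n), with n an int, returns the value of n.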

How to specify a number range as a type in Idris?

I've been experimenting with Idris and it seems like it should be simple to specify some sort of type for representing all numbers between two different numbers, e.g. NumRange 5 10 is the type of all numbers between 5 and 10. I'd like to include doubles/floats, but a type for doing the same with integers would be equally useful. How would I go about doing this?
In practice, you may do better to simply check the bounds as needed, but you can certainly write a data type to enforce such a property.
One straightforward way to do it is like this:
data Range : Ord a => a -> a -> Type where
  MkRange : Ord a => (x,y,z : a) -> (x >= y && (x <= z) = True) -> Range y z
I've written it generically over the Ord typeclass, though you may need to specialize it. The range requirement is expressed as an equation, so you simply supply a Refl when constructing it, and the property will then be checked. For example: MkRange 3 0 10 Refl : Range 0 10. One disadvantage of something like this is the inconvenience of having to extract the contained value. And of course if you want to construct an instance programmatically you'll need to supply the proofs that the bounds are indeed satisfied, or else do it in some context that allows for failure, like Maybe.
We can write a more elegant example for Nats without much trouble, since for them we already have a library data type to represent comparison proofs. In particular LTE, representing less-than-or-equal-to.
data InRange : Nat -> Nat -> Type where
  IsInRange : (x : Nat) -> LTE n x -> LTE x m -> InRange n m
Now this data type nicely encapsulates a proof that n ≤ x ≤ m. It would be overkill for many casual applications, but it certainly shows how you might use dependent types for this purpose.

How to use SmallCheck in Haskell?

I am trying to use SmallCheck to test a Haskell program, but I cannot understand how to use the library to test my own data types. Apparently, I need to use the Test.SmallCheck.Series. However, I find the documentation for it extremely confusing. I am interested in both cookbook-style solutions and an understandable explanation of the logical (monadic?) structure. Here are some questions I have (all related):
If I have a data type data Person = SnowWhite | Dwarf Integer, how do I explain to smallCheck that the valid values are Dwarf 1 through Dwarf 7 (or SnowWhite)? What if I have a complicated FairyTale data structure and a constructor makeTale :: [Person] -> FairyTale, and I want smallCheck to make FairyTale-s from lists of Person-s using the constructor?
I managed to make quickCheck work like this without getting my hands too dirty by using judicious applications of Control.Monad.liftM to functions like makeTale. I couldn't figure out a way to do this with smallCheck (please explain it to me!).
What is the relationship between the types Serial, Series, etc.?
(optional) What is the point of coSeries? How do I use the Positive type from SmallCheck.Series?
(optional) Any elucidation of what is the logic behind what should be a monadic expression, and what is just a regular function, in the context of smallCheck, would be appreciated.
If there is any intro/tutorial to using smallCheck, I'd appreciate a link. Thank you very much!
UPDATE: I should add that the most useful and readable documentation I found for smallCheck is this paper (PDF). I could not find the answer to my questions there on the first look; it is more of a persuasive advertisement than a tutorial.
UPDATE 2: I moved my question about the weird Identity that shows up in the type of Test.SmallCheck.list and other places to a separate question.
NOTE: This answer describes pre-1.0 versions of SmallCheck. See this blog post for the important differences between SmallCheck 0.6 and 1.0.
SmallCheck is like QuickCheck in that it tests a property over some part of the space of possible values. The difference is that it tries to exhaustively enumerate a series of all the "small" values instead of an arbitrary subset of smallish values.
As I hinted, SmallCheck's Serial is like QuickCheck's Arbitrary.
Now Serial is pretty simple: a Serial type a has a way (series) to generate a Series, which is just a function Depth -> [a]. Or, to unpack that, Serial objects are objects for which we know how to enumerate some "small" values. We are also given a Depth parameter which controls how many small values we should generate, but let's ignore it for a minute.
instance Serial Bool where series _ = [False, True]
instance Serial Char where series _ = "abcdefghijklmnopqrstuvwxyz"
instance Serial a => Serial (Maybe a) where
  series d = Nothing : map Just (series d)
In these cases we're doing nothing more than ignoring the Depth parameter and then enumerating "all" possible values for each type. We can even do this automatically for some types
instance (Enum a, Bounded a) => Serial a where series _ = [minBound .. maxBound]
This is a really simple way of testing properties exhaustively—literally test every single possible input! Obviously there are at least two major pitfalls, though: (1) infinite data types will lead to infinite loops when testing and (2) nested types lead to exponentially larger spaces of examples to look through. In both cases, SmallCheck gets really large really quickly.
So that's the point of the Depth parameter—it lets the system ask us to keep our Series small. From the documentation, Depth is the
Maximum depth of generated test values
For data values, it is the depth of nested constructor applications.
For functional values, it is both the depth of nested case analysis and the depth of results.
so let's rework our examples to keep them Small.
instance Serial Bool where
  series 0 = []
  series 1 = [False]
  series _ = [False, True]
instance Serial Char where
  series d = take d "abcdefghijklmnopqrstuvwxyz"
instance Serial a => Serial (Maybe a) where
  -- we shrink d by one since we're adding Nothing
  series d = Nothing : map Just (series (d-1))
instance (Enum a, Bounded a) => Serial a where series d = take d [minBound .. maxBound]
Much better.
So what's coseries? Like coarbitrary in the Arbitrary typeclass of QuickCheck, it lets us build a series of "small" functions. Note that we're writing the instance over the input type; the result type is handed to us in another Serial argument (which I call results below).
instance Serial Bool where
  coseries results d = [ \cond -> if cond then r1 else r2
                       | r1 <- results d
                       , r2 <- results d ]
These take a little more ingenuity to write, so I'll instead refer you to the alts methods, which I describe briefly below.
So how can we make some Series of Persons? This part is easy
instance Serial Person where
  series d = SnowWhite : take (d-1) (map Dwarf [1..7])
...
But our coseries function needs to generate every possible function from Persons to something else. This can be done using the altsN series of functions provided by SmallCheck. Here's one way to write it
  coseries results d = [ \person ->
                           case person of
                             SnowWhite -> f 0
                             Dwarf n   -> f n
                       | f <- alts1 results d ]
The basic idea is that altsN results generates a Series of N-ary functions from N values with Serial instances to the Serial instance of results. So we use it to create a function from [0..7], a previously defined Serial value, to whatever we need, then we map our Persons to numbers and pass 'em in.
So now that we have a Serial instance for Person, we can use it to build more complex nested Serial instances. For "instance", if FairyTale is a list of Persons, we can use the Serial a => Serial [a] instance alongside our Serial Person instance to easily create a Serial FairyTale:
instance Serial FairyTale where
  series = map makeFairyTale . series
  coseries results = map (makeFairyTale .) . coseries results
(the (makeFairyTale .) composes makeFairyTale with each function coseries generates, which is a little confusing)
If I have a data type data Person = SnowWhite | Dwarf Integer, how do I explain to smallCheck that the valid values are Dwarf 1 through Dwarf 7 (or SnowWhite)?
First of all, you need to decide which values you want to generate for each depth. There's no single right answer here, it depends on how fine-grained you want your search space to be.
Here are just two possible options:
people d = SnowWhite : map Dwarf [1..7] (doesn't depend on the depth)
people d = take d $ SnowWhite : map Dwarf [1..7] (each unit of depth increases the search space by one element)
After you've decided on that, your Serial instance is as simple as
instance Serial m Person where
  series = generate people
We left m polymorphic here as we don't require any specific structure of the underlying monad.
What if I have a complicated FairyTale data structure and a constructor makeTale :: [Person] -> FairyTale, and I want smallCheck to make FairyTale-s from lists of Person-s using the constructor?
Use cons1:
instance Serial m FairyTale where
  series = cons1 makeTale
What is the relationship between the types Serial, Series, etc.?
Serial is a type class; Series is a type. You can have multiple Series of the same type — they correspond to different ways to enumerate values of that type. However, it may be arduous to specify for each value how it should be generated. The Serial class lets us specify a good default for generating values of a particular type.
The definition of Serial is
class Monad m => Serial m a where
  series :: Series m a
So all it does is assign a particular Series m a to a given combination of m and a.
What is the point of coseries?
It is needed to generate values of functional types.
How do I use the Positive type from SmallCheck.Series?
For example, like this:
> smallCheck 10 $ \n -> n^3 >= (n :: Integer)
Failed test no. 5.
there exists -2 such that
condition is false
> smallCheck 10 $ \(Positive n) -> n^3 >= (n :: Integer)
Completed 10 tests without failure.
Any elucidation of what is the logic behind what should be a monadic expression, and what is just a regular function, in the context of smallCheck, would be appreciated.
When you are writing a Serial instance (or any Series expression), you work in the Series m monad.
When you are writing tests, you work with simple functions that return Bool or Property m.
While I think that @tel's answer is an excellent explanation (and I wish smallCheck actually worked the way he describes), the code he provides does not work for me (with smallCheck version 1). I managed to get the following to work...
UPDATE / WARNING: The code below is wrong for a rather subtle reason. For the corrected version, and details, please see this answer to the question mentioned below. The short version is that instead of instance Serial Identity Person one must write instance (Monad m) => Serial m Person.
... but I find the use of Control.Monad.Identity and all the compiler flags bizarre, and I have asked a separate question about that.
Note also that while Series Person (or actually Series Identity Person) is not exactly the same as functions Depth -> [Person] (see @tel's answer), the function generate :: (Depth -> [a]) -> Series m a converts between them.
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses, FlexibleContexts, UndecidableInstances #-}
import Test.SmallCheck
import Test.SmallCheck.Series
import Control.Monad.Identity

data Person = SnowWhite | Dwarf Int

instance Serial Identity Person where
  series = generate (\d -> SnowWhite : take (d-1) (map Dwarf [1..7]))

How to deal with this error?

I'm dealing with very long lists, and large trees.
Sometimes I would find this error:
surgery a;;
Characters 8-9:
surgery a;;
^
Error: This expression has type int t/1044
but an expression was expected of type 'a t/1810
# type 'a t = | Leaf of ('a -> 'a -> int)
| Node of 'a * 'a t * 'a t * ('a -> 'a -> int)
I'm not sure what kind of error this is, but I guess it's some kind of overflow. The types appear to match, but there are large numbers after the slash that follows the type name, in this case 1044 and 1810.
This time I have run some code before surgery a. If I kill the current top-level and start over, surgery a would run.
My questions are:
1. What is this error exactly?
2. When and how does it occur?
3. Why does rerunning it from a new top-level make it work?
4. How should I deal with it?
This is a type error, not a runtime error. It does not "cost" anything and is not in any way related to the size of the structures you have in memory.
It happens if you're not careful in the toplevel, and mix two different types with the same name. Compare:
type t = int;;
let f (x : t) = ();;
type u = bool;;
let g (y : u) = f y;;
^
Error: This expression has type u = bool
but an expression was expected of type t = int
with
type t = int;;
let f (x : t) = ();;
type t = bool;;
let g (y : t) = f y;;
^
Error: This expression has type t/1047 = bool
but an expression was expected of type t/1044 = int
This is the exact same typing error in both cases: you mixed different types. But in the second case, both have the same name t. The type system tries to be helpful and tells you about the unique integers it internally assigns to names, to make sure they are really unique throughout the program.
This kind of error cannot happen outside the toplevel (when compiling a program the usual way), as it is not possible to define two types with the same name at the exact same path.
How to fix it: if you redefine a type with a new definition that is not equivalent to the previous one, you must be careful to also redefine the operations on this previous type previously recorded in the toplevel. Indeed, they are still typed as expecting the old type, and using them with the new type will result in such errors.