Preventing FsCheck from generating NaN and infinities - testing

I have a deeply nested datastructure with floats all over the place.
I'm using FsCheck to check if the data is unchanged after serializing and then deserializing.
This property fails, when a float is either NaN or +/- infinity, however, such a case doesn't interest me, since I don't expect these values to occur in the actual data.
Is there a way to prevent FsCheck from generating NaN and infinities?
I have tried discarding generated data that contains said values, but this makes the test incredibly slow, so slow in fact, that the test is still running while I'm writing this, and I have my doubts it will actually finish...

For reflectively generated types that contain floats (as I suspect you're using) you can overwrite the default generator for floats by writing a class as follows:
type Overrides() =
static member Float() =
Arb.Default.Float()
|> filter (fun f -> not <| System.Double.IsNaN(f) &&
not <| System.Double.IsInfinity(f))
And then calling:
Arb.register<Overrides>()
Before FsCheck tries to generate the types; e.g. in your test setup or before calling Check.Quick.
You can check the result of the register method to see how it merged the default arbitrary instances with the new ones; it should have overridden them.
If you are using the xUnit extension you can avoid calling the Arb.register by using the Arbitraries argument of PropertyAttribute:
[<Property(Arbitraries=Overides)>]

As Mauricio Scheffer said, you can use NormalFloat type in test parameter.
Simple example for list of floats:
open FsCheck
let f (x : float list) = x |> List.map id
let propFloat (x : float list) = x = (f x)
let propNormalFloat (xn : NormalFloat list) =
let x = xn |> List.map NormalFloat.get
x = f x
Check.Quick propFloat
//Falsifiable, after 18 tests (13 shrinks) (StdGen (761688149,295892075)):
//[nan]
Check.Quick propNormalFloat
//Ok, passed 100 tests.

Related

How to define a polymorphic `Vector` type in Lean 4 as a `Subtype` of `List`?

The List type in Lean 4's prelude has a lot of nice goodies implemented, e.g. List.map, List.join, etc.
A classic example in dependently-typed languages is Vector a n where a is the type of the elements of the container, and n is the length. This allows you to do nice things like write a function concat (u : Vector a m) (v : Vector a n) : Vector a (m + n) whose type signature guarantees that the function returns a vector of the expected length.
Following a comment on a previous answer I'd like to write a Vector type as a Subtype of List, as a first step towards getting these methods for a sized container. But I'm new to Lean, and I don't exactly understand how this works. Subtype takes a predicate a -> Prop, which appears to work a bit like a filter on the parent type, which in this case is a := List. But all lists are valid vectors, so it doesn't seem like this is much use. The predicate should always return true, no?
My question is:
How can I use Subtype to get a subtype of List whose length is encoded in its type? And as a follow-up, how can I construct an element of this type from an existing List?
You're right that all lists are valid vectors, but the predicate is exactly how to specify that the Vector has a certain number of elements.
def Vec (n : Nat) (a : Type) := { ls : List a // ls.length = n }
def myVec : Vec 3 Nat := ⟨[1, 2, 3], rfl⟩

Where is the Idris == operator useful?

As a beginner in type-driven programming, I'm curious about the use of the == operator. Examples demonstrate that it's not sufficient to prove equality between two values of a certain type, and special equality checking types are introduced for the particular data types. In that case, where is == useful at all?
(==) (as the single constituent function of the Eq interface) is a function from a type T to Bool, and is good for equational reasoning. Whereas x = y (where x : T and y : T) AKA "intensional equality" is itself a type and therefore a proposition. You can and often will want to bounce back and forth between the two different ways of expressing equality for a particular type.
x == y = True is also a proposition, and is often an intermediate step between reasoning about (==) and reasoning about =.
The exact relationship between the two types of equality is rather complex, and you can read https://github.com/pdorrell/learning-idris/blob/9d3454a77f6e21cd476bd17c0bfd2a8a41f382b7/finished/EqFromEquality.idr for my own attempt to understand some aspects of it. (One thing to note is that even though an inductively defined type will have decideable intensional equality, you still have to go through a few hoops to prove that, and a few more hoops to define a corresponding implementation of Eq.)
One particular handy code snippet is this:
-- for rel x y, provide both the computed value, and the proposition that it is equal to the value (as a dependent pair)
has_value_dpair : (rel : t -> t -> Bool) -> (x : t) -> (y : t) -> (value: Bool ** rel x y = value)
has_value_dpair rel x y = (rel x y ** Refl)
You can use it with the with construct when you have a value returned from rel x y and you want to reason about the proposition rel x y = True or rel x y = False (and rel is some function that might represent a notion of equality between x and y).
(In this answer I assume the case where (==) corresponds to =, but you are entirely free to define a (==) function that doesn't correspond to =, eg when defining a Setoid. So that's another reason to use (==) instead of =.)
You still need good old equality because sometimes you can't prove things. Sometimes you don't even need to prove. Consider next example:
countEquals : Eq a => a -> List a -> Nat
countEquals x = length . filter (== x)
You might want to just count number of equal elements to show some statistics to user. Another example: tests. Yes, even with strong type system and dependent types you might want to perform good old unit tests. So you want to check for expectations and this is rather convenient to do with (==) operator.
I'm not going to write full list of cases where you might need (==). Equality operator is not enough for proving but you don't always need proofs.

How to modify a List element at a given index

What I've done (with some help from a friend) is create a function that takes a List, Int for the index, and a function to be applied to the element at the specified index. It's similar to Map but instead of applying a function to every element, it applies it to only one element.
So my questions are:
Does this function already exist in the core somewhere? We couldn't find it.
If not, is there a better way of accomplishing this than how we have done it?
Here's the code:
import Html exposing (text)
main =
let
m = {arr=[1,5,3], msg=""}
in
text (toString (getDisplay m 4 (\x -> x + 5)))
type alias Model =
{ arr : List (Int)
, msg : String
}
getDisplay : Model -> Int -> (Int -> Int) -> Model
getDisplay model i f =
let
m = (changeAt model.arr i f)
in
case m of
Ok val ->
{model | arr = val, msg = ""}
Err err ->
{model | arr = [], msg = err}
changeAt : List a -> Int -> (a -> a) -> Result String (List a)
changeAt l i func =
let
f j x = if j==i then func x else x
in
if i < (List.length l) && i >= 0 then
Ok(List.indexedMap f l)
else
Err "Bad index"
NOTE: Elm discourages indexing Lists, as they are linked lists under the hood: to retrieve the 1001th element, you have to first visit all 1000 previous elements. Nonetheless, if you wanted to do it, this is one way.
List.indexedMap is a good way to do what you're describing.
However, since you mention the downside of having to visit all preceding elements in a list, the reality in your example is actually a little worse, if indeed you are super worried about performance.
Your list is actually traversed fully at least two times, regardless of whether the index exists or not. The simple act of asking for the length of a linked list has to traverse the entire list. Check out the source code, length is implemented in terms of a foldl.
Furthermore, List.indexedMap traverses the entire list at least once. I say, at least once, since the source of indexedMap also calls the length function in addition to using map. If we're lucky, the length call is memoized (I'm not familiar enough with Elm internals to know whether it is or not, hence the at least comment). The map itself traverses the entire list when called, unlike Haskell which evaluates things lazily, only as much as necessary.
And if you use indexedMap, the whole list is indexed regardless of the position you are interested in. That is, even if you want to apply the function at index zero, the entire list is indexed.
If you actually want to reduce the number of traversals to a minimum, you're going to (at this time) have to implement your own function and you'll have to do it without relying on length or indexedMap.
Here is an example of a changeAt function which avoids unnecessary traversals and if it finds the position, it stops traversing the list.
changeAt : List a -> Int -> (a -> a) -> Result String (List a)
changeAt l i func =
if i < 0 then
Err "Bad Index"
else
case l of
[] ->
Err "Not found"
(x::xs) ->
if i == 0 then
Ok <| func x :: xs
else
Result.map ((::) x) <| changeAt xs (i - 1) func
It's not terribly pretty, but if you want to avoid unnecessarily walking through the list - multiple times - then you might want to go with something like this.
You're looking for the set function for Arrays. Instead of using a List, which is inefficient as you described, this structure is better suited to your use case.
Here's an efficient implementation of the function you're looking for:
changeAt : Int -> (a -> a) -> Array a -> Array a
changeAt i f array =
case get i array of
Just item ->
set i (f item) array
Nothing ->
array
Also note that the data structure is the last argument in this implementation.
Array is mentioned in the link in your question, but nobody on this thread had explicitly mentioned this option yet.

How to "read" Elm's -> operator

I'm really loving Elm, up to the point where I encounter a function I've never seen before and want to understand its inputs and outputs.
Take the declaration of foldl for example:
foldl : (a -> b -> b) -> b -> List a -> b
I look at this and can't help feeling as if there's a set of parentheses that I'm missing, or some other subtlety about the associativity of this operator (for which I can't find any explicit documentation). Perhaps it's just a matter of using the language more until I just get a "feel" for it, but I'd like to think there's a way to "read" this definition in English.
Looking at the example from the docs…
foldl (::) [] [1,2,3] == [3,2,1]
I expect the function signature to read something like this:
Given a function that takes an a and a b and returns a b, an additional b, and a List, foldl returns a b.
Is that correct?
What advice can you give to someone like me who desperately wants inputs to be comma-delineated and inputs/outputs to be separated more clearly?
Short answer
The missing parentheses you're looking for are due to the fact that -> is right-associative: the type (a -> b -> b) -> b -> List a -> b is equivalent to (a -> b -> b) -> (b -> (List a -> b)). Informally, in a chain of ->s, read everything before the last -> as an argument and only the rightmost thing as a result.
Long answer
The key insight you may be missing is currying -- the idea that if you have a function that takes two arguments, you can represent it with a function that takes the first argument and returns a function that takes the second argument and then returns the result.
For instance, suppose you have a function add that takes two integers and adds them together. In Elm, you could write a function that takes both elements as a tuple and adds them:
add : (Int, Int) -> Int
add (x, y) = x+y
and you could call it as
add (1, 2) -- evaluates to 3
But suppose you didn't have tuples. You might think that there would be no way to write this function, but in fact using currying you could write it as:
add : Int -> (Int -> Int)
add x =
let addx : Int -> Int
addx y = x+y
in
addx
That is, you write a function that takes x and returns another function that takes y and adds it to the original x. You could call it with
((add 1) 2) -- evaluates to 3
You can now think of add in two ways: either as a function that takes an x and a y and adds them, or as a "factory" function that takes x values and produces new, specialized addx functions that take just one argument and add it to x.
The "factory" way of thinking about things comes in handy every once in a while. For instance, if you have a list of numbers called numbers and you want to add 3 to each number, you can just call List.map (add 3) numbers; if you'd written the tuple version instead you'd have to write something like List.map (\y -> add (3,y)) numbers which is a bit more awkward.
Elm comes from a tradition of programming languages that really like this way of thinking about functions and encourage it where possible, so Elm's syntax for functions is designed to make it easy. To that end, -> is right-associative: a -> b -> c is equivalent to a -> (b -> c). This means if you don't parenthesize, what you're defining is a function that takes an a and returns a b -> c, which again we can think of either as a function that takes an a and a b and returns a c, or equivalently a function that takes an a and returns a b -> c.
There's another syntactic nicety that helps call these functions: function application is left-associative. That way, the ugly ((add 1) 2) from above can be written as add 1 2. With that syntax tweak, you don't have to think about currying at all unless you want to partially apply a function -- just call it with all the arguments and the syntax will work out.

Test.QuickCheck: speed up testing multiple properties for the same type

I am testing a random generator generating instances of my own type. For that I have a custom instance of Arbitrary:
complexGenerator :: (RandomGen g) => g -> (MyType, g)
instance Arbitrary MyType where
arbitrary = liftM (fst . complexGenerator . mkStdGen) arbitrary
This works well with Test.QuickCheck (actually, Test.Framework) for testing that the generated values hold certain properties. However, there are quite a few properties I want to check, and the more I add, the more time it takes to verify them all.
Is there a way to use the same generated values for testing every property, instead of generating them anew each time? I obviously still want to see, on failures, which property did not hold, so making one giant property with and is not optimal.
I obviously still want to see, on failures, which property did not hold, so making one giant property with and is not optimal.
You could label each property using printTestCase before making a giant property with conjoin.
e.g. you were thinking this would be a bad idea:
prop_giant :: MyType -> Bool
prop_giant x = and [prop_one x, prop_two x, prop_three x]
this would be as efficient yet give you better output:
prop_giant :: MyType -> Property
prop_giant x = conjoin [printTestCase "one" $ prop_one x,
printTestCase "two" $ prop_two x,
printTestCase "three" $ prop_three x]
(Having said that, I've never used this method myself and am only assuming it will work; conjoin is probably marked as experimental in the documentation for a reason.)
In combination with the voted answer, what I've found helpful is using a Reader transformer with the Writer monad:
type Predicate r = ReaderT r (Writer String) Bool
The Reader "shared environment" is the tested input in this case. Then you can compose properties like this:
inv_even :: Predicate Int
inv_even = do
lift . tell $ "this is the even invariant"
(==) 0 . flip mod 2 <$> ask
toLabeledProp :: r -> Predicate r -> Property
toLabeledProp cause r =
let (effect, msg) = runWriter . (runReaderT r) $ cause in
printTestCase ("inv: " ++ msg) . property $ effect
and combining:
fromPredicates :: [Predicate r] -> r -> Property
fromPredicates predicates cause =
conjoin . map (toLabeledProp cause) $ predicates
I suspect there is another approach involving something similar to Either or a WriterT here- which would concisely compose predicates on different types into one result. But at the least, this allows for documenting properties which impose different post-conditions dependent on the the value of the input.
Edit: This idea spawned a library:
http://github.com/jfeltz/quickcheck-property-comb