Test.QuickCheck: speed up testing multiple properties for the same type - testing

I am testing a random generator generating instances of my own type. For that I have a custom instance of Arbitrary:
complexGenerator :: (RandomGen g) => g -> (MyType, g)
instance Arbitrary MyType where
arbitrary = liftM (fst . complexGenerator . mkStdGen) arbitrary
This works well with Test.QuickCheck (actually, Test.Framework) for testing that the generated values hold certain properties. However, there are quite a few properties I want to check, and the more I add, the more time it takes to verify them all.
Is there a way to use the same generated values for testing every property, instead of generating them anew each time? I obviously still want to see, on failures, which property did not hold, so making one giant property with and is not optimal.

I obviously still want to see, on failures, which property did not hold, so making one giant property with and is not optimal.
You could label each property using printTestCase before making a giant property with conjoin.
e.g. you were thinking this would be a bad idea:
prop_giant :: MyType -> Bool
prop_giant x = and [prop_one x, prop_two x, prop_three x]
this would be as efficient yet give you better output:
prop_giant :: MyType -> Property
prop_giant x = conjoin [printTestCase "one" $ prop_one x,
printTestCase "two" $ prop_two x,
printTestCase "three" $ prop_three x]
(Having said that, I've never used this method myself and am only assuming it will work; conjoin is probably marked as experimental in the documentation for a reason.)

In combination with the voted answer, what I've found helpful is using a Reader transformer with the Writer monad:
type Predicate r = ReaderT r (Writer String) Bool
The Reader "shared environment" is the tested input in this case. Then you can compose properties like this:
inv_even :: Predicate Int
inv_even = do
lift . tell $ "this is the even invariant"
(==) 0 . flip mod 2 <$> ask
toLabeledProp :: r -> Predicate r -> Property
toLabeledProp cause r =
let (effect, msg) = runWriter . (runReaderT r) $ cause in
printTestCase ("inv: " ++ msg) . property $ effect
and combining:
fromPredicates :: [Predicate r] -> r -> Property
fromPredicates predicates cause =
conjoin . map (toLabeledProp cause) $ predicates
I suspect there is another approach involving something similar to Either or a WriterT here- which would concisely compose predicates on different types into one result. But at the least, this allows for documenting properties which impose different post-conditions dependent on the the value of the input.
Edit: This idea spawned a library:
http://github.com/jfeltz/quickcheck-property-comb

Related

How to modify a List element at a given index

What I've done (with some help from a friend) is create a function that takes a List, Int for the index, and a function to be applied to the element at the specified index. It's similar to Map but instead of applying a function to every element, it applies it to only one element.
So my questions are:
Does this function already exist in the core somewhere? We couldn't find it.
If not, is there a better way of accomplishing this than how we have done it?
Here's the code:
import Html exposing (text)
main =
let
m = {arr=[1,5,3], msg=""}
in
text (toString (getDisplay m 4 (\x -> x + 5)))
type alias Model =
{ arr : List (Int)
, msg : String
}
getDisplay : Model -> Int -> (Int -> Int) -> Model
getDisplay model i f =
let
m = (changeAt model.arr i f)
in
case m of
Ok val ->
{model | arr = val, msg = ""}
Err err ->
{model | arr = [], msg = err}
changeAt : List a -> Int -> (a -> a) -> Result String (List a)
changeAt l i func =
let
f j x = if j==i then func x else x
in
if i < (List.length l) && i >= 0 then
Ok(List.indexedMap f l)
else
Err "Bad index"
NOTE: Elm discourages indexing Lists, as they are linked lists under the hood: to retrieve the 1001th element, you have to first visit all 1000 previous elements. Nonetheless, if you wanted to do it, this is one way.
List.indexedMap is a good way to do what you're describing.
However, since you mention the downside of having to visit all preceding elements in a list, the reality in your example is actually a little worse, if indeed you are super worried about performance.
Your list is actually traversed fully at least two times, regardless of whether the index exists or not. The simple act of asking for the length of a linked list has to traverse the entire list. Check out the source code, length is implemented in terms of a foldl.
Furthermore, List.indexedMap traverses the entire list at least once. I say, at least once, since the source of indexedMap also calls the length function in addition to using map. If we're lucky, the length call is memoized (I'm not familiar enough with Elm internals to know whether it is or not, hence the at least comment). The map itself traverses the entire list when called, unlike Haskell which evaluates things lazily, only as much as necessary.
And if you use indexedMap, the whole list is indexed regardless of the position you are interested in. That is, even if you want to apply the function at index zero, the entire list is indexed.
If you actually want to reduce the number of traversals to a minimum, you're going to (at this time) have to implement your own function and you'll have to do it without relying on length or indexedMap.
Here is an example of a changeAt function which avoids unnecessary traversals and if it finds the position, it stops traversing the list.
changeAt : List a -> Int -> (a -> a) -> Result String (List a)
changeAt l i func =
if i < 0 then
Err "Bad Index"
else
case l of
[] ->
Err "Not found"
(x::xs) ->
if i == 0 then
Ok <| func x :: xs
else
Result.map ((::) x) <| changeAt xs (i - 1) func
It's not terribly pretty, but if you want to avoid unnecessarily walking through the list - multiple times - then you might want to go with something like this.
You're looking for the set function for Arrays. Instead of using a List, which is inefficient as you described, this structure is better suited to your use case.
Here's an efficient implementation of the function you're looking for:
changeAt : Int -> (a -> a) -> Array a -> Array a
changeAt i f array =
case get i array of
Just item ->
set i (f item) array
Nothing ->
array
Also note that the data structure is the last argument in this implementation.
Array is mentioned in the link in your question, but nobody on this thread had explicitly mentioned this option yet.

Preventing FsCheck from generating NaN and infinities

I have a deeply nested datastructure with floats all over the place.
I'm using FsCheck to check if the data is unchanged after serializing and then deserializing.
This property fails, when a float is either NaN or +/- infinity, however, such a case doesn't interest me, since I don't expect these values to occur in the actual data.
Is there a way to prevent FsCheck from generating NaN and infinities?
I have tried discarding generated data that contains said values, but this makes the test incredibly slow, so slow in fact, that the test is still running while I'm writing this, and I have my doubts it will actually finish...
For reflectively generated types that contain floats (as I suspect you're using) you can overwrite the default generator for floats by writing a class as follows:
type Overrides() =
static member Float() =
Arb.Default.Float()
|> filter (fun f -> not <| System.Double.IsNaN(f) &&
not <| System.Double.IsInfinity(f))
And then calling:
Arb.register<Overrides>()
Before FsCheck tries to generate the types; e.g. in your test setup or before calling Check.Quick.
You can check the result of the register method to see how it merged the default arbitrary instances with the new ones; it should have overridden them.
If you are using the xUnit extension you can avoid calling the Arb.register by using the Arbitraries argument of PropertyAttribute:
[<Property(Arbitraries=Overides)>]
As Mauricio Scheffer said, you can use NormalFloat type in test parameter.
Simple example for list of floats:
open FsCheck
let f (x : float list) = x |> List.map id
let propFloat (x : float list) = x = (f x)
let propNormalFloat (xn : NormalFloat list) =
let x = xn |> List.map NormalFloat.get
x = f x
Check.Quick propFloat
//Falsifiable, after 18 tests (13 shrinks) (StdGen (761688149,295892075)):
//[nan]
Check.Quick propNormalFloat
//Ok, passed 100 tests.

How to use SmallCheck in Haskell?

I am trying to use SmallCheck to test a Haskell program, but I cannot understand how to use the library to test my own data types. Apparently, I need to use the Test.SmallCheck.Series. However, I find the documentation for it extremely confusing. I am interested in both cookbook-style solutions and an understandable explanation of the logical (monadic?) structure. Here are some questions I have (all related):
If I have a data type data Person = SnowWhite | Dwarf Integer, how do I explain to smallCheck that the valid values are Dwarf 1 through Dwarf 7 (or SnowWhite)? What if I have a complicated FairyTale data structure and a constructor makeTale :: [Person] -> FairyTale, and I want smallCheck to make FairyTale-s from lists of Person-s using the constructor?
I managed to make quickCheck work like this without getting my hands too dirty by using judicious applications of Control.Monad.liftM to functions like makeTale. I couldn't figure out a way to do this with smallCheck (please explain it to me!).
What is the relationship between the types Serial, Series, etc.?
(optional) What is the point of coSeries? How do I use the Positive type from SmallCheck.Series?
(optional) Any elucidation of what is the logic behind what should be a monadic expression, and what is just a regular function, in the context of smallCheck, would be appreciated.
If there is there any intro/tutorial to using smallCheck, I'd appreciate a link. Thank you very much!
UPDATE: I should add that the most useful and readable documentation I found for smallCheck is this paper (PDF). I could not find the answer to my questions there on the first look; it is more of a persuasive advertisement than a tutorial.
UPDATE 2: I moved my question about the weird Identity that shows up in the type of Test.SmallCheck.list and other places to a separate question.
NOTE: This answer describes pre-1.0 versions of SmallCheck. See this blog post for the important differences between SmallCheck 0.6 and 1.0.
SmallCheck is like QuickCheck in that it tests a property over some part of the space of possible types. The difference is that it tries to exhaustively enumerate a series all of the "small" values instead of an arbitrary subset of smallish values.
As I hinted, SmallCheck's Serial is like QuickCheck's Arbitrary.
Now Serial is pretty simple: a Serial type a has a way (series) to generate a Series type which is just a function from Depth -> [a]. Or, to unpack that, Serial objects are objects we know how to enumerate some "small" values of. We are also given a Depth parameter which controls how many small values we should generate, but let's ignore it for a minute.
instance Serial Bool where series _ = [False, True]
instance Serial Char where series _ = "abcdefghijklmnopqrstuvwxyz"
instance Serial a => Serial (Maybe a) where
series d = Nothing : map Just (series d)
In these cases we're doing nothing more than ignoring the Depth parameter and then enumerating "all" possible values for each type. We can even do this automatically for some types
instance (Enum a, Bounded a) => Serial a where series _ = [minBound .. maxBound]
This is a really simple way of testing properties exhaustively—literally test every single possible input! Obviously there are at least two major pitfalls, though: (1) infinite data types will lead to infinite loops when testing and (2) nested types lead to exponentially larger spaces of examples to look through. In both cases, SmallCheck gets really large really quickly.
So that's the point of the Depth parameter—it lets the system ask us to keep our Series small. From the documentation, Depth is the
Maximum depth of generated test values
For data values, it is the depth of nested constructor applications.
For functional values, it is both the depth of nested case analysis and the depth of results.
so let's rework our examples to keep them Small.
instance Serial Bool where
series 0 = []
series 1 = [False]
series _ = [False, True]
instance Serial Char where
series d = take d "abcdefghijklmnopqrstuvwxyz"
instance Serial a => Serial (Maybe a) where
-- we shrink d by one since we're adding Nothing
series d = Nothing : map Just (series (d-1))
instance (Enum a, Bounded a) => Serial a where series d = take d [minBound .. maxBound]
Much better.
So what's coseries? Like coarbitrary in the Arbitrary typeclass of QuickCheck, it lets us build a series of "small" functions. Note that we're writing the instance over the input type---the result type is handed to us in another Serial argument (that I'm below calling results).
instance Serial Bool where
coseries results d = [\cond -> if cond then r1 else r2 |
r1 <- results d
r2 <- results d]
these take a little more ingenuity to write and I'll actually refer you to use the alts methods which I'll describe briefly below.
So how can we make some Series of Persons? This part is easy
instance Series Person where
series d = SnowWhite : take (d-1) (map Dwarf [1..7])
...
But our coseries function needs to generate every possible function from Persons to something else. This can be done using the altsN series of functions provided by SmallCheck. Here's one way to write it
coseries results d = [\person ->
case person of
SnowWhite -> f 0
Dwarf n -> f n
| f <- alts1 results d ]
The basic idea is that altsN results generates a Series of N-ary function from N values with Serial instances to the Serial instance of Results. So we use it to create a function from [0..7], a previously defined Serial value, to whatever we need, then we map our Persons to numbers and pass 'em in.
So now that we have a Serial instance for Person, we can use it to build more complex nested Serial instances. For "instance", if FairyTale is a list of Persons, we can use the Serial a => Serial [a] instance alongside our Serial Person instance to easily create a Serial FairyTale:
instance Serial FairyTale where
series = map makeFairyTale . series
coseries results = map (makeFairyTale .) . coseries results
(the (makeFairyTale .) composes makeFairyTale with each function coseries generates, which is a little confusing)
If I have a data type data Person = SnowWhite | Dwarf Integer, how do I explain to smallCheck that the valid values are Dwarf 1 through Dwarf 7 (or SnowWhite)?
First of all, you need to decide which values you want to generate for each depth. There's no single right answer here, it depends on how fine-grained you want your search space to be.
Here are just two possible options:
people d = SnowWhite : map Dwarf [1..7] (doesn't depend on the depth)
people d = take d $ SnowWhite : map Dwarf [1..7] (each unit of depth increases the search space by one element)
After you've decided on that, your Serial instance is as simple as
instance Serial m Person where
series = generate people
We left m polymorphic here as we don't require any specific structure of the underlying monad.
What if I have a complicated FairyTale data structure and a constructor makeTale :: [Person] -> FairyTale, and I want smallCheck to make FairyTale-s from lists of Person-s using the constructor?
Use cons1:
instance Serial m FairyTale where
series = cons1 makeTale
What is the relationship between the types Serial, Series, etc.?
Serial is a type class; Series is a type. You can have multiple Series of the same type — they correspond to different ways to enumerate values of that type. However, it may be arduous to specify for each value how it should be generated. The Serial class lets us specify a good default for generating values of a particular type.
The definition of Serial is
class Monad m => Serial m a where
series :: Series m a
So all it does is assigning a particular Series m a to a given combination of m and a.
What is the point of coseries?
It is needed to generate values of functional types.
How do I use the Positive type from SmallCheck.Series?
For example, like this:
> smallCheck 10 $ \n -> n^3 >= (n :: Integer)
Failed test no. 5.
there exists -2 such that
condition is false
> smallCheck 10 $ \(Positive n) -> n^3 >= (n :: Integer)
Completed 10 tests without failure.
Any elucidation of what is the logic behind what should be a monadic expression, and what is just a regular function, in the context of smallCheck, would be appreciated.
When you are writing a Serial instance (or any Series expression), you work in the Series m monad.
When you are writing tests, you work with simple functions that return Bool or Property m.
While I think that #tel's answer is an excellent explanation (and I wish smallCheck actually worked the way he describes), the code he provides does not work for me (with smallCheck version 1). I managed to get the following to work...
UPDATE / WARNING: The code below is wrong for a rather subtle reason. For the corrected version, and details, please see this answer to the question mentioned below. The short version is that instead of instance Serial Identity Person one must write instance (Monad m) => Series m Person.
... but I find the use of Control.Monad.Identity and all the compiler flags bizarre, and I have asked a separate question about that.
Note also that while Series Person (or actually Series Identity Person) is not actually exactly the same as functions Depth -> [Person] (see #tel's answer), the function generate :: Depth -> [a] -> Series m a converts between them.
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses, FlexibleContexts, UndecidableInstances #-}
import Test.SmallCheck
import Test.SmallCheck.Series
import Control.Monad.Identity
data Person = SnowWhite | Dwarf Int
instance Serial Identity Person where
series = generate (\d -> SnowWhite : take (d-1) (map Dwarf [1..7]))

QuickCheck: Arbitrary instances of nested data structures that generate balanced specimens

tl;dr: how do you write instances of Arbitrary that don't explode if your data type allows for way too much nesting? And how would you guarantee these instances produce truly random specimens of your data structure?
I want to generate random tree structures, then test certain properties of these structures after I've mangled them with my library code. (NB: I'm writing an implementation of a subtyping algorithm, i.e. given a hierarchy of types, is type A a subtype of type B. This can be made arbitrarily complex, by including multiple-inheritance and post-initialization updates to the hierarchy. The classical method that supports neither of these is Schubert Numbering, and the latest result known to me is Alavi et al. 2008.)
Let's take the example of rose-trees, following Data.Tree:
data Tree a = Node a (Forest a)
type Forest a = [Tree a]
A very simple (and don't-try-this-at-home) instance of Arbitray would be:
instance (Arbitrary a) => Arbitrary (Tree a) where
arbitrary = Node <$> arbitrary <$> arbitrary
Since a already has an Arbitrary instance as per the type constraint, and the Forest will have one, because [] is an instance, too, this seems straight-forward. It won't (typically) terminate for very obvious reasons: since the lists it generates are arbitrarily long, the structures become too large, and there's a good chance they won't fit into memory. Even a more conservative approach:
arbitrary = Node <$> arbitrary <*> oneof [arbitrary,return []]
won't work, again, for the same reason. One could tweak the size parameter, to keep the length of the lists down, but even that won't guarantee termination, since it's still multiple consecutive dice-rolls, and it can turn out quite badly (and I want the odd node with 100 children.)
Which means I need to limit the size of the entire tree. That is not so straight-forward. unordered-containers has it easy: just use fromList. This is not so easy here: How do you turn a list into a tree, randomly, and without incurring bias one way or the other (i.e. not favoring left-branches, or trees that are very left-leaning.)
Some sort of breadth-first construction (the functions provided by Data.Tree are all pre-order) from lists would be awesome, and I think I could write one, but it would turn out to be non-trivial. Since I'm using trees now, but will use even more complex stuff later on, I thought I might try to find a more general and less complex solution. Is there one, or will I have to resort to writing my own non-trivial Arbitrary generator? In the latter case, I might actually just resort to unit-tests, since this seems too much work.
Use sized:
instance Arbitrary a => Arbitrary (Tree a) where
arbitrary = sized arbTree
arbTree :: Arbitrary a => Int -> Gen (Tree a)
arbTree 0 = do
a <- arbitrary
return $ Node a []
arbTree n = do
(Positive m) <- arbitrary
let n' = n `div` (m + 1)
f <- replicateM m (arbTree n')
a <- arbitrary
return $ Node a f
(Adapted from the QuickCheck presentation).
P.S. Perhaps this will generate overly balanced trees...
You might want to use the library presented in the paper "Feat: Functional Enumeration of Algebraic Types" at the Haskell Symposium 2012. It is on Hackage as testing-feat, and a video of the talk introducing it is available here: http://www.youtube.com/watch?v=HbX7pxYXsHg
As Janis mentioned, you can use the package testing-feat, which creates enumerations of arbitrary algebraic data types. This is the easiest way to create unbiased uniformly distributed generators
for all trees of up to a given size.
Here is how you would use it for rose trees:
import Test.Feat (Enumerable(..), uniform, consts, funcurry)
import Test.Feat.Class (Constructor)
import Data.Tree (Tree(..))
import qualified Test.QuickCheck as QC
-- We make an enumerable instance by listing all constructors
-- for the type. In this case, we have one binary constructor:
-- Node :: a -> [Tree a] -> Tree a
instance Enumerable a => Enumerable (Tree a) where
enumerate = consts [binary Node]
where
binary :: (a -> b -> c) -> Constructor c
binary = unary . funcurry
-- Now we use the Enumerable instance to create an Arbitrary
-- instance with the help of the function:
-- uniform :: Enumerable a => Int -> QC.Gen a
instance Enumerable a => QC.Arbitrary (Tree a) where
QC.arbitrary = QC.sized uniform
-- QC.shrink = <some implementation>
The Enumerable instance can also be generated automatically with TemplateHaskell:
deriveEnumerable ''Tree

Variables in Haskell

Why does the following Haskell script not work as expected?
find :: Eq a => a -> [(a,b)] -> [b]
find k t = [v | (k,v) <- t]
Given find 'b' [('a',1),('b',2),('c',3),('b',4)], the interpreter returns [1,2,3,4] instead of [2,4]. The introduction of a new variable, below called u, is necessary to get this to work:
find :: Eq a => a -> [(a,b)] -> [b]
find k t = [v | (u,v) <- t, k == u]
Does anyone know why the first variant does not produce the desired result?
From the Haskell 98 Report:
As usual, bindings in list
comprehensions can shadow those in
outer scopes; for example:
[ x | x <- x, x <- x ] = [ z | y <- x, z <- y]
One other point: if you compile with -Wall (or specifically with -fwarn-name-shadowing) you'll get the following warning:
Warning: This binding for `k' shadows the existing binding
bound at Shadowing.hs:4:5
Using -Wall is usually a good idea—it will often highlight what's going on in potentially confusing situations like this.
The pattern match (k,v) <- t in the first example creates two new local variables v and k that are populated with the contents of the tuple t. The pattern match doesn't compare the contents of t against the already existing variable k, it creates a new variable k (which hides the outer one).
Generally there is never any "variable substitution" happening in a pattern, any variable names in a pattern always create new local variables.
You can only pattern match on literals and constructors.
You can't match on variables.
Read more here.
That being said, you may be interested in view patterns.