Variables in Haskell

Why does the following Haskell script not work as expected?
find :: Eq a => a -> [(a,b)] -> [b]
find k t = [v | (k,v) <- t]
Given find 'b' [('a',1),('b',2),('c',3),('b',4)], the interpreter returns [1,2,3,4] instead of [2,4]. The introduction of a new variable, below called u, is necessary to get this to work:
find :: Eq a => a -> [(a,b)] -> [b]
find k t = [v | (u,v) <- t, k == u]
Does anyone know why the first variant does not produce the desired result?

From the Haskell 98 Report:
As usual, bindings in list comprehensions can shadow those in outer scopes; for example:
[ x | x <- x, x <- x ] = [ z | y <- x, z <- y]
One other point: if you compile with -Wall (or specifically with -fwarn-name-shadowing) you'll get the following warning:
Warning: This binding for `k' shadows the existing binding
bound at Shadowing.hs:4:5
Using -Wall is usually a good idea—it will often highlight what's going on in potentially confusing situations like this.

The pattern match (k,v) <- t in the first example creates two new local variables, k and v, that are bound to the components of each tuple drawn from the list t. The pattern match doesn't compare the tuple's first component against the already existing variable k; it creates a new variable k (which hides the outer one).
In general, no "variable substitution" ever happens in a pattern: any variable name in a pattern always creates a new local variable.
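The same rule holds outside list comprehensions. Here is a minimal sketch using a case expression; the names shadowed and guarded are made up for illustration and are not part of the question:

```haskell
module Main where

-- The pattern variable k below is a fresh binding that matches any value;
-- it is never compared with the function argument k.
shadowed :: Int -> Int -> String
shadowed k x = case x of
  k -> "matched"

-- A guard performs the comparison against the outer k explicitly.
guarded :: Int -> Int -> String
guarded k x = case x of
  u | u == k    -> "matched"
    | otherwise -> "no match"

main :: IO ()
main = do
  putStrLn (shadowed 1 2)  -- prints "matched"
  putStrLn (guarded 1 2)   -- prints "no match"
```

Compiling this with -Wall should flag the inner k in shadowed, just like the warning quoted above.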

You can only pattern match on literals and constructors.
You can't match on variables.
Read more here.
That being said, you may be interested in view patterns.
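As a rough sketch of how a view pattern could express the lookup from the question, assuming GHC with the ViewPatterns extension enabled (findV is a hypothetical name chosen here to keep it apart from the original find):

```haskell
{-# LANGUAGE ViewPatterns #-}
module Main where

-- The view pattern applies (== k) to the first component of each pair and
-- matches the result against True. The k inside (== k) is an ordinary
-- expression, so it refers to the outer k instead of shadowing it.
findV :: Eq a => a -> [(a, b)] -> [b]
findV k t = [v | ((== k) -> True, v) <- t]

main :: IO ()
main = print (findV 'b' [('a', 1), ('b', 2), ('c', 3), ('b', 4)])  -- prints [2,4]
```

Pairs whose key fails the view pattern are simply skipped, as with any failed pattern match in a generator. Whether this reads better than the explicit guard k == u is a matter of taste.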


Is it possible to use a comprehension list and index variable names? [duplicate]

Consider a dictionary d in Julia which contains a thousand keys. Each key is a symbol and each value is an array. I can access the value associated with the symbol :S1 and assign it to the variable k1 via
k1 = d[:S1]
Now assume I want to define the new variables k2, k3, k4, ..., k10 by repeating the same procedure for the special keys :S1 ... :S10 (not for all the keys in the dictionary). What is the most efficient way to do it? I have the impression this can be solved using metaprogramming, but I am not sure about that.
The easy way is to use Parameters.jl.
using Parameters
d = Dict{Symbol,Any}(:a=>5.0,:b=>2,:c=>"Hi!")
@unpack a, c = d
a == 5.0 #true
c == "Hi!" #true
BTW, this doesn't use eval.
If the special keys are all known at compile time, I suggest using Chris Rackauckas's answer.
It is much less evil and works to create local variables.
If for some reason they are only known at runtime, then you can do as follows.
(Though I guess it is actually pretty strange to need to create variables whose names you don't even know at compile time.)
@eval is your friend* here. See the manual.
@eval $key = $value
Or you can use the functional form eval() taking a quoted expression:
eval(:($key = $value))
Note, however, that you cannot use this to introduce new local variables.
eval always executes at module scope.
That is an intentional restriction for performance reasons.
julia> d = Dict(k => rand(3) for k in [:a, :b1, :c2, :c1])
Dict{Symbol,Array{Float64,1}} with 4 entries:
:a => [0.446723, 0.0853543, 0.476118]
:b1 => [0.212369, 0.846363, 0.854601]
:c1 => [0.542332, 0.885369, 0.635742]
:c2 => [0.118641, 0.987508, 0.578754]
julia> for (k,v) in d
           # create constants only for the keys starting with `c`
           if first(string(k)) == 'c'
               @eval const $k = $v
           end
       end
julia> c2
3-element Array{Float64,1}:
0.118641
0.987508
0.578754
*Honestly eval is not your friend.
It is, however, the only dude badass enough to walk with you down this dark road of generating code based on runtime values. (@generated is only marginally less badass, being willing to generate code based on runtime Types.)
If you are in this situation where you absolutely have to generate code based on runtime information consider whether you have not made a design mistake several forks further up the road.
In case you really want to have k1, k2, ... , k10, ... you could use a little more complicated eval than Lyndon's:
for (i,j) in enumerate(d)
    @eval $(Symbol("k$i")) = $j.second
end
Warning: eval() uses global scope, so even if you use this inside a function, k1...kn will be global variables.

Idiomatic way of listing elements of a sum type in Idris

I have a sum type representing arithmetic operators:
data Operator = Add | Substract | Multiply | Divide
and I'm trying to write a parser for it. For that, I would need an exhaustive list of all the operators.
In Haskell I would use deriving (Enum, Bounded), as suggested in the following StackOverflow question: Getting a list of all possible data type values in Haskell
Unfortunately, there doesn't seem to be such a mechanism in Idris, as suggested by Issue #19. There is some ongoing work by David Christiansen on the question, so hopefully the situation will improve in the future: david-christiansen/derive-all-the-instances
Coming from Scala, I am used to listing the elements manually, so I pretty naturally came up with the following:
Operators : Vect 4 Operator
Operators = [Add, Substract, Multiply, Divide]
To make sure that Operators contains all the elements, I added the following proof:
total
opInOps : Elem op Operators
opInOps {op = Add} = Here
opInOps {op = Substract} = There Here
opInOps {op = Multiply} = There (There Here)
opInOps {op = Divide} = There (There (There Here))
so that if I add an element to Operator without adding it to Operators, the totality checker complains:
Parsers.opInOps is not total as there are missing cases
It does the job but it is a lot of boilerplate.
Did I miss something? Is there a better way of doing it?
One option is to use elaborator reflection, a feature of the language, to get the list of all constructors.
Here is a pretty dumb approach to solving this particular problem (I'm posting this because the documentation is very scarce at the moment):
%language ElabReflection
data Operator = Add | Subtract | Multiply | Divide
constrsOfOperator : Elab ()
constrsOfOperator =
  do (MkDatatype _ _ _ constrs) <- lookupDatatypeExact `{Operator}
     loop $ map fst constrs
  where loop : List TTName -> Elab ()
        loop [] =
          do fill `([] : List Operator); solve
        loop (c :: cs) =
          do [x, xs] <- apply `(List.(::) : Operator -> List Operator -> List Operator) [False, False]
             solve
             focus x; fill (Var c); solve
             focus xs
             loop cs
allOperators : List Operator
allOperators = %runElab constrsOfOperator
A couple comments:
It seems that to solve this problem for any inductive datatype of a similar structure one would need to work through the Elaborator Reflection: Extending Idris in Idris paper.
Maybe the pruviloj library has something that might make solving this problem for a more general case easier.

Where is the Idris == operator useful?

As a beginner in type-driven programming, I'm curious about the use of the == operator. Examples demonstrate that it's not sufficient to prove equality between two values of a certain type, and special equality checking types are introduced for the particular data types. In that case, where is == useful at all?
(==) (the core function of the Eq interface) is a function of type T -> T -> Bool, and is good for equational reasoning. Whereas x = y (where x : T and y : T), AKA "intensional equality", is itself a type and therefore a proposition. You can and often will want to bounce back and forth between the two different ways of expressing equality for a particular type.
x == y = True is also a proposition, and is often an intermediate step between reasoning about (==) and reasoning about =.
The exact relationship between the two types of equality is rather complex, and you can read https://github.com/pdorrell/learning-idris/blob/9d3454a77f6e21cd476bd17c0bfd2a8a41f382b7/finished/EqFromEquality.idr for my own attempt to understand some aspects of it. (One thing to note is that even though an inductively defined type will have decidable intensional equality, you still have to jump through a few hoops to prove that, and a few more hoops to define a corresponding implementation of Eq.)
One particular handy code snippet is this:
-- for rel x y, provide both the computed value, and the proposition that it is equal to the value (as a dependent pair)
has_value_dpair : (rel : t -> t -> Bool) -> (x : t) -> (y : t) -> (value: Bool ** rel x y = value)
has_value_dpair rel x y = (rel x y ** Refl)
You can use it with the with construct when you have a value returned from rel x y and you want to reason about the proposition rel x y = True or rel x y = False (and rel is some function that might represent a notion of equality between x and y).
(In this answer I assume the case where (==) corresponds to =, but you are entirely free to define a (==) function that doesn't correspond to =, eg when defining a Setoid. So that's another reason to use (==) instead of =.)
You still need good old equality because sometimes you can't prove things. Sometimes you don't even need to prove anything. Consider the following example:
countEquals : Eq a => a -> List a -> Nat
countEquals x = length . filter (== x)
You might want to just count the number of equal elements to show some statistics to the user. Another example: tests. Yes, even with a strong type system and dependent types you might want to perform good old unit tests. So you want to check expectations, and this is rather convenient to do with the (==) operator.
I'm not going to write a full list of cases where you might need (==). The equality operator is not enough for proving, but you don't always need proofs.

Test.QuickCheck: speed up testing multiple properties for the same type

I am testing a random generator generating instances of my own type. For that I have a custom instance of Arbitrary:
complexGenerator :: (RandomGen g) => g -> (MyType, g)
instance Arbitrary MyType where
  arbitrary = liftM (fst . complexGenerator . mkStdGen) arbitrary
This works well with Test.QuickCheck (actually, Test.Framework) for testing that the generated values hold certain properties. However, there are quite a few properties I want to check, and the more I add, the more time it takes to verify them all.
Is there a way to use the same generated values for testing every property, instead of generating them anew each time? I obviously still want to see, on failures, which property did not hold, so making one giant property with and is not optimal.
You could label each property using printTestCase before making a giant property with conjoin.
e.g. you were thinking this would be a bad idea:
prop_giant :: MyType -> Bool
prop_giant x = and [prop_one x, prop_two x, prop_three x]
this would be as efficient yet give you better output:
prop_giant :: MyType -> Property
prop_giant x = conjoin [printTestCase "one" $ prop_one x,
                        printTestCase "two" $ prop_two x,
                        printTestCase "three" $ prop_three x]
(Having said that, I've never used this method myself and am only assuming it will work; conjoin is probably marked as experimental in the documentation for a reason.)
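For readers on a recent QuickCheck, where printTestCase has been deprecated in favour of counterexample, the same idea might look roughly like the sketch below. It assumes Int as a stand-in for MyType, and the three small properties are hypothetical placeholders for the real ones:

```haskell
module Main where

import Test.QuickCheck

-- Hypothetical placeholder properties; Int stands in for MyType.
prop_one, prop_two, prop_three :: Int -> Bool
prop_one x   = x + 0 == x
prop_two x   = x * 1 == x
prop_three x = negate (negate x) == x

-- One generated value is shared by all three checks. If a check fails,
-- its label is printed alongside the counterexample.
prop_giant :: Int -> Property
prop_giant x = conjoin
  [ counterexample "one"   (prop_one x)
  , counterexample "two"   (prop_two x)
  , counterexample "three" (prop_three x)
  ]

main :: IO ()
main = quickCheck prop_giant
```

Because every property sees the same x, the generator runs once per test case rather than once per property.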
In combination with the voted answer, what I've found helpful is using a Reader transformer with the Writer monad:
type Predicate r = ReaderT r (Writer String) Bool
The Reader "shared environment" is the tested input in this case. Then you can compose properties like this:
inv_even :: Predicate Int
inv_even = do
  lift . tell $ "this is the even invariant"
  (==) 0 . flip mod 2 <$> ask
toLabeledProp :: r -> Predicate r -> Property
toLabeledProp cause r =
  let (effect, msg) = runWriter . (runReaderT r) $ cause in
    printTestCase ("inv: " ++ msg) . property $ effect
and combining:
fromPredicates :: [Predicate r] -> r -> Property
fromPredicates predicates cause =
  conjoin . map (toLabeledProp cause) $ predicates
I suspect there is another approach involving something similar to Either or a WriterT here, which would concisely compose predicates on different types into one result. But at the least, this allows for documenting properties which impose different post-conditions depending on the value of the input.
Edit: This idea spawned a library:
http://github.com/jfeltz/quickcheck-property-comb

vector of variable names in R

I'd like to create a function that automatically generates uni- and multivariate regression analyses, but I'm not able to figure out how I can specify variables in vectors... This seems very easy, but skimming the documentation I haven't figured it out so far...
Easy example
a<-rnorm(100)
b<-rnorm(100)
k<-c("a","b")
d<-c(a,b)
summary(k[1])
But k[1]="a" and is a character vector...d is just b appended to a, not the variable names. In effect I'd like k[1] to represent the vector a.
Appreciate any answers...
//M
You can use the "get" function to get an object based on a character string of its name, but in the long run it is better to store the variables in a list and access them that way. Things become much simpler: you can grab subsets, and you can use lapply or sapply to run the same code on every element. When saving or deleting, you can work on the entire list rather than trying to remember every element, e.g.:
mylist <- list(a=rnorm(100), b=rnorm(100) )
names(mylist)
summary(mylist[[1]])
# or
summary(mylist[['a']])
# or
summary(mylist$a)
# or
d <- 'a'
summary(mylist[[d]])
# or
lapply( mylist, summary )
If you are programmatically creating models for analysis with lm (or other modeling functions), then one approach is to just subset your data and use the ".", e.g.:
yvar <- 'Sepal.Width'
xvars <- c('Petal.Width','Sepal.Length')
fit <- lm( Sepal.Width ~ ., data=iris[, c(yvar,xvars)] )
Or you can build the formula using "paste" or "sprintf" then use "as.formula" to convert it to a formula, e.g.:
yvar <- 'Sepal.Width'
xvars <- c('Petal.Width','Sepal.Length')
my.formula <- paste( yvar, '~', paste( xvars, collapse=' + ' ) )
my.formula <- as.formula(my.formula)
fit <- lm( my.formula, data=iris )
Note also the problem of multiple comparisons if you are looking at many different models fit automatically.
You could use a list k=list(a,b). This creates a list with components a and b, but it is not a list of variable names.
get() is what you're looking for :
summary(get(k[1]))
edit :
get() is not what you're looking for, it's list(). get() could be useful too though.
If you're looking for automatic generation of regression analyses, you might actually benefit from using eval(), although every R-programmer will warn you about using eval() unless you know very well what you're doing. Please read the help files about eval() and parse() very carefully before you use them.
An example :
d <- data.frame(
  var1 = rnorm(1000),
  var2 = rpois(1000,4),
  var3 = sample(letters[1:3],1000,replace=TRUE)
)
vars <- names(d)
auto.lm <- function(d,dep,indep){
  expr <- paste(
    "out <- lm(",
    dep,
    "~",
    paste(indep,collapse="*"),
    ",data=d)"
  )
  eval(parse(text=expr))
  return(out)
}
auto.lm(d,vars[1],vars[2:3])