Infinite Type error while attempting exercise in Binary Tree example - elm

In working through Exercise 2 here, I offered this solution to the compiler and got an Infinite Type error.
flatten : Tree a -> List a
flatten tree =
    case tree of
        Empty -> []
        Node v left right ->
            [v] :: flatten left :: flatten right
This doesn't seem too different from my solution to the first exercise:
sum : Tree Int -> Int
sum tree =
    case tree of
        Empty -> 0
        Node v left right ->
            v + sum left + sum right
I wondered if perhaps the issue had to do with order of operations, so I added parens to ensure flatten gets evaluated before ::, but this doesn't seem to make a difference:
flatten : Tree a -> List a
flatten tree =
    case tree of
        Empty -> []
        Node v left right ->
            [v] :: (flatten left) :: (flatten right)
So now I'm just stumped.

:: is the cons operator, which prepends a single element onto a list. Its type signature is a -> List a -> List a. That means this isn't valid code, since the first argument, [v], is a list:
[v] :: flatten left :: flatten right -- invalid!
(That is also where the "Infinite Type" error comes from: for flatten left :: flatten right to type check, the list flatten right, of type List a, would have to have type List (List a), which forces a to equal List a, an infinitely nested type.)
If you want to concatenate two lists, you use the concatenation operator: ++. You could just replace :: with ++ in your example to get it to compile:
[v] ++ flatten left ++ flatten right
Another way to write that line is to concatenate the two lists, then prepend v to the result using the cons operator.
v :: flatten left ++ flatten right
-- The following is the same as above, but with parentheses showing precedence
v :: (flatten left ++ flatten right)
There are more efficient ways to do this, of course, but it highlights the difference between cons and concatenation.
The reason your sum example works is that it returns an Int instead of a list of Ints. The type you are returning in sum is the same as the type of the value in the tree, so you end up with an aggregate, not another list.

Related

List split in Elm

Write a function to split a list into two lists. The length of the first part is specified by the caller.
I am new to Elm so I am not sure if my reasoning is correct. I think that I need to transform the input list into an array so I am able to slice it by the provided input number. I am struggling a bit with the syntax as well. Here is my code so far:
listSplit : List a -> Int -> List (List a)
listSplit inputList nr =
    let myArray = Array.fromList inputList
    in Array.slice 0 nr myArray
So I am thinking of returning a list containing 2 lists (the first one of the specified length), but I am stuck on the syntax. How can I fix this?
Alternative implementation:
split : Int -> List a -> (List a, List a)
split i xs =
    (List.take i xs, List.drop i xs)
I'll venture a simple recursive definition, since a big part of learning functional programming is understanding recursion (which foldl is just an abstraction of):
split : Int -> List a -> (List a, List a)
split splitPoint inputList =
    splitHelper splitPoint inputList []

{- We use a typical trick here, where we define a helper function
   that requires some additional arguments. -}
splitHelper : Int -> List a -> List a -> (List a, List a)
splitHelper splitPoint inputList leftSplitList =
    case inputList of
        [] ->
            -- This is a base case; we end here if we ran out of elements
            (List.reverse leftSplitList, [])

        head :: tail ->
            if splitPoint > 0 then
                -- This is the recursive case.
                -- Note the typical trick here: we are shuffling elements
                -- from the input list and putting them onto the
                -- leftSplitList.
                -- This will reverse the list, so we need to reverse it back
                -- in the base cases.
                splitHelper (splitPoint - 1) tail (head :: leftSplitList)
            else
                -- Here we reached the split point,
                -- so the rest of the list is the output.
                (List.reverse leftSplitList, inputList)
Use List.foldl
split : Int -> List a -> (List a, List a)
split i xs =
    let
        f : a -> (List a, List a) -> (List a, List a)
        f x (p, q) =
            if List.length p >= i then
                (p, q ++ [x])
            else
                (p ++ [x], q)
    in
    List.foldl f ([], []) xs
When list p reaches the desired length, append element x to the second list q.
Append element x to list p otherwise.
Normally in Elm, you use List for a sequence of values. Array is used specifically for fast indexing access.
When dealing with lists in functional programming, try to think in terms of map, filter, and fold. They should be all you need.
To return a pair of something (e.g. two lists), use a tuple. Elm supports tuples of up to three elements.
Additionally, there is a function splitAt in the List.Extra package that does exactly the same thing, although it is better to roll your own for the purpose of learning.

Removing duplicates from a sorted list (OCaml)

I'm trying to remove duplicate items from already sorted list in OCaml. This is my code:
let rec remove_dup = function
| [] -> []
| hd :: [] -> hd :: []
| hd :: hd2 :: tl -> if (hd == hd2) (remove_dup tl) :: hd else (remove_dup (h2 :: tl) :: hd;;
I'm getting a syntax error.
The OCaml if looks like if expr1 then expr2 else expr3. You're missing the keyword then.
You also have unbalanced parentheses. It looks like you need a right parenthesis at the very end.
After these fixes you have some type errors that you should look at.
As a side comment, don't use == to test for equality. It's a special-purpose operator for advanced uses. The day-to-day equality operator is =.

Why is filter based on dependent pair?

In the Idris Tutorial a function for filtering vectors is based on dependent pairs.
filter : (a -> Bool) -> Vect n a -> (p ** Vect p a)
filter f [] = (_ ** [])
filter f (x :: xs) with (filter f xs)
  | (_ ** xs') = if (f x) then (_ ** x :: xs') else (_ ** xs')
But why is it necessary to put this in terms of a dependent pair instead of something more direct, such as the following?
filter' : (a -> Bool) -> Vect n a -> Vect p a
In both cases the type of p must be determined, but in my supposed alternative the redundancy of listing p twice is eliminated.
My naive attempts at implementing filter' failed, so I was wondering is there a fundamental reason that it can't be implemented? Or can filter' be implemented, and perhaps filter was just a poor example to showcase dependent pairs in Idris? But if that is the case then in what situations would dependent pairs be useful?
Thanks!
The difference between filter and filter' is the difference between existential and universal quantification. If (a -> Bool) -> Vect n a -> Vect p a were the correct type for filter, that would mean filter returns a vector of length p and that the caller can specify what p should be.
Kim Stebel's answer is right on the money. Let me just note that this was already discussed on the Idris mailing list back in 2012 (!!):
filter for vector, a question - Idris Programming Language
What raichoo posted there can help clarifying it I think; the real signature of your filter' is
filter' : {p : Nat} -> {n : Nat} -> {a : Type} -> (a -> Bool) -> Vect n a -> Vect p a
from which it should be obvious that this is not what filter should (or even could) do; p actually depends on the predicate and the vector you are filtering, and you can (actually need to) express this using a dependent pair. Note that in the pair (p ** Vect p a), p (and thus Vect p a) implicitly depends on the (unnamed) predicate and vector appearing before it in its signature.
Expanding on this, why a dependent pair? You want to return a vector, but there's no "Vector with unknown length" type; you need a length value for obtaining a Vector type. But then you can just think "OK, I will return a Nat together with a vector with that length". The type of this pair is, unsurprisingly, an example of a dependent pair. In more detail, a dependent pair DPair a P is a type built out of
A type a
A function P : a -> Type
A value of type DPair a P is a pair of values
x : a
y : P x
At this point I think it is just the syntax that might be misleading you. The type (p ** Vect p a) is DPair Nat (\p => Vect p a); p there is not a parameter of filter or anything like it. All this can be a bit confusing at first; if so, maybe it helps to think of (p ** Vect p a) as a substitute for the missing "vector with unknown length" type.
Not an answer, but additional context
Idris 1 documentation - https://docs.idris-lang.org/en/latest/tutorial/typesfuns.html#dependent-pairs
Idris 2 documentation - https://idris2.readthedocs.io/en/latest/tutorial/typesfuns.html?highlight=dependent#dependent-pairs
In Idris 2 the dependent pair is defined here, and it is similar to Exists and Subset, but BOTH of its values are NOT erased at runtime.

Find all Change Combinations (money) in OCaml

I have some OCaml code that finds all combinations of change given a change amount. I have most of the code working; however, I am not able to figure out how this recursive function will actually return the possible change combinations.
let change_combos presidents =
  let rec change amount coinlist = match amount with
    | 0 -> [[]] (*exits when nothing*)
    | _ when (amount < 0) -> [] (*exits when less than 0*)
    | _ -> match coinlist with
           | [] -> [] (*Returns empty list, exits program*)
           (*h::f -> something, using [25;10;5;1] aka all change combinations...*)
           (*using recursion, going through all combinations and joining lists returned together*)

let print_the_coin_matrix_for_all_our_joy enter_the_matrix =
  print_endline (join "\n" (List.map array_to_string enter_the_matrix));;
Thanks for the help, let me know if I need to clarify something :)
It is a bit confusing what you're looking for. I believe that you want to generate a list of all the combinations of a list? You should think about the recursion and how to generate the individual elements. Start with the input type, and how you'd generate successive elements by reducing the problem space.
let rec generate lst = match lst with
  | [] -> []
  | h :: t -> [h] :: (List.map (fun x -> h :: x) (generate t)) @ (generate t)
If the list is [], there are no combinations. If we have an element h, we generate all combinations without that element and base our construction on that. The pieces then fall into place: the result is the singleton [h] consed onto the combinations of t with h prepended to each, all concatenated with the combinations of t on their own.

foldl is tail recursive, so how come foldr runs faster than foldl?

I wanted to test foldl vs foldr. From what I've seen, you should use foldl over foldr whenever you can, due to tail recursion optimization.
This makes sense. However, after running this test I am confused:
foldr (takes 0.057s when using time command):
a::a -> [a] -> [a]
a x = ([x] ++ )
main = putStrLn(show ( sum (foldr a [] [0.. 100000])))
foldl (takes 0.089s when using time command):
b::[b] -> b -> [b]
b xs = ( ++ xs). (\y->[y])
main = putStrLn(show ( sum (foldl b [] [0.. 100000])))
It's clear that this example is trivial, but I am confused as to why foldr is beating foldl. Shouldn't this be a clear case where foldl wins?
Welcome to the world of lazy evaluation.
When you think about it in terms of strict evaluation, foldl looks "good" and foldr looks "bad" because foldl is tail recursive, but foldr would have to build a tower in the stack so it can process the last item first.
However, lazy evaluation turns the tables. Take, for example, the definition of the map function:
map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs
This wouldn't be too good if Haskell used strict evaluation, since it would have to compute the tail first, then prepend the item (for all items in the list). The only way to do it efficiently would be to build the elements in reverse, it seems.
However, thanks to Haskell's lazy evaluation, this map function is actually efficient. Lists in Haskell can be thought of as generators, and this map function generates its first item by applying f to the first item of the input list. When it needs a second item, it just does the same thing again (without using extra space).
It turns out that map can be described in terms of foldr:
map f xs = foldr (\x ys -> f x : ys) [] xs
It's hard to tell by looking at it, but lazy evaluation kicks in because foldr can give f its first argument right away:
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
Because the f defined by map can return the first item of the result list using solely the first parameter, the fold can operate lazily in constant space.
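As a quick illustration of that point, here is a minimal sketch (mapViaFoldr is just a name I'm using for this example, not a standard function): because the combining function returns f x : ys without forcing ys, the foldr-based map yields its head immediately and even works on an infinite input.
-- A foldr-based map, as described above. The head of the result is
-- available before the rest of the input is ever looked at.
mapViaFoldr :: (a -> b) -> [a] -> [b]
mapViaFoldr f = foldr (\x ys -> f x : ys) []

main :: IO ()
main = print (take 5 (mapViaFoldr (* 2) [1 ..]))  -- prints [2,4,6,8,10]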
Now, lazy evaluation does bite back. For instance, try running sum [1..1000000]. It yields a stack overflow. Why should it? It should just evaluate from left to right, right?
Let's look at how Haskell evaluates it:
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
sum = foldl (+) 0
sum [1..1000000] = foldl (+) 0 [1..1000000]
= foldl (+) ((+) 0 1) [2..1000000]
= foldl (+) ((+) ((+) 0 1) 2) [3..1000000]
= foldl (+) ((+) ((+) ((+) 0 1) 2) 3) [4..1000000]
...
= (+) ((+) ((+) (...) 999999) 1000000)
Haskell is too lazy to perform the additions as it goes. Instead, it ends up with a tower of unevaluated thunks that have to be forced to get a number. The stack overflow occurs during this evaluation, since it has to recurse deeply to evaluate all the thunks.
Fortunately, there is a special function in Data.List called foldl' that operates strictly. foldl' (+) 0 [1..1000000] will not stack overflow. (Note: I tried replacing foldl with foldl' in your test, but it actually made it run slower.)
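Roughly speaking, foldl' behaves like the simplified sketch below (this is not the actual Data.List source, just the idea): the accumulator is forced with seq at every step, so no tower of thunks can build up.
-- Simplified sketch of a strict left fold; the real foldl' lives in
-- Data.List and is more carefully optimised.
foldlStrict :: (b -> a -> b) -> b -> [a] -> b
foldlStrict _ z []       = z
foldlStrict f z (x : xs) = let z' = f z x
                           in z' `seq` foldlStrict f z' xs

main :: IO ()
main = print (foldlStrict (+) 0 [1 .. 1000000 :: Int])  -- no stack overflow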
Upon looking at this problem again, I think all current explanations are somewhat insufficient so I've written a longer explanation.
The difference is in how foldl and foldr apply their reduction function. Looking at the foldr case, we can expand it as
foldr (\x -> ([x] ++)) [] [0..10000]
[0] ++ foldr a [] [1..10000]
[0] ++ ([1] ++ foldr a [] [2..10000])
...
This list is processed by sum, which consumes it as follows:
sum = foldl' (+) 0
foldl' (+) 0 ([0] ++ ([1] ++ ... ++ [10000]))
foldl' (+) 0 (0 : [1] ++ ... ++ [10000]) -- get head of list from '++' definition
foldl' (+) 0 ([1] ++ [2] ++ ... ++ [10000]) -- add accumulator and head of list
foldl' (+) 0 (1 : [2] ++ ... ++ [10000])
foldl' (+) 1 ([2] ++ ... ++ [10000])
...
I've left out the details of the list concatenation, but this is how the reduction proceeds. The important part is that everything gets processed in order to minimize list traversals. The foldr only traverses the list once, the concatenations don't require continuous list traversals, and sum finally consumes the list in one pass. Critically, the head of the list is available from foldr immediately to sum, so sum can begin working immediately and values can be gc'd as they are generated. With fusion frameworks such as vector, even the intermediate lists will likely be fused away.
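That "head of the list is available immediately" claim is easy to check with a tiny sketch (the infinite input here is my addition, not part of the original test):
-- foldr with the question's combining function produces output at once,
-- even though the input list never ends.
main :: IO ()
main = print (take 3 (foldr (\x acc -> [x] ++ acc) [] [0 ..]))  -- [0,1,2]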
Contrast this to the foldl function:
b xs = ( ++xs) . (\y->[y])
foldl b [] [0..10000]
foldl b ( [0] ++ [] ) [1..10000]
foldl b ( [1] ++ ([0] ++ []) ) [2..10000]
foldl b ( [2] ++ ([1] ++ ([0] ++ [])) ) [3..10000]
...
Note that now the head of the list isn't available until foldl has finished. This means that the entire list must be constructed in memory before sum can begin to work. This is much less efficient overall. Running the two versions with +RTS -s shows miserable garbage collection performance from the foldl version.
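If you want to reproduce that +RTS -s comparison yourself, here is a self-contained sketch (it assumes GHC and compiling with -rtsopts; the a and b definitions are copied from the question):
-- Build with:  ghc -rtsopts Fold.hs
-- Then run:    ./Fold r +RTS -s     and     ./Fold l +RTS -s
-- and compare the allocation and GC statistics.
import System.Environment (getArgs)

a :: a -> [a] -> [a]
a x = ([x] ++)

b :: [b] -> b -> [b]
b xs = (++ xs) . (\y -> [y])

main :: IO ()
main = do
  args <- getArgs
  case args of
    ["l"] -> print (sum (foldl b [] [0 .. 100000 :: Integer]))
    _     -> print (sum (foldr a [] [0 .. 100000 :: Integer]))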
This is also a case where foldl' will not help. The added strictness of foldl' doesn't change the way the intermediate list is created. The head of the list remains unavailable until foldl' has finished, so the result will still be slower than with foldr.
I use the following rule to determine the best choice of fold
For folds that are a reduction, use foldl' (e.g. this will be the only/final traversal)
Otherwise use foldr.
Don't use foldl.
In most cases foldr is the best fold function because the traversal direction is optimal for lazy evaluation of lists. It's also the only one capable of processing infinite lists. The extra strictness of foldl' can make it faster in some cases, but this is dependent on how you'll use that structure and how lazy it is.
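To illustrate the infinite-list point in that rule, here is a small sketch of my own (anyViaFoldr is not a library function): foldr can stop as soon as the combining function no longer needs the rest of the fold, which no left fold can do.
-- (||) is lazy in its second argument, so the fold stops at the first
-- element that satisfies the predicate; foldl or foldl' would loop
-- forever on this input.
anyViaFoldr :: (a -> Bool) -> [a] -> Bool
anyViaFoldr p = foldr (\x rest -> p x || rest) False

main :: IO ()
main = print (anyViaFoldr (> 10) [1 ..])  -- True, despite the infinite input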
I don't think anyone's actually said the real answer on this one yet, unless I'm missing something (which may well be true and welcomed with downvotes).
I think the biggest different in this case is that foldr builds the list like this:
[0] ++ ([1] ++ ([2] ++ (... ++ [1000000])))
Whereas foldl builds the list like this:
((((([0] ++ [1]) ++ [2]) ++ ...) ++ [999998]) ++ [999999]) ++ [1000000]
The difference is subtle, but notice that in the foldr version ++ always has only one list element as its left argument. With the foldl version, there can be up to 999999 elements in ++'s left argument (on average around 500000), but only one element in the right argument.
However, ++ takes time proportional to the size of its left argument, as it has to look through the entire left argument list to the end and then attach the right argument after that last element (at best; in practice it needs to rebuild the left list's spine). The right argument list is unchanged, so it doesn't matter how big it is.
That's why the foldl version is much slower. It's got nothing to do with laziness in my opinion.
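For reference, (++) is defined essentially like the sketch below (a Prelude-style definition, written here as append to keep the example self-contained), which is why its cost tracks the length of the left list while the right list is simply shared:
-- The left list's spine is rebuilt cell by cell; the right list is
-- reused as-is, so only the left argument's length matters.
append :: [a] -> [a] -> [a]
append []       ys = ys
append (x : xs) ys = x : append xs ys

main :: IO ()
main = print (append [1, 2, 3] [4, 5 :: Int])  -- [1,2,3,4,5]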
The problem is that tail recursion optimization is a memory optimization, not an execution time optimization!
Tail recursion optimization avoids the need to remember values for each recursive call.
So, foldl is in fact "good" and foldr is "bad".
For example, consider the definitions of foldr and foldl:
foldl f z [] = z
foldl f z (x:xs) = foldl f (z `f` x) xs
foldr f z [] = z
foldr f z (x:xs) = x `f` (foldr f z xs)
That's how the expression "foldl (+) 0 [1,2,3]" is evaluated:
foldl (+) 0 [1, 2, 3]
foldl (+) (0+1) [2, 3]
foldl (+) ((0+1)+2) [3]
foldl (+) (((0+1)+2)+3) [ ]
(((0+1)+2)+3)
((1+2)+3)
(3+3)
6
Note that foldl doesn't remember the values 0, 1, 2, ...; instead it passes the whole expression (((0+1)+2)+3) along lazily as an argument and doesn't evaluate it until the last call of foldl, where it reaches the base case and returns the value passed as the second parameter (z), which hasn't been evaluated yet.
On the other hand, that's how foldr works:
foldr (+) 0 [1, 2, 3]
1 + (foldr (+) 0 [2, 3])
1 + (2 + (foldr (+) 0 [3]))
1 + (2 + (3 + (foldr (+) 0 [])))
1 + (2 + (3 + 0))
1 + (2 + 3)
1 + 5
6
The important difference here is that foldl evaluates the whole expression in the last call, avoiding the need to come back to previously remembered values, whereas foldr does not: foldr remembers one integer for each call and performs an addition in each call.
It is important to bear in mind that foldr and foldl are not always equivalent. For instance, try computing these expressions in Hugs:
foldr (&&) True (False:(repeat True))
foldl (&&) True (False:(repeat True))
The foldr version returns False right away, because (&&) never needs to look at the rest of the infinite list, whereas the foldl version never terminates. foldr and foldl are equivalent only under certain conditions, described here.
(sorry for my bad english)
For a, the [0.. 100000] list needs to be expanded right away so that foldr can start with the last element. Then as it folds things together, the intermediate results are
[100000]
[99999, 100000]
[99998, 99999, 100000]
...
[0.. 100000] -- i.e., the original list
Because nobody is allowed to change this list value (Haskell is a pure functional language), the compiler is free to reuse the value. The intermediate values, like [99999, 100000] can even be simply pointers into the expanded [0.. 100000] list instead of separate lists.
For b, look at the intermediate values:
[0]
[0, 1]
[0, 1, 2]
...
[0, 1, ..., 99999]
[0.. 100000]
Each of those intermediate lists can't be reused, because if you change the end of the list then you've changed any other values that point to it. So you're creating a bunch of extra lists that take time to build in memory, and in this case you spend a lot more time allocating and filling in these intermediate lists.
Since you're just making a copy of the list, a runs faster because it starts by expanding the full list and then just keeps moving a pointer from the back of the list to the front.
Neither foldl nor foldr is tail optimized. It is only foldl'.
But in your case, using ++ with foldl' is not a good idea, because successive evaluations of ++ will traverse the growing accumulator again and again.
Well, let me rewrite your functions in a way that makes the difference obvious:
a :: a -> [a] -> [a]
a = (:)
b :: [b] -> b -> [b]
b = flip (:)
You can see that b is more complex than a. To be precise, a needs one reduction step for its value to be calculated, but b needs two. That accounts for the time difference you are measuring: in the second example roughly twice as many reductions must be performed.
Edit: But the asymptotic time complexity is the same, so I wouldn't worry about it much.