I'm trying to remove elements from a Clojure vector:
Note that I'm using Clojure's operations from Kotlin
val set = PersistentHashSet.create("foo")
val vec = PersistentVector.create("foo", "bar")
val seq = clojure.`core$remove`.invokeStatic(set, vec) as ISeq
val resultVec = clojure.`core$vec`.invokeStatic(seq) as PersistentVector
This is the equivalent of the following Clojure code:
(remove #{"foo"} ["foo" "bar"])
The code works fine, but I've noticed that creating a vector from the seq is extremely slow. I've written a benchmark and these were the results:
| Item count | Remove (ms) | Remove with conversion back to vector (ms) |
|------------|-------------|---------------------------------------------|
| 1000       | 51          | 1355                                        |
| 10000      | 71          | 5123                                        |
Do you know how I can convert the seq resulting from the remove operation back to a vector without the harsh performance penalty?
If it is not possible is there an alternative way to perform the remove operation?
You could try the complementary operation to remove that returns a vector:
(filterv (complement #{"foo"})
["foo" "bar"])
Note the use of filterv. The v indicates that it uses a vector from the start, and returns a vector, so no conversion is required. It uses a transient vector behind the scenes, so it should be pretty fast.
I'm negating the predicate using complement so I can use filterv, since there is no removev. remove is just defined as the complement of filter anyway though, so it's basically what you were already doing, just strict.
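If you are calling this from Kotlin, a rough sketch of the same thing through Clojure's public Java API (clojure.java.api.Clojure) instead of the compiled core$... classes might look like this; I haven't benchmarked it, and apart from the classes already used in the question the only assumption is the standard Clojure.var/IFn interop entry point:

import clojure.java.api.Clojure
import clojure.lang.IFn
import clojure.lang.PersistentHashSet
import clojure.lang.PersistentVector

// Look the core functions up once; Clojure.var returns an IFn.
val filterv: IFn = Clojure.var("clojure.core", "filterv")
val complement: IFn = Clojure.var("clojure.core", "complement")

val set = PersistentHashSet.create("foo")        // #{"foo"}
val vec = PersistentVector.create("foo", "bar")  // ["foo" "bar"]

// (filterv (complement #{"foo"}) ["foo" "bar"]) -- the result is already a vector,
// so there is no separate conversion step to pay for.
val result = filterv.invoke(complement.invoke(set), vec)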
What you are trying to do fundamentally performs badly. Vectors are for fast indexed read/write, and O(1) access to the right end. To do anything else you must tear the vector apart and rebuild it again, an O(N) operation. If you need an operation like this to be efficient, you must use a different data structure.
Why not use a PersistentHashSet? It has fast removal, though it is not ordered; Clojure also has a sorted set if ordering is needed.
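If insertion order is not required, removing from a set is a single disj call (plain Clojure shown; the same functions are reachable from Kotlin through the same kind of interop used in the question):

(disj (hash-set "foo" "bar") "foo")   ;=> #{"bar"}
(disj (sorted-set "b" "a" "c") "b")   ;=> #{"a" "c"}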
You are comparing a lazy result with a realized one: remove only returns a lazy sequence, so the "fast" timing does almost no work, while converting back to a vector forces the whole sequence. Compare the lazy result of (remove ...) with the realized result implied by (count (remove ...)); you will see it is actually slightly slower than just doing (vec (remove ...)). Also, for truly speed-critical code, there is nothing like using a native Java ArrayList:
(ns tst.demo.core
  (:require
    [criterium.core :as crit])
  (:import
    [java.util ArrayList]))

(def N 1000)
(def tgt-item (/ N 2))
(def pred-set #{(long tgt-item)})
(def data-vec (vec (range N)))
(def data-al (ArrayList. data-vec))
(def tgt-items (ArrayList. [tgt-item]))

(println :lazy)
(crit/quick-bench
  (remove pred-set data-vec))

(println :lazy-count)
(crit/quick-bench
  (count (remove pred-set data-vec)))

(println :vec)
(crit/quick-bench
  (vec (remove pred-set data-vec)))

(println :ArrayList)
(crit/quick-bench
  (let [changed? (.removeAll data-al tgt-items)]
    data-al))
with results:
:lazy Evaluation count : 35819946 time mean : 10.856 ns
:lazy-count Evaluation count : 8496 time mean : 69941.171 ns
:vec Evaluation count : 9492 time mean : 62965.632 ns
:ArrayList Evaluation count : 167490 time mean : 3594.586 ns
I have a sum type representing arithmetic operators:
data Operator = Add | Substract | Multiply | Divide
and I'm trying to write a parser for it. For that, I would need an exhaustive list of all the operators.
In Haskell I would use deriving (Enum, Bounded) like suggested in the following StackOverflow question: Getting a list of all possible data type values in Haskell
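Concretely, the Haskell version would be roughly this (a sketch):

data Operator = Add | Substract | Multiply | Divide
  deriving (Show, Enum, Bounded)

-- every constructor, derived rather than listed by hand
allOperators :: [Operator]
allOperators = [minBound .. maxBound]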
Unfortunately, there doesn't seem to be such a mechanism in Idris, as suggested by Issue #19. There is some ongoing work by David Christiansen on the question, so hopefully the situation will improve in the future: david-christiansen/derive-all-the-instances
Coming from Scala, I am used to listing the elements manually, so I pretty naturally came up with the following:
Operators : Vect 4 Operator
Operators = [Add, Substract, Multiply, Divide]
To make sure that Operators contains all the elements, I added the following proof:
total
opInOps : Elem op Operators
opInOps {op = Add} = Here
opInOps {op = Substract} = There Here
opInOps {op = Multiply} = There (There Here)
opInOps {op = Divide} = There (There (There Here))
so that if I add an element to Operator without adding it to Operators, the totality checker complains:
Parsers.opInOps is not total as there are missing cases
It does the job but it is a lot of boilerplate.
Did I miss something? Is there a better way of doing it?
One option is to use the language's elaborator reflection feature to get the list of all constructors.
Here is a fairly naive approach to this particular problem (I'm posting it because the documentation is very scarce at the moment):
%language ElabReflection
data Operator = Add | Subtract | Multiply | Divide
constrsOfOperator : Elab ()
constrsOfOperator =
  do (MkDatatype _ _ _ constrs) <- lookupDatatypeExact `{Operator}
     loop $ map fst constrs
  where loop : List TTName -> Elab ()
        loop [] =
          do fill `([] : List Operator); solve
        loop (c :: cs) =
          do [x, xs] <- apply `(List.(::) : Operator -> List Operator -> List Operator) [False, False]
             solve
             focus x; fill (Var c); solve
             focus xs
             loop cs
allOperators : List Operator
allOperators = %runElab constrsOfOperator
A couple comments:
It seems that to solve this problem for any inductive datatype of a similar structure one would need to work through the Elaborator Reflection: Extending Idris in Idris paper.
Maybe the pruviloj library has something that might make solving this problem for a more general case easier.
What I've done (with some help from a friend) is create a function that takes a List, an Int for the index, and a function to be applied to the element at that index. It's similar to map, but instead of applying the function to every element, it applies it to only one element.
So my questions are:
Does this function already exist in the core somewhere? We couldn't find it.
If not, is there a better way of accomplishing this than how we have done it?
Here's the code:
import Html exposing (text)

main =
    let
        m = { arr = [ 1, 5, 3 ], msg = "" }
    in
        text (toString (getDisplay m 4 (\x -> x + 5)))

type alias Model =
    { arr : List Int
    , msg : String
    }

getDisplay : Model -> Int -> (Int -> Int) -> Model
getDisplay model i f =
    let
        m = changeAt model.arr i f
    in
        case m of
            Ok val ->
                { model | arr = val, msg = "" }

            Err err ->
                { model | arr = [], msg = err }

changeAt : List a -> Int -> (a -> a) -> Result String (List a)
changeAt l i func =
    let
        f j x =
            if j == i then func x else x
    in
        if i < List.length l && i >= 0 then
            Ok (List.indexedMap f l)
        else
            Err "Bad index"
NOTE: Elm discourages indexing into Lists, as they are linked lists under the hood: to retrieve the 1001st element, you first have to visit all 1000 previous elements. Nonetheless, if you want to do it, this is one way.
List.indexedMap is a good way to do what you're describing.
However, since you mention the downside of having to visit all preceding elements in a list, the reality in your example is actually a little worse than that, if performance really is a concern.
Your list is traversed fully at least twice, regardless of whether the index exists or not. The simple act of asking for the length of a linked list has to traverse the entire list; check out the source code, where length is implemented in terms of a foldl.
I say "at least once" because the source of indexedMap also calls length in addition to using map. If we're lucky, the length call is memoized (I'm not familiar enough with Elm internals to know whether it is or not, hence the "at least"). The map itself traverses the entire list when called, unlike Haskell, which evaluates things lazily, only as much as necessary.
And if you use indexedMap, the whole list is indexed regardless of the position you are interested in. That is, even if you want to apply the function at index zero, the entire list is indexed.
If you actually want to reduce the number of traversals to a minimum, you're going to (at this time) have to implement your own function and you'll have to do it without relying on length or indexedMap.
Here is an example of a changeAt function which avoids unnecessary traversals: as soon as it finds the position, it stops traversing the list.
changeAt : List a -> Int -> (a -> a) -> Result String (List a)
changeAt l i func =
    if i < 0 then
        Err "Bad Index"
    else
        case l of
            [] ->
                Err "Not found"

            x :: xs ->
                if i == 0 then
                    Ok <| func x :: xs
                else
                    Result.map ((::) x) <| changeAt xs (i - 1) func
It's not terribly pretty, but if you want to avoid unnecessarily walking through the list multiple times, then you might want to go with something like this.
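A few quick boundary checks of how this version behaves (my own examples, not from the original answer):

boundaryChecks : List (Result String (List Int))
boundaryChecks =
    [ changeAt [ 1, 5, 3 ] 1 ((+) 5)          -- Ok [1,10,3]
    , changeAt [ 1, 5, 3 ] 3 ((+) 5)          -- Err "Not found"
    , changeAt [ 1, 5, 3 ] (negate 1) ((+) 5) -- Err "Bad Index"
    ]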
You're looking for the set function for Arrays. Instead of using a List, which is inefficient as you described, this structure is better suited to your use case.
Here's an efficient implementation of the function you're looking for:
import Array exposing (Array, get, set)

changeAt : Int -> (a -> a) -> Array a -> Array a
changeAt i f array =
    case get i array of
        Just item ->
            set i (f item) array

        Nothing ->
            array
Also note that the data structure is the last argument in this implementation.
Array is mentioned in the link in your question, but nobody on this thread had explicitly mentioned this option yet.
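For completeness, a rough usage sketch building on the block above (Array.fromList and Array.toList are the standard conversions between lists and arrays):

bumped : List Int
bumped =
    Array.fromList [ 1, 5, 3 ]
        |> changeAt 2 (\x -> x + 5)
        |> Array.toList
-- [ 1, 5, 8 ]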
Give a grammar for the following language: {0^n w 1^n | n >= 0, w is in {0,1}*, and |w| = n}.
Attempt at solution:
S--> 0S1|R
R--> 0R|1R|empty
I'm not sure how to guarantee that the length of the middle part w (generated by R) is the same as the number of 0's or 1's.
Here goes nothing.
Every word should look like this: 0^n w 1^n. So if we have the rule S -> 0S1, every sentential form we can generate from it looks like 0^n S 1^n.
Well... that is not quite what we want, is it?
We go a step further: we want some variable V involved in the rules V -> 0|1 (edit: this would have been our "goal", but things could go wrong, so we don't use these two rules), which gives us the possibility to use S -> 0SV1 instead of S -> 0S1. What does that change?
Now we get sentential forms like these: 0^n S (V1)^n. So, for example, 000SV1V1V1 would be such a sentential form. One minor addition we definitely need now is the rule S -> empty.
Still not quite there, though; after all, we want it to look more like 0^n V^n 1^n in the end. So we add a rule which swaps V and 1: we add 1V -> V1 to the set of rules. What are the possibilities now? Given a sentential form like 000SV1V1V1, we can now move all the V's to the left and all the 1's to the right.
And now we get really strict. We don't want anything to go wrong, so we make some minor changes: we swap every occurrence of S we had so far for a T, so S -> 0SV1 becomes T -> 0TV1, et cetera. Also, we add the rules S -> empty | T and we remove the rule T -> empty that the renaming would otherwise have given us. What do we gain from this? Well... nothing at first sight. BUT now we can build a mechanism which makes sure nothing can go wrong when we turn the V's into 1's and 0's.
We simply add the rules TV -> C and CV -> CC. Yes, that is a lot of rules.
Now, given a sentential form 0^n T V^n 1^n, we can slowly transform it into 0^n C^n 1^n. What's the use?
Nothing can go wrong even if we have not pushed all the V's to the left.
So: a sentential form like 0000CCC1V111 can do no harm to our cause, because we cannot do anything about the V unless it sits directly to the right of a C, and we have no way of pushing the C's around, since there is no such rule. Also, since we're going to add the rules C -> 0|1, if we prematurely turn the C's into 1's and 0's, we can never finish the word while a V is still floating around.
This might not be necessary at all, I'm not sure about that, BUT it is part of our argument that every word we can create is in the set of words we want this grammar to specify.
The rules are:
S -> empty | T
T -> 0TV1
1V -> V1
TV -> C
CV -> CC
C -> 0|1
Edit:
This is a Type-0 grammar, though, since the rule TV -> C shrinks the sentential form.
With some changes, it can be turned into an equivalent CSG:
S -> empty | T
T -> 0TV1 | 0C1
1V -> V1
CV -> CC
C -> 0 | 1
The main difference is that at some point we can decide to stop expanding with 0TV1 and instead finish up with 0C1, getting a form like 0^n C1 (V1)^(n-1). And again, if we prematurely transform all C's into 0's and 1's, we lose the possibility to remove all the V's. So this should also generate the set we're looking for.
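As a sanity check, here is one derivation of 001011 (n = 2, w = 10) in the CSG variant, with the rule used at each step:

S => T
  => 0TV1     (T -> 0TV1)
  => 00C1V1   (T -> 0C1)
  => 00CV11   (1V -> V1)
  => 00CC11   (CV -> CC)
  => 001C11   (C -> 1)
  => 001011   (C -> 0)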
This is also my first answer on Stack Overflow, and since I quite like computer science theory, I hope my explanations are not wrong. If they are, tell me.
I've been learning F# over the past few days, writing a small project which, at last, works (with the help of SO, of course).
I'm trying to learn to be as idiomatic as possible, which basically means that I try to not mutate my data structures. This is costing me a lot of effort :-)
In my search for idiomatic functional programming, I have been trying to use lists, tuples and records as much as possible, rather than objects. But then "practicality beats purity", and so I'm rewriting my small project using objects this time.
I thought that you could give me some advice, surely my idea of "good functional programming design" is not yet very well defined.
For instance I have to modify the nodes of a tree, modifying at the same time the states at two different levels (L and L+1). I've been able to do that without mutating data, but I needed a lot of "inner" and "helper" functions, with accumulators and so on. The nice feeling of being able to clearly express the algorithm was lost for me, due to the need to modify my data structure in an involved way. This is extremely easy in imperative languages, for instance: just dereference the pointers to the relevant nodes, modify their state and move on.
Surely I have not designed my structure properly, and for this reason I'm now trying the OOP approach.
I've looked at SICP and at How to Design Programs, and I've found a thesis by C. Okasaki ("Purely Functional Data Structures"), but the examples in SICP and HTDP are similar to what I did, or maybe I'm not able to understand them fully. The thesis, on the other hand, is a bit too hard for me at the moment :-)
What do you think about this "tension" which I am experiencing? Am I interpreting the "never mutate data" too strictly? Could you suggest some resources?
Thanks in advance,
Francesco
When it comes to 'tree update', I think you can always do it pretty elegantly
using catamorphisms (folds over trees). I have a long blog series about this,
and most of the example code below comes from part 4 of the series.
When first learning, I find it best to focus on a particular small, concrete
problem statement. Based on your description, I invented the following problem:
You have a binary tree, where each node contains a "name" and an "amount" (you can think of it like bank accounts or some such). And I want to write a function which can tell someone to "steal" a certain amount from each of its direct children. Here's a picture to describe what I mean:
(Picture: FunWithTrees.png, showing the original tree, the tree after 'D' steals 10, and the tree after 'F' steals 30; image originally hosted at http://oljksa.bay.livefilestore.com/y1pNWjpCPP6MbI3rMfutskkTveCWVEns5xXaOf-NZlIz2Hs_CowykUmwtlVV7bPXRwh4WHJMT-5hSuGVZEhmAIPuw/FunWithTrees.png)
On the left is the original tree. The middle example shows the result I want if node 'D' is asked to steal '10' from each of its children, and the right example shows the desired result if instead 'F' is asked to steal '30' from the original tree.
Note that the tree structure I use will be immutable, and the red colors in the
diagram designate "new tree nodes" relative to the original tree. That is black
nodes are shared with the original tree structure (Object.ReferenceEquals to one
another).
Now, assuming a typical tree structure like
type Tree<'T> =
    | Node of 'T * Tree<'T> * Tree<'T>
    | Leaf
we'd represent the original tree as
let origTree = Node(("D",1000),
                    Node(("B",1000),
                         Node(("A",1000),Leaf,Leaf),
                         Node(("C",1000),Leaf,Leaf)),
                    Node(("F",1000),
                         Node(("E",1000),Leaf,Leaf),
                         Leaf))
and the "Steal" function is really easy to write, assuming you have the usual "fold"
boilerplate:
// have 'stealerName' take 'amount' from each of its children and
// add it to its own value
let Steal stealerName amount tree =
    let Subtract amount = function
        | Node((name,value),l,r) -> amount, Node((name,value-amount),l,r)
        | Leaf -> 0, Leaf
    tree |> XFoldTree
                (fun (name,value) left right ->
                    if name = stealerName then
                        let leftAmt, newLeft = Subtract amount left
                        let rightAmt, newRight = Subtract amount right
                        XNode((name,value+leftAmt+rightAmt),newLeft,newRight)
                    else
                        XNode((name,value), left, right))
                XLeaf
// examples
let dSteals10 = Steal "D" 10 origTree
let fSteals30 = Steal "F" 30 origTree
That's it, you're done, you've written an algorithm that "updates" levels L and
L+1 of an immutable tree just by authoring the core logic. Rather than explain
it all here, you should go read my blog series (at least the start: parts one, two, three, and four).
Here's all the code (that drew the picture above):
// Tree boilerplate
// See http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!248.entry
type Tree<'T> =
    | Node of 'T * Tree<'T> * Tree<'T>
    | Leaf

let (===) x y = obj.ReferenceEquals(x,y)

let XFoldTree nodeF leafV tree =
    let rec Loop t cont =
        match t with
        | Node(x,left,right) ->
            Loop left (fun lacc ->
                Loop right (fun racc ->
                    cont (nodeF x lacc racc t)))
        | Leaf -> cont (leafV t)
    Loop tree (fun x -> x)

let XNode (x,l,r) (Node(xo,lo,ro) as orig) =
    if xo = x && lo === l && ro === r then
        orig
    else
        Node(x,l,r)

let XLeaf (Leaf as orig) =
    orig

let FoldTree nodeF leafV tree =
    XFoldTree (fun x l r _ -> nodeF x l r) (fun _ -> leafV) tree
// /////////////////////////////////////////
// stuff specific to this problem
let origTree = Node(("D",1000),
                    Node(("B",1000),
                         Node(("A",1000),Leaf,Leaf),
                         Node(("C",1000),Leaf,Leaf)),
                    Node(("F",1000),
                         Node(("E",1000),Leaf,Leaf),
                         Leaf))
// have 'stealerName' take 'amount' from each of its children and
// add it to its own value
let Steal stealerName amount tree =
    let Subtract amount = function
        | Node((name,value),l,r) -> amount, Node((name,value-amount),l,r)
        | Leaf -> 0, Leaf
    tree |> XFoldTree
                (fun (name,value) left right ->
                    if name = stealerName then
                        let leftAmt, newLeft = Subtract amount left
                        let rightAmt, newRight = Subtract amount right
                        XNode((name,value+leftAmt+rightAmt),newLeft,newRight)
                    else
                        XNode((name,value), left, right))
                XLeaf
let dSteals10 = Steal "D" 10 origTree
let fSteals30 = Steal "F" 30 origTree
// /////////////////////////////////////////
// once again,
// see http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!248.entry
// DiffTree: Tree<'T> * Tree<'T> -> Tree<'T * bool>
// return second tree with extra bool
// the bool signifies whether the Node "ReferenceEquals" the first tree
let rec DiffTree(tree,tree2) =
    XFoldTree
        (fun x l r t t2 ->
            let (Node(x2,l2,r2)) = t2
            Node((x2,t===t2), l l2, r r2))
        (fun _ _ -> Leaf)
        tree
        tree2
open System.Windows
open System.Windows.Controls
open System.Windows.Input
open System.Windows.Media
open System.Windows.Shapes
// Handy functions to make multiple transforms be a more fluent interface
let IdentT() = new TransformGroup()
let AddT t (tg : TransformGroup) = tg.Children.Add(t); tg
let ScaleT x y (tg : TransformGroup) = tg.Children.Add(new ScaleTransform(x, y)); tg
let TranslateT x y (tg : TransformGroup) = tg.Children.Add(new TranslateTransform(x, y)); tg
// Draw: Canvas -> Tree<'T * bool> -> unit
let Draw (canvas : Canvas) tree =
    // assumes canvas is normalized to 1.0 x 1.0
    FoldTree
        (fun ((name,value),b) l r trans ->
            // current node in top half, centered left-to-right
            let tb = new TextBox(Width=100.0, Height=100.0, FontSize=30.0, Text=sprintf "%s:%d" name value,
                                 // the tree is a "diff tree" where the bool represents
                                 // "ReferenceEquals" differences, so color diffs Red
                                 Foreground=(if b then Brushes.Black else Brushes.Red),
                                 HorizontalContentAlignment=HorizontalAlignment.Center,
                                 VerticalContentAlignment=VerticalAlignment.Center)
            tb.RenderTransform <- IdentT() |> ScaleT 0.005 0.005 |> TranslateT 0.25 0.0 |> AddT trans
            canvas.Children.Add(tb) |> ignore
            // left child in bottom-left quadrant
            l (IdentT() |> ScaleT 0.5 0.5 |> TranslateT 0.0 0.5 |> AddT trans)
            // right child in bottom-right quadrant
            r (IdentT() |> ScaleT 0.5 0.5 |> TranslateT 0.5 0.5 |> AddT trans))
        (fun _ -> ())
        tree
        (IdentT())
let TreeToCanvas tree =
    let canvas = new Canvas(Width=1.0, Height=1.0, Background = Brushes.Blue,
                            LayoutTransform = new ScaleTransform(400.0, 400.0))
    Draw canvas tree
    canvas

let TitledControl title control =
    let grid = new Grid()
    grid.ColumnDefinitions.Add(new ColumnDefinition())
    grid.RowDefinitions.Add(new RowDefinition())
    grid.RowDefinitions.Add(new RowDefinition())
    let text = new TextBlock(Text = title, HorizontalAlignment = HorizontalAlignment.Center)
    Grid.SetRow(text, 0)
    Grid.SetColumn(text, 0)
    grid.Children.Add(text) |> ignore
    Grid.SetRow(control, 1)
    Grid.SetColumn(control, 0)
    grid.Children.Add(control) |> ignore
    grid

let HorizontalGrid (controls:_[]) =
    let grid = new Grid()
    grid.RowDefinitions.Add(new RowDefinition())
    for i in 0..controls.Length-1 do
        let c = controls.[i]
        grid.ColumnDefinitions.Add(new ColumnDefinition())
        Grid.SetRow(c, 0)
        Grid.SetColumn(c, i)
        grid.Children.Add(c) |> ignore
    grid
type MyWPFWindow(content, title) as this =
    inherit Window()
    do
        this.Content <- content
        this.Title <- title
        this.SizeToContent <- SizeToContent.WidthAndHeight

[<System.STAThread()>]
do
    let app = new Application()
    let controls =
        [| TitledControl "Original" (TreeToCanvas(DiffTree(origTree,origTree)))
           TitledControl "D steals 10" (TreeToCanvas(DiffTree(origTree,dSteals10)))
           TitledControl "F steals 30" (TreeToCanvas(DiffTree(origTree,fSteals30))) |]
    app.Run(new MyWPFWindow(HorizontalGrid controls, "Fun with trees")) |> ignore
I guess if you start your sentence with "I have to modify the nodes of a tree, modifying at the same time the states at two different levels", then you're not really tackling your problem in a functional way. It's like writing a paper in a foreign language by first writing it in your mother tongue and then trying to translate: it doesn't really work. I know it hurts, but in my opinion it's better to immerse yourself completely. Don't worry about comparing the approaches just yet.
One way I've found to learn "the functional way" is to look at (and implement yourself!) some functional pearls. They're basically well-documented, uber-functional, elegant programs that solve a variety of problems. Start with the older ones, and don't be afraid to stop reading and try another one if you don't get it. Just come back to it later with renewed enthusiasm and more experience. It helps :)
What do you think about this "tension" which I am experiencing? Am I interpreting the "never mutate data" too strictly? Could you suggest some resources?
In my opinion, if you're learning functional programming for the first time, it's best to start out with zero mutable state. Otherwise, you'll only end up falling back on mutable state as your first resort, and all of your F# code will be C# with slightly different syntax.
Regarding data structures, some are easier to express in a functional style than others. Could you provide a description of how you're trying to modify your tree?
For now, I would recommend F# Wikibook's page on data structures to see how data structures are written in a functional style.
I've looked at SICP and at How to Design Programs, and I've found a thesis by C. Okasaki ("Purely Functional Data Structures")
I personally found Okasaki's book more readable than the thesis online.
I have to modify nodes of a tree.
No you don't. That's your problem right there.
This is costing me a lot of effort
This is typical. It's not easy to learn to program with immutable data structures. And to most beginners, it seems unnatural at first. It's doubly difficult because HTDP and SICP don't give you good models to follow (see footnote).
I thought that you could give me some advice, surely my idea of "good functional programming design" is not yet very well defined.
We can, but you have to tell us what the problem is. Then many people on this forum can tell you if it is the sort of problem whose solution can be expressed in a clear way without resorting to mutation. Most tree problems can. But with the information you've given us, there's no way for us to tell.
Am I interpreting the "never mutate data" too strictly?
Not strictly enough, I'd say.
Please post a question indicating what problem you're trying to solve.
Footnote: both HTDP and SICP are done in Scheme, which lacks pattern matching. In this setting it is much harder to understand tree-manipulation code than it is using the pattern matching provided by F#. As far as I'm concerned, pattern matching is an essential feature for writing clear code in a purely functional style. For resources, you might consider Graham Hutton's new book on Programming in Haskell.
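To make that concrete: with pattern matching, a tree traversal in F# reads almost like the data definition itself. A tiny sketch of my own, using the Tree of (string * int) shape from the earlier answer:

// Sum the integer payloads of a Tree<string * int> by matching on its shape.
let rec sumValues tree =
    match tree with
    | Leaf -> 0
    | Node ((_, value), left, right) -> value + sumValues left + sumValues right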
Take a look at the Zipper data structure.
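Roughly, a zipper pairs the subtree you are currently focused on with the path back to the root, so "editing a node" is just rebuilding that path. A minimal sketch for the binary tree type used elsewhere in this thread (my own illustration, not a library API):

type Tree<'T> =
    | Node of 'T * Tree<'T> * Tree<'T>
    | Leaf

// A crumb records what surrounds the current focus: the parent's value,
// the sibling we did not descend into, and which side we came from.
type Crumb<'T> =
    | WentLeft of 'T * Tree<'T>    // parent value, right sibling
    | WentRight of 'T * Tree<'T>   // parent value, left sibling

type Zipper<'T> = Tree<'T> * Crumb<'T> list

let goLeft (focus, crumbs) =
    match focus with
    | Node(x, l, r) -> Some (l, WentLeft(x, r) :: crumbs)
    | Leaf -> None

// goRight is symmetric, storing WentRight(x, l) instead.

let goUp (focus, crumbs) =
    match crumbs with
    | WentLeft(x, r) :: rest -> Some (Node(x, focus, r), rest)
    | WentRight(x, l) :: rest -> Some (Node(x, l, focus), rest)
    | [] -> None

// Replace the focused subtree; only the spine back to the root gets rebuilt on the way up.
let update f (focus, crumbs) = (f focus, crumbs)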
For instance I have to modify the nodes of a tree, modifying at the same time the states at two different levels (L and L+1)
Why? In a functional language, you'd create a new tree instead. It can reuse the subtrees which don't need to be modified, and just plug them into a newly created root. "Don't mutate data" doesn't mean "try to mutate data without anyone noticing, by adding so many helper methods that no one realizes that this is what you're doing".
It means "don't mutate your data. Create new copies instead, which are initialized with the new, correct, values".