How can I filter out consecutive repeats in a sequence? - sequence

I'm trying to generate a Seq of random options where consecutive repeats are not allowed:
> (<F R U>.roll(*).grep(* ne *))[^10]
((R F) (F U) (R F) (F R) (U R) (R F) (R F) (U F) (R U) (U R))
Why does using grep in this way result in nested pairs? And how can I achieve what I'm looking for? Using .flat above will still allow consecutive repeats.
( R U F U F U R F R U ... )

how can I achieve what I'm looking for?
I believe the squish method will do what you want:
say <F R U>.roll(*).squish.head(20)
This produces:
(U R F U F R F R F U F R U R F U R U R F)
Why does using grep in this way result in nested pairs?
Because grep (and map too) on something with an arity greater than 1 will work by breaking the list into chunks of that size. Thus given A B C D, the first call to the grep block gets A B, the second C D, and so on. What's really needed in this case, however, is a lagged value. One could use a state variable in the block passed to grep to achieve this, however for this problem squish is a much simpler solution.

You could employ a simple regex using the <same> zero-width matcher. Below in the Raku REPL:
> my $a = (<F R U>.roll(*))[^20].comb(/\w/).join; say $a;
RRFUURFFFFRRFRFUFUUU
> say $/ if $a ~~ m:g/ \w <!same> /;
(「R」 「F」 「U」 「R」 「F」 「R」 「F」 「R」 「F」 「U」 「F」 「U」)
https://docs.raku.org/language/regexes#Predefined_Regexes

Related

What does \m mean in expression "l = \m -> ..."?

semOp l = \m -> case l of
LD g -> case m of
St xs -> St (g::xs)
_ -> Error
Just want to know what does the \m part is doing here.
\m -> ... is syntax for anonymous function, aka "lambda-expression".
For example, the following two declarations would be equivalent:
f x = x + 5
f = \x -> x + 5
Both define a function with one parameter x that returns a number greater than x by 5.
I wanted to add to Fyodor’s answer by saying that this is a slightly unusual way to write this - it’s making explicit that semOp l returns a lambda. The following are all valid and equivalent ways to write this function’s beginning:
semOp l m = case l of ..
semOp l = \m -> case l of ..
semOp = \l -> \m -> case l of ..
semOp = \l m -> case l of ..
You might be most used to the first version, but in some languages like Haskell these can have slightly difference performance impacts - but this isn’t something for you to worry about, particularly in Elm as far as I understand.

Construct a grammar for a language

I have a question regarding this question:
L= empty where the alphabet is {a,b}
how to create a grammar for this ? how can be the production rule ?
thanks in advance
A grammar G is an ordered 4-tuple {S, N, E, e, P} where:
N is a set of non-terminal symbols
E is a set of terminal symbols
N and E are disjoint
E is a superset of the alphabet of L(G)
e is the empty string
P is a set of ordered pairs of elements of (N U E U e); that is, P is a subset of (N U E U e) X (N U E U e)*.
S, the start symbol, is in N
A derivation in G is a sequence of elements of (N U E U e)* such that:
The first element is S
Adjacent elements w[i] and w[i+1] can be written as w[i] = uxv and w[i+1] = uyv such that (x, y) is in P
If there is a derivation in G whose last element is a string w[n] over (E U e)*, we say G generates w[n]; that is, w[n] is in L(G).
Now, we want to define a grammar G such that L(G) is the empty set. We fix the alphabet E = {a, b}. We must still define:
N, the set of nonterminals
S, the start symbol
P, the productions
We might as well take S as our start symbol. So N contains at least S; N is a superset of {S}. We will only add more nonterminals if we determine we need them. Let us turn our attention to the condition that L(G) is empty.
If L(G) is empty, that means there is no derivation in G that leads to a string of only terminal symbols. We can accomplish this easily be ensuring all our productions produce at least one nonterminal with any terminal. Or produce no terminals at all. So the following grammars would all work:
S := S
or
S := aSb
or
S := aXb | XXSSX
X := aabbXbbaaS
etc. All of these grammars have L(G) empty since none of them can derive a string of nonterminals.

vector reflexivity under setoid equality using CoRN MathClasses

I have a simple lemma:
Lemma map2_comm: forall A (f:A->A->B) n (a b:t A n),
(forall x y, (f x y) = (f y x)) -> map2 f a b = map2 f b a.
which I was able to prove using standard equality (≡). Now I am need to prove the similar lemma using setoid equality (using CoRN MathClasses). I am new to this library and type classes in general and having difficulty doing so. My first attempt is:
Lemma map2_setoid_comm `{Equiv B} `{Equiv (t B n)} `{Commutative B A}:
forall (a b: t A n),
map2 f a b = map2 f b a.
Proof.
intros.
induction n.
dep_destruct a.
dep_destruct b.
simpl.
(here '=' is 'equiv'). After 'simpl' the goal is "(nil B)=(nil B)" or "[]=[]" using VectorNotations. Normally I would finish it using 'reflexivity' tactics but it gives me:
Tactic failure: The relation equiv is not a declared reflexive relation. Maybe you need to require the Setoid library.
I guess I need somehow to define reflexivity for vector types, but I am not sure how to do that. Please advise.
First of all the lemma definition needs to be adjusted to:
Lemma map2_setoid_comm : forall `{CO:Commutative B A f} `{SB: !Setoid B} ,
forall n:nat, Commutative (map2 f (n:=n)).
To be able to use reflexivity:
Definition vec_equiv `{Equiv A} {n}: relation (vector A n) := Vforall2 (n:=n) equiv.
Instance vec_Equiv `{Equiv A} {n}: Equiv (vector A n) := vec_equiv.

Optimizing partial computation in Haskell

I'm curious how to optimize this code :
fun n = (sum l, f $ f0 l, g $ g0 l)
where l = map h [1..n]
Assuming that f, f0, g, g0, and h are all costly, but the creation and storage of l is extremely expensive.
As written, l is stored until the returned tuple is fully evaluated or garbage collected. Instead, length l, f0 l, and g0 l should all be executed whenever any one of them is executed, but f and g should be delayed.
It appears this behavior could be fixed by writing :
fun n = a `seq` b `seq` c `seq` (a, f b, g c)
where
l = map h [1..n]
a = sum l
b = inline f0 $ l
c = inline g0 $ l
Or the very similar :
fun n = (a,b,c) `deepSeq` (a, f b, g c)
where ...
We could perhaps specify a bunch of internal types to achieve the same effects as well, which looks painful. Are there any other options?
Also, I'm obviously hoping with my inlines that the compiler fuses sum, f0, and g0 into a single loop that constructs and consumes l term by term. I could make this explicit through manual inlining, but that'd suck. Are there ways to explicitly prevent the list l from ever being created and/or compel inlining? Pragmas that produce warnings or errors if inlining or fusion fail during compilation perhaps?
As an aside, I'm curious about why seq, inline, lazy, etc. are all defined to by let x = x in x in the Prelude. Is this simply to give them a definition for the compiler to override?
If you want to be sure, the only way is to do it yourself. For any given compiler version, you can try out several source-formulations and check the generated core/assembly/llvm byte-code/whatever whether it does what you want. But that could break with each new compiler version.
If you write
fun n = a `seq` b `seq` c `seq` (a, f b, g c)
where
l = map h [1..n]
a = sum l
b = inline f0 $ l
c = inline g0 $ l
or the deepseq version thereof, the compiler might be able to merge the computations of a, b and c to be performed in parallel (not in the concurrency sense) during a single traversal of l, but for the time being, I'm rather convinced that GHC doesn't, and I'd be surprised if JHC or UHC did. And for that the structure of computing b and c needs to be simple enough.
The only way to obtain the desired result portably across compilers and compiler versions is to do it yourself. For the next few years, at least.
Depending on f0 and g0, it might be as simple as doing a strict left fold with appropriate accumulator type and combining function, like the famous average
data P = P {-# UNPACK #-} !Int {-# UNPACK #-} !Double
average :: [Double] -> Double
average = ratio . foldl' count (P 0 0)
where
ratio (P n s) = s / fromIntegral n
count (P n s) x = P (n+1) (s+x)
but if the structure of f0 and/or g0 doesn't fit, say one's a left fold and the other a right fold, it may be impossible to do the computation in one traversal. In such cases, the choice is between recreating l and storing l. Storing l is easy to achieve with explicit sharing (where l = map h [1..n]), but recreating it may be difficult to achieve if the compiler does some common subexpression elimination (unfortunately, GHC does have a tendency to share lists of that form, even though it does little CSE). For GHC, the flags fno-cse and -fno-full-laziness can help avoiding unwanted sharing.

Haskell Heap Issues with Parameter Passing Style

Here's a simple program that blows my heap to Kingdom Come:
intersect n k z s rs c
| c == 23 = rs
| x == y = intersect (n+1) (k+1) (z+1) (z+s) (f : rs) (c+1)
| x < y = intersect (n+1) k (z+1) s rs c
| otherwise = intersect n (k+1) z s rs c
where x = (2*n*n) + 4 * n
y = (k * k + k )
f = (z, (x `div` 2), (z+s))
p = intersect 1 1 1 0 [] 0
main = do
putStr (show p)
What the program does is calculate the intersection of two infinite series, stopping when it reaches 23 elements. But that's not important to me.
What's interesting is that as far as I can tell, there shouldn't be much here that is sitting on the heap. The function intersect is recursives with all recursions written as tail calls. State is accumulated in the arguments, and there is not much of it. 5 integers and a small list of tuples.
If I were a betting person, I would bet that somehow thunks are being built up in the arguments as I do the recursion, particularly on arguments that aren't evaluated on a given recursion. But that's just a wild hunch.
What's the true problem here? And how does one fix it?
If you have a problem with the heap, run the heap profiler, like so:
$ ghc -O2 --make A.hs -prof -auto-all -rtsopts -fforce-recomp
[1 of 1] Compiling Main ( A.hs, A.o )
Linking A.exe ...
Which when run:
$ ./A.exe +RTS -M1G -hy
Produces an A.hp output file:
$ hp2ps -c A.hp
Like so:
So your heap is full of Integer, which indicates some problem in the accumulating parameters of your functions -- where all the Integers are.
Modifying the function so that it is strict in the lazy Integer arguments (based on the fact you never inspect their value), like so:
{-# LANGUAGE BangPatterns #-}
intersect n k !z !s rs c
| c == 23 = rs
| x == y = intersect (n+1) (k+1) (z+1) (z+s) (f : rs) (c+1)
| x < y = intersect (n+1) k (z+1) s rs c
| otherwise = intersect n (k+1) z s rs c
where x = (2*n*n) + 4 * n
y = (k * k + k )
f = (z, (x `div` 2), (z+s))
p = intersect 1 1 1 0 [] 0
main = do
putStr (show p)
And your program now runs in constant space with the list of arguments you're producing (though doesn't terminate for c == 23 in any reasonable time).
If it is OK to get the resulting list reversed, you can take advantage of Haskell's laziness and return the list as it is computed, instead of passing it recursively as an accumulating argument. Not only does this let you consume and print the list as it is being computed (thereby eliminating one space leak right there), you can also factor out the decision about how many elements you want from intersect:
{-# LANGUAGE BangPatterns #-}
intersect n k !z s
| x == y = f : intersect (n+1) (k+1) (z+1) (z+s)
| x < y = intersect (n+1) k (z+1) s
| otherwise = intersect n (k+1) z s
where x = (2*n*n) + 4 * n
y = (k * k + k )
f = (z, (x `div` 2), (z+s))
p = intersect 1 1 1 0
main = do
putStrLn (unlines (map show (take 23 p)))
As Don noted, we need to be careful so that accumulating arguments evaluate timely instead of building up big thunks. By making the argument z strict we ensure that all arguments will be demanded.
By outputting one element per line, we can watch the result being produced:
$ ghc -O2 intersect.hs && ./intersect
[1 of 1] Compiling Main ( intersect.hs, intersect.o )
Linking intersect ...
(1,3,1)
(3,15,4)
(10,120,14)
(22,528,36)
(63,4095,99)
(133,17955,232)
(372,139128,604)
(780,609960,1384)
...