How can I use a non-caching infinite lazy list in Perl 6 - raku

Infinite lazy lists are awesome!
> my @fibo = 0, 1, *+* ... *;
> say @fibo[1000];
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875
They automatically cache their values, which is handy ... most of the time.
But when working with huge Fibonacci numbers (example), this can cause memory issues.
Unfortunately, I can't figure out how to create a non-caching Fibonacci sequence. Anyone?

One major problem is you are storing it in an array, which of course keeps all of its values.
The next problem is a little more subtle: the dotty sequence generator syntax LIST, CODE ... END doesn't know how many of the previous values the CODE part is going to ask for, so it keeps all of them.
( It could look at the arity/count of the CODE, but it doesn't currently seem to from experiments at the REPL )
Then there is the problem that using &postcircumfix:<[ ]> on a Seq calls .cache on the assumption that you are going to ask for another value at some point.
( From looking at the source for Seq.AT-POS )
It's possible that a future implementation could be better at each of these drawbacks.
You could create the sequence using a different feature to get around the current limitations of the dotty sequence generator syntax.
sub fibonacci-seq () {
    gather {
        take my $a = 0;
        take my $b = 1;
        loop {
            take my $c = $a + $b;
            $a = $b;
            $b = $c;
        }
    }.lazy
}
If you are just iterating through the values you can just use it as is.
my $v;
for fibonacci-seq() {
if $_ > 1000 {
$v = $_;
last;
}
}
say $v;
my $count = 100000;
for fibonacci-seq() {
if $count-- <= 0 {
$v = $_;
last;
}
}
say chars $v; # 20899
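For readers comparing across languages, here is a minimal Python sketch of the same non-caching idea. It is not Raku and not part of the original answer; the names fibonacci_seq and fibonacci simply mirror the subs above. A generator only holds the last two values, so walking through it never accumulates earlier elements.

```python
from itertools import islice

def fibonacci_seq():
    """Yield Fibonacci numbers forever, keeping only the last two values."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

def fibonacci(n):
    """Return the n-th Fibonacci number (0-based) from a fresh generator."""
    # islice consumes and discards the first n values without storing them
    return next(islice(fibonacci_seq(), n, None))

print(fibonacci(10))              # 55
print(len(str(fibonacci(1000))))  # 209 digits
```

As with the Raku iterator version, every call has to start from a fresh generator, because a consumed value cannot be revisited.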
You could also use the Iterator directly. Though this isn't necessary in most circumstances.
sub fibonacci ( UInt $n ) {
# have to get a new iterator each time this is called
my \iterator = fibonacci-seq().iterator;
for ^$n {
return Nil if iterator.pull-one =:= IterationEnd;
}
my \result = iterator.pull-one;
result =:= IterationEnd ?? Nil !! result
}
If you have a recent enough version of Rakudo you can use skip-at-least-pull-one.
sub fibonacci ( UInt $n ) {
# have to get a new iterator each time this is called
my \result = fibonacci-seq().iterator.skip-at-least-pull-one($n);
result =:= IterationEnd ?? Nil !! result
}
You can also implement the Iterator class directly, wrapping it in a Seq.
( this is largely how methods that return sequences are written in the Rakudo core )
sub fibonacci-seq2 () {
Seq.new:
class :: does Iterator {
has Int $!a = 0;
has Int $!b = 1;
method pull-one {
my $current = $!a;
my $c = $!a + $!b;
$!a = $!b;
$!b = $c;
$current;
}
# indicate that this should never be eagerly iterated
# which is recommended on infinite generators
method is-lazy ( --> True ) {}
}.new
}

Apparently, a noob cannot comment.
When defining a lazy iterator such as sub fibonacci-seq2, one should mark the iterator as lazy by adding a "is-lazy" method that returns True, e.g.:
method is-lazy(--> True) { }
This will allow the system to detect possible infiniloops better.

Related

how to make a context aware code evaluator

I was looking at REPL-like evaluation of code from here and here, and tried to make a very small version for it, yet it fails:
use nqp;
class E {
has Mu $.compiler;
has $!save_ctx;
method evaluate(@fragments) {
for @fragments -> $code {
my $*MAIN_CTX;
my $*CTXSAVE := self;
$!compiler.eval($code,
outer_ctx => nqp::ctxcaller(nqp::ctx()));
if nqp::defined($*MAIN_CTX) {
$!save_ctx := $*MAIN_CTX;
}
}
}
method ctxsave(--> Nil) {
say "*in ctxsave*";
$*MAIN_CTX := nqp::ctxcaller(nqp::ctx());
$*CTXSAVE := 0;
}
}
my $e := E.new(compiler => nqp::getcomp("Raku"));
nqp::bindattr($e, E, '$!save_ctx', nqp::ctx());
$e.evaluate: ('say my @vals = 12, 3, 4;', 'say @vals.head');
I pieced together this from the above links without very much knowing what I'm doing :) When run, this happens:
*in ctxsave*
[12 3 4]
===SORRY!=== Error while compiling file.raku
Variable '@vals' is not declared. Did you mean '&val'?
file.raku:1
------> say ⏏@vals.head
with Rakudo v2022.04. First fragment was supposed to declare it (and prints it). Is it possible to do something like this, so it recognizes @vals as declared?
You can do it in pure Raku code, although depending on the not-exactly-official context parameter to EVAL.
# Let us use EVAL with user input
use MONKEY;
loop {
# The context starts out with a fresh environment
state $*REPL-CONTEXT = UNIT::;
# Get the next line of code to run.
my $next-code = prompt '> ';
# Evaluate it; note that exceptions with line numbers will be
# off by one, so may need fixups.
EVAL "\q'$*REPL-CONTEXT = ::;'\n$next-code", context => $*REPL-CONTEXT;
}
Trying it out:
$ raku simple-repl.raku
> my $x = 35;
> say $x;
35
> my $y = 7;
> say $x + $y;
42

IIFE alternatives in Raku

In Ruby I can group together some lines of code like so with a begin block:
x = begin
puts "Hi!"
a = 2
b = 3
a + b
end
puts x # 5
it's immediately evaluated and its value is the last value of the block (a + b here) (Javascripters do a similar thing with IIFEs)
What are the ways to do this in Raku? Is there anything smoother than:
my $x = ({
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
})();
say $x; # 5
Insert a do in front of the block. This tells Raku to:
Immediately do whatever follows the do on its right hand side;
Return the value to the do's left hand side:
my $x = do {
put "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
That said, one rarely needs to use do.
Instead, there are many other IIFE forms in Raku that just work naturally without fuss. I'll mention just two because they're used extensively in Raku code:
with whatever { .foo } else { .bar }
You might think I'm being silly, but those are two IIFEs. They form lexical scopes, have parameter lists, bind from arguments, the works. Loads of Raku constructs work like that.
In the above case where I haven't written an explicit parameter list, this isn't obvious. The fact that .foo is called on whatever if whatever is defined, and .bar is called on it if it isn't, is both implicit and due to the particular IIFE calling behavior of with.
See also if, while, given, and many, many more.
What's going on becomes more obvious if you introduce an explicit parameter list with ->:
for whatever -> $a, $b { say $a + $b }
That iterates whatever, binding two consecutive elements from it to $a and $b, until whatever is empty. If it has an odd number of elements, one might write:
for whatever -> $a, $b? { say $a + $b }
And so on.
Bottom line: a huge number of occurrences of {...} in Raku are IIFEs, even if they don't look like it. But if they're immediately after an =, Raku defaults to assuming you want to assign the lambda rather than immediately executing it, so you need to insert a do in that particular case.
Welcome to Raku!
my $x = BEGIN {
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
I guess the common ancestry of Raku and Ruby shows :-)
Also note that to create a constant, you can also use constant:
my constant $x = do {
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
If you can have a single statement, you can leave off the braces:
my $x = BEGIN 2 + 3;
or:
my constant $x = 2 + 3;
Regarding blocks: if they are in sink context (similar to "void" context in some languages), then they will execute just like that:
{
say "Executing block";
}
No need to explicitly call it: it will be called for you :-)

Why does a Perl 6 Str do the Positional role, and how can I change []?

I'm playing around with a positional interface for strings. I'm aware of How can I slice a string like Python does in Perl 6?, but I was curious if I could make this thing work just for giggles.
I came up with this example. Reading positions is fine, but I don't know how to set up the multi to handle an assignment:
multi postcircumfix:<[ ]> ( Str:D $s, Int:D $n --> Str ) {
    $s.substr: $n, 1
}
multi postcircumfix:<[ ]> ( Str:D $s, Range:D $r --> Str ) {
    $s.substr: $r.min, $r.max - $r.min + 1
}
multi postcircumfix:<[ ]> ( Str:D $s, List:D $i --> List ) {
    map( { $s.substr: $_, 1 }, @$i ).list
}
multi postcircumfix:<[ ]> ( Str:D $s, Int:D $n, *@a --> Str ) is rw {
    put "Calling rw version";
}
my $string = 'The quick, purple butterfly';
{ # Works
my $single = $string[0];
say $single;
}
{ # Works
my $substring = $string[5..9];
say $substring;
}
{ # Works
my $substring = $string[1,3,5,7];
say $substring;
}
{ # NOPE!
$string[2] = 'Perl';
say $string;
}
The last one doesn't work:
T
uick,
(h u c)
Index out of range. Is: 2, should be in 0..0
in block <unit> at substring.p6 line 36
Actually thrown at:
in block <unit> at substring.p6 line 36
I didn't think it would work, though. I don't know what signature or traits it should have to do what I want to do.
Why does the [] operator work on a Str?
$ perl6
> "some string"[0]
some string
The docs mostly imply that the [] works on things that do the Positional roles and that those things are in list like things. From the [] docs in operators:
Universal interface for positional access to zero or more elements of a @container, a.k.a. "array indexing operator".
But a Str surprisingly does the necessary role even though it's not an @container (as far as I know):
> "some string".does( 'Positional' )
True
Is there a way to test that something is an @container?
Is there a way to get something to list all of its roles?
Now, knowing that a string can respond to the [], how can I figure out what signature will match that? I want to know the right signature to use to define my own version to write to this string through [].
One way to achieve this, is by augmenting the Str class, since you really only need to override the AT-POS method (which Str normally inherits from Any):
use MONKEY;
augment class Str {
method AT-POS($a) {
self.substr($a,1);
}
}
say "abcde"[3]; # d
say "abcde"[^3]; # (a b c)
More information can be found here: https://docs.raku.org/language/subscripts#Methods_to_implement_for_positional_subscripting
To make your rw version work correctly, you first need to make the Str which might get mutated also rw, and it needs to return something which in turn is also rw. For the specific case of strings, you could simply do:
multi postcircumfix:<[ ]> ( Str:D $s is rw, Int:D $i --> Str ) is rw {
return $s.substr-rw: $i, 1;
}
Quite often, you'll want an rw subroutine to return an instance of Proxy:
multi postcircumfix:<[ ]> ( Str:D $s is rw, Int:D $i --> Str ) is rw {
    # FETCH receives the proxy itself; STORE receives the proxy and the new value
    Proxy.new: FETCH => sub ($) { $s.substr: $i, 1 },
               STORE => sub ($, $newval) { $s.substr-rw( $i, 1 ) = $newval }
}
Although I haven't yet seen production code which uses it, there is also a return-rw operator, which you'll occasionally need instead of return.
sub identity( $x is rw ) is rw { return-rw $x }
identity( my $y ) = 42; # Works, $y is 42.
sub identity-fail( $x is rw ) is rw { return $x }
identity-fail( my $z ) = 42; # Fails: "Cannot assign to a readonly variable or a value"
If a function reaches the end without executing a return or return-rw, or throwing an exception, the value of the last statement is returned, and (at present) this behaves as if it were preceded by return-rw.
sub identity2( $x is rw ) is rw { $x }
identity2( my $w ) = 42; # Works, $w is 42.
There's a module that aims to let you do this:
https://github.com/zoffixznet/perl6-Pythonic-Str
However:
This module does not provide Str.AT-POS or make Str type do Positional or Iterable roles. The latter causes all sorts of fallout with core and non-core code due to inherent assumptions that Str type does not do those roles. What this means in plain English is you can only index your strings with [...] postcircumfix operator and can't willy-nilly treat them as lists of characters—simply call .comb if you need that.

How does one write custom accessor methods in Perl6?

If I have this class:
class Wizard {
has Int $.mana is rw;
}
I can do this:
my Wizard $gandalf .= new;
$gandalf.mana = 150;
Let's say I want to add a little check to a setter in my Perl6 class without giving up the $gandalf.mana = 150; notation (in other words, I don't want to write this: $gandalf.setMana(150);). The program should die, if it tries to set a negative mana. How do I do this? The Perl6 documentation just mentions it is possible to write custom accessors, but does not say how.
With more recent versions of Rakudo there is a subset named UInt that restricts it to non-negative values.
class Wizard {
has UInt $.mana is rw;
}
So that you're not left in the lurch if you need to do something like this, here is how that is defined:
( you can leave off the my, but I wanted to show you the actual line from the Rakudo source )
my subset UInt of Int where * >= 0;
You could also do this:
class Wizard {
has Int $.mana is rw where * >= 0;
}
I would like to point out that the * >= 0 in the where constraint is just a short way to create a Callable.
You could have any of the following as a where constraint:
... where &subroutine # a subroutine that returns a true value for positive values
... where { $_ >= 0 }
... where -> $a { $a >= 0 }
... where { $^a >= 0 }
... where $_ >= 0 # statements also work ( 「$_」 is set to the value it's testing )
( If you wanted it to just not be zero you could also use ... where &prefix:<?> which is probably better spelled as ... where ?* or ... where * !== 0 )
If you feel like being annoying to people using your code you could also do this.
class Wizard {
has UInt $.mana is rw where Bool.pick; # accepts changes randomly
}
If you want to make sure the value "makes sense" when looking at all of the values in the class in aggregate, you will have to go to a lot more work.
( It may require a lot more knowledge of the implementation as well )
class Wizard {
has Int $.mana; # use . instead of ! for better `.perl` representation
# overwrite the method the attribute declaration added
method mana () is rw {
Proxy.new(
FETCH => -> $ { $!mana },
STORE => -> $, Int $new {
die 'invalid mana' unless $new >= 0; # placeholder for a better error
$!mana = $new
}
)
}
}
You can get the same accessor interface that saying $.mana provides by declaring a method is rw. Then you can wrap a proxy around the underlying attribute like so:
#!/usr/bin/env perl6
use v6;
use Test;
plan 2;
class Wizard {
has Int $!mana;
method mana() is rw {
return Proxy.new:
FETCH => sub ($) { return $!mana },
STORE => sub ($, $mana) {
die "It's over 9000!" if ($mana // 0) > 9000;
$!mana = $mana;
}
}
}
my Wizard $gandalf .= new;
$gandalf.mana = 150;
ok $gandalf.mana == 150, 'Updating mana works';
throws-like sub {
$gandalf.mana = 9001;
}, X::AdHoc, 'Too much mana is too much';
Proxy is basically a way to intercept read and write calls to storage and do something other than the default behavior. As their capitalization suggests, FETCH and STORE are called automatically by Perl to resolve expressions like $gandalf.mana = $gandalf.mana + 5.
There's a fuller discussion, including whether you should even attempt this, at PerlMonks. I would recommend against the above -- and public rw attributes in general. It's more a display of what it is possible to express in the language than a useful tool.

How do I write a generic memoize function?

I'm writing a function to find triangle numbers and the natural way to write it is recursively:
function triangle (x)
if x == 0 then return 0 end
return x+triangle(x-1)
end
But attempting to calculate the first 100,000 triangle numbers fails with a stack overflow after a while. This is an ideal function to memoize, but I want a solution that will memoize any function I pass to it.
Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:
triangle[0] = 0;
triangle[x_] := triangle[x] = x + triangle[x-1]
That's it. It works because the rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.
Of course, as has been pointed out, this example has a closed-form solution: triangle[x_] := x*(x+1)/2. Fibonacci numbers are the classic example of how adding memoization gives a drastic speedup:
fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
Although that too has a closed-form equivalent, albeit messier: http://mathworld.wolfram.com/FibonacciNumber.html
I disagree with the person who suggested this was inappropriate for memoization because you could "just use a loop". The point of memoization is that any repeat function calls are O(1) time. That's a lot better than O(n). In fact, you could even concoct a scenario where the memoized implementation has better performance than the closed-form implementation!
You're also asking the wrong question for your original problem ;)
This is a better way for that case:
triangle(n) = n * (n + 1) / 2
Furthermore, supposing the formula didn't have such a neat solution, memoisation would still be a poor approach here. You'd be better off just writing a simple loop in this case. See this answer for a fuller discussion.
I bet something like this should work with variable argument lists in Lua:
local function varg_tostring(...)
local s = select(1, ...)
for n = 2, select('#', ...) do
s = s..","..select(n,...)
end
return s
end
local function memoize(f)
local cache = {}
return function (...)
local al = varg_tostring(...)
if cache[al] then
return cache[al]
else
local y = f(...)
cache[al] = y
return y
end
end
end
You could probably also do something clever with metatables and __tostring so that the argument list could just be converted with a tostring(). Oh the possibilities.
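In the vein of other languages, here is a hedged Python sketch of the same variable-argument memoizer. Instead of joining the arguments into a string, it uses the argument tuple itself as the cache key, which avoids collisions such as (1, "2") versus ("1", 2). The names memoize, describe and calls are illustrative only, and the sketch assumes every argument is hashable.

```python
def memoize(f):
    """Cache results of f, keyed on the tuple of positional arguments."""
    cache = {}
    def wrapper(*args):
        # (1, "2") and ("1", 2) are distinct tuples, so they never collide
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return wrapper

calls = []  # records every real (non-cached) invocation

@memoize
def describe(a, b):
    calls.append((a, b))
    return f"{a!r} then {b!r}"

print(describe(1, "2"))  # computed
print(describe("1", 2))  # computed: different key
print(describe(1, "2"))  # served from the cache
print(len(calls))        # 2
```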
In C# 3.0 - for recursive functions, you can do something like:
public static class Helpers
{
public static Func<A, R> Memoize<A, R>(this Func<A, Func<A,R>, R> f)
{
var map = new Dictionary<A, R>();
Func<A, R> self = null;
self = (a) =>
{
R value;
if (map.TryGetValue(a, out value))
return value;
value = f(a, self);
map.Add(a, value);
return value;
};
return self;
}
}
Then you can create a memoized Fibonacci function like this:
var memoized_fib = Helpers.Memoize<int, int>((n,fib) => n > 1 ? fib(n - 1) + fib(n - 2) : n);
Console.WriteLine(memoized_fib(40));
In Scala (untested):
def memoize[A, B](f: (A)=>B) = {
var cache = Map[A, B]()
{ x: A =>
if (cache contains x) cache(x) else {
val back = f(x)
cache += (x -> back)
back
}
}
}
Note that this only works for functions of arity 1, but with currying you could make it work. The more subtle problem is that memoize(f) != memoize(f) for any function f. One very sneaky way to fix this would be something like the following:
val correctMem = memoize(memoize _)
I don't think that this will compile, but it does illustrate the idea.
Update: Commenters have pointed out that memoization is a good way to optimize recursion. Admittedly, I hadn't considered this before, since I generally work in a language (C#) where generalized memoization isn't so trivial to build. Take the post below with that grain of salt in mind.
I think Luke likely has the most appropriate solution to this problem, but memoization is not generally the solution to any issue of stack overflow.
Stack overflow usually is caused by recursion going deeper than the platform can handle. Languages sometimes support "tail recursion", which re-uses the context of the current call, rather than creating a new context for the recursive call. But a lot of mainstream languages/platforms don't support this. C# has no inherent support for tail-recursion, for example. The 64-bit version of the .NET JITter can apply it as an optimization at the IL level, which is all but useless if you need to support 32-bit platforms.
If your language doesn't support tail recursion, your best option for avoiding stack overflows is either to convert to an explicit loop (much less elegant, but sometimes necessary), or find a non-iterative algorithm such as Luke provided for this problem.
function memoize (f)
local cache = {}
return function (x)
if cache[x] then
return cache[x]
else
local y = f(x)
cache[x] = y
return y
end
end
end
triangle = memoize(triangle);
Note that to avoid a stack overflow, triangle would still need to be seeded.
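To make the seeding point concrete, here is a small Python sketch (Python rather than Lua, and the helper name seeded_triangle is made up for illustration). Calling the memoized function directly with a large argument would still recurse all the way down on the first call; filling the cache from the bottom up keeps every call shallow.

```python
def memoize(f):
    """Single-argument memoizer, same shape as the Lua version above."""
    cache = {}
    def wrapper(x):
        if x not in cache:
            cache[x] = f(x)
        return cache[x]
    return wrapper

@memoize
def triangle(x):
    if x == 0:
        return 0
    # the recursive call goes through the memoized wrapper
    return x + triangle(x - 1)

def seeded_triangle(n):
    """Warm the cache in increasing order so recursion depth stays constant."""
    for i in range(n + 1):
        triangle(i)  # triangle(i - 1) is already cached at this point
    return triangle(n)

print(seeded_triangle(100000))  # 5000050000
```

Without the warm-up loop, a first call such as triangle(100000) would exceed Python's default recursion limit, which is the same failure mode as the Lua stack overflow.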
Here's something that works without converting the arguments to strings.
The only caveat is that it can't handle a nil argument. But the accepted solution can't distinguish the value nil from the string "nil", so that's probably OK.
local function m(f)
local t = { }
local function mf(x, ...) -- memoized f
assert(x ~= nil, 'nil passed to memoized function')
if select('#', ...) > 0 then
t[x] = t[x] or m(function(...) return f(x, ...) end)
return t[x](...)
else
t[x] = t[x] or f(x)
assert(t[x] ~= nil, 'memoized function returns nil')
return t[x]
end
end
return mf
end
I've been inspired by this question to implement (yet another) flexible memoize function in Lua.
https://github.com/kikito/memoize.lua
Main advantages:
Accepts a variable number of arguments
Doesn't use tostring; instead, it organizes the cache in a tree structure, using the parameters to traverse it.
Works just fine with functions that return multiple values.
Pasting the code here as reference:
local globalCache = {}
local function getFromCache(cache, args)
local node = cache
for i=1, #args do
if not node.children then return {} end
node = node.children[args[i]]
if not node then return {} end
end
return node.results
end
local function insertInCache(cache, args, results)
local arg
local node = cache
for i=1, #args do
arg = args[i]
node.children = node.children or {}
node.children[arg] = node.children[arg] or {}
node = node.children[arg]
end
node.results = results
end
-- public function
local function memoize(f)
globalCache[f] = { results = {} }
return function (...)
local results = getFromCache( globalCache[f], {...} )
if #results == 0 then
results = { f(...) }
insertInCache(globalCache[f], {...}, results)
end
return unpack(results)
end
end
return memoize
Here is a generic C# 3.0 implementation, if it could help :
public static class Memoization
{
public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
{
var cache = new Dictionary<T, TResult>();
var nullCache = default(TResult);
var isNullCacheSet = false;
return parameter =>
{
TResult value;
if (parameter == null && isNullCacheSet)
{
return nullCache;
}
if (parameter == null)
{
nullCache = function(parameter);
isNullCacheSet = true;
return nullCache;
}
if (cache.TryGetValue(parameter, out value))
{
return value;
}
value = function(parameter);
cache.Add(parameter, value);
return value;
};
}
}
(Quoted from a French blog article)
In the vein of posting memoization in different languages, I'd like to respond to @onebyone.livejournal.com with a non-language-changing C++ example.
First, a memoizer for single arg functions:
template <class Result, class Arg, class ResultStore = std::map<Arg, Result> >
class memoizer1{
public:
template <class F>
const Result& operator()(F f, const Arg& a){
typename ResultStore::const_iterator it = memo_.find(a);
if(it == memo_.end()) {
it = memo_.insert(std::make_pair(a, f(a))).first;
}
return it->second;
}
private:
ResultStore memo_;
};
Just create an instance of the memoizer, feed it your function and argument. Just make sure not to share the same memo between two different functions (but you can share it between different implementations of the same function).
Next, a driver function and an implementation. Only the driver function need be public.
int fib(int); // driver
int fib_(int); // implementation
Implemented:
int fib_(int n){
++total_ops;
if(n == 0 || n == 1)
return 1;
else
return fib(n-1) + fib(n-2);
}
And the driver, to memoize
int fib(int n) {
static memoizer1<int,int> memo;
return memo(fib_, n);
}
Permalink showing output on codepad.org. Number of calls is measured to verify correctness. (insert unit test here...)
This only memoizes one-input functions. Generalizing for multiple or varying arguments is left as an exercise for the reader.
In Perl generic memoization is easy to get. The Memoize module is part of the perl core and is highly reliable, flexible, and easy-to-use.
The example from its manpage:
# This is the documentation for Memoize 1.01
use Memoize;
memoize('slow_function');
slow_function(arguments); # Is faster than it was before
You can add, remove, and customize memoization of functions at run time! You can provide callbacks for custom memento computation.
Memoize.pm even has facilities for making the memento cache persistent, so it does not need to be re-filled on each invocation of your program!
Here's the documentation: http://perldoc.perl.org/5.8.8/Memoize.html
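For comparison, Python's standard library offers a similar ready-made facility in functools.lru_cache. The sketch below is only an illustration in a different language, not part of the Perl answer; the function fib is a made-up example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache, i.e. plain memoization
def fib(n):
    """Naive recursive Fibonacci; the cache makes it linear in n."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(50))           # 12586269025
print(fib.cache_info())  # hit/miss counters kept by the decorator
```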
Extending the idea, it's also possible to memoize functions with two input parameters:
function memoize2 (f)
local cache = {}
return function (x, y)
if cache[x..','..y] then
return cache[x..','..y]
else
local z = f(x,y)
cache[x..','..y] = z
return z
end
end
end
Notice that parameter order matters in the caching algorithm, so if parameter order doesn't matter in the functions to be memoized the odds of getting a cache hit would be increased by sorting the parameters before checking the cache.
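The parameter-sorting idea can be sketched as follows in Python. This is an assumption-laden illustration: memoize_unordered, add and real_calls are hypothetical names, and the approach is only valid when the wrapped function really is symmetric in its two arguments.

```python
def memoize_unordered(f):
    """Memoize a two-argument function whose result ignores argument order."""
    cache = {}
    def wrapper(x, y):
        # normalise the key so (2, 5) and (5, 2) share one cache entry
        key = (x, y) if x <= y else (y, x)
        if key not in cache:
            cache[key] = f(x, y)
        return cache[key]
    return wrapper

real_calls = []  # records every non-cached invocation

@memoize_unordered
def add(x, y):
    real_calls.append((x, y))
    return x + y

print(add(2, 5))        # 7, computed
print(add(5, 2))        # 7, cache hit thanks to the normalised key
print(len(real_calls))  # 1
```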
But it's important to note that some functions can't be profitably memoized. I wrote memoize2 to see if the recursive Euclidean algorithm for finding the greatest common divisor could be sped up.
function gcd (a, b)
if b == 0 then return a end
return gcd(b, a%b)
end
As it turns out, gcd doesn't respond well to memoization. The calculation it does is far less expensive than the caching algorithm. Even for large numbers, it terminates fairly quickly. After a while, the cache grows very large. This algorithm is probably as fast as it can be.
Recursion isn't necessary. The nth triangle number is n(n+1)/2, so...
public int triangle(final int n){
return n * (n + 1) / 2;
}
}
Please don't recurse this. Either use the x*(x+1)/2 formula or simply iterate the values and memoize as you go.
int[] memo = new int[n+1];
int sum = 0;
for(int i = 0; i <= n; ++i)
{
sum+=i;
memo[i] = sum;
}
return memo[n];