Perl6: .sort() doesn't use overridden cmp - raku

According to the documentation, sort compares using infix:<cmp>.
But:
class Point
{
has Int $.x;
has Int $.y;
method Str { "($!x,$!y)" }
method gist { self.Str }
}
multi sub infix:<cmp>(Point $a, Point $b) { $a.y cmp $b.y || $a.x cmp $b.x }
my #p = Point.new(:3x, :2y), Point.new(:2x, :4y), Point.new(:1x, :1y);
say #p.sort;
gives output:
((1,1) (2,4) (3,2))
When I use:
say #p.sort(&infix:<cmp>);
it does give the proper output:
((1,1) (3,2) (2,4))
Is this a bug, a feature, or a flaw in the documentation?
And is there a way to make .sort() on a list of a Points use a custom sort order without specifying a routine?

I think that's a case of Broken By Design. Consider the following snippet:
my $a = Point.new(:3x, :2y);
my $b = Point.new(:2x, :4y);
say &infix:<cmp>.WHICH;
say $a cmp $b;
{
multi sub infix:<cmp>(Point $a, Point $b) { $a.y cmp $b.y || $a.x cmp $b.x }
say &infix:<cmp>.WHICH;
say $a cmp $b;
}
say &infix:<cmp>.WHICH;
say $a cmp $b;
The definition of the new multi candidate will generate a new proto sub that is only visible lexically. As the sort method is defined in the setting (conceptionally, an enclosing scope), it won't see your new multi candidate.
It might be possible to make sort look up &infix:<cmp> dynamically instead of lexically, though I suspect such a change would have to wait for 6.e even if we decided that's something we want to do, which isn't a given.
As a workaround, you could do something like
constant CMP = &infix:<cmp>;
multi sub infix:<cmp>(Point $a, Point $b) { ... }
BEGIN CMP.wrap(&infix:<cmp>);
for now, though I wouldn't necessarily recommend it (messing with global state considered harmful, and all that jazz)...

The cmp that is being used is the cmp that is in lexical scope within sort, not the one you have defined. If you change a few lines to:
multi sub infix:<cmp>(Point $a, Point $b) {
say "Hey $a $b";
$a.y cmp $b.y || $a.x cmp $b.x
}
my #p = Point.new(:3x, :2y), Point.new(:2x, :4y), Point.new(:1x, :1y);
say #p.sort( { $^b cmp $^a } );
Since the cmp that is being used is the one that is in actual lexical scope, it gets called correctly printing:
Hey (2,4) (3,2)
Hey (1,1) (2,4)
Hey (1,1) (3,2)
((2,4) (3,2) (1,1))
As was required.

Related

IIFE alternatives in Raku

In Ruby I can group together some lines of code like so with a begin block:
x = begin
puts "Hi!"
a = 2
b = 3
a + b
end
puts x # 5
it's immediately evaluated and its value is the last value of the block (a + b here) (Javascripters do a similar thing with IIFEs)
What are the ways to do this in Raku? Is there anything smoother than:
my $x = ({
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
})();
say $x; # 5
Insert a do in front of the block. This tells Raku to:
Immediately do whatever follows the do on its right hand side;
Return the value to the do's left hand side:
my $x = do {
put "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
That said, one rarely needs to use do.
Instead, there are many other IIFE forms in Raku that just work naturally without fuss. I'll mention just two because they're used extensively in Raku code:
with whatever { .foo } else { .bar }
You might think I'm being silly, but those are two IIFEs. They form lexical scopes, have parameter lists, bind from arguments, the works. Loads of Raku constructs work like that.
In the above case where I haven't written an explicit parameter list, this isn't obvious. The fact that .foo is called on whatever if whatever is defined, and .bar is called on it if it isn't, is both implicit and due to the particular IIFE calling behavior of with.
See also if, while, given, and many, many more.
What's going on becomes more obvious if you introduce an explicit parameter list with ->:
for whatever -> $a, $b { say $a + $b }
That iterates whatever, binding two consecutive elements from it to $a and $b, until whatever is empty. If it has an odd number of elements, one might write:
for whatever -> $a, $b? { say $a + $b }
And so on.
Bottom line: a huge number of occurrences of {...} in Raku are IIFEs, even if they don't look like it. But if they're immediately after an =, Raku defaults to assuming you want to assign the lambda rather than immediately executing it, so you need to insert a do in that particular case.
Welcome to Raku!
my $x = BEGIN {
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
I guess the common ancestry of Raku and Ruby shows :-)
Also note that to create a constant, you can also use constant:
my constant $x = do {
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
If you can have a single statement, you can leave off the braces:
my $x = BEGIN 2 + 3;
or:
my constant $x = 2 + 3;
Regarding blocks: if they are in sink context (similar to "void" context in some languages), then they will execute just like that:
{
say "Executing block";
}
No need to explicitely call it: it will be called for you :-)

Raku Ambiguous call to infix(Hyper: Dan::Series, Int)

I am writing a model Series class (kinda like the one in pandas) - and it should be both Positional and Associative.
class Series does Positional does Iterable does Associative {
has Array $.data is required;
has Array $.index;
### Construction ###
method TWEAK {
# sort out data-index dependencies
$!index = gather {
my $i = 0;
for |$!data -> $d {
take ( $!index[$i++] => $d )
}
}.Array
}
### Output ###
method Str {
$!index
}
### Role Support ###
# Positional role support
# viz. https://docs.raku.org/type/Positional
method of {
Mu
}
method elems {
$!data.elems
}
method AT-POS( $p ) {
$!data[$p]
}
method EXISTS-POS( $p ) {
0 <= $p < $!data.elems ?? True !! False
}
# Iterable role support
# viz. https://docs.raku.org/type/Iterable
method iterator {
$!data.iterator
}
method flat {
$!data.flat
}
method lazy {
$!data.lazy
}
method hyper {
$!data.hyper
}
# Associative role support
# viz. https://docs.raku.org/type/Associative
method keyof {
Str(Any)
}
method AT-KEY( $k ) {
for |$!index -> $p {
return $p.value if $p.key ~~ $k
}
}
method EXISTS-KEY( $k ) {
for |$!index -> $p {
return True if $p.key ~~ $k
}
}
#`[ solution attempt #1 does NOT get called
multi method infix(Hyper: Series, Int) is default {
die "I was called"
}
#]
}
my $s = Series.new(data => [rand xx 5], index => [<a b c d e>]);
say ~$s;
say $s[2];
say $s<b>;
So far pretty darn cool.
I can go dd $s.hyper and get this
HyperSeq.new(configuration => HyperConfiguration.new(batch => 64, degree => 1))
BUT (there had to be a but coming), I want to be able to do hyper math on my Series' elements, something like:
say $s >>+>> 2;
But that yields:
Ambiguous call to 'infix(Hyper: Dan::Series, Int)'; these signatures all match:
(Hyper: Associative:D \left, \right, *%_)
(Hyper: Positional:D \left, \right, *%_)
in block <unit> at ./synopsis-dan.raku line 63
How can I tell my class Series not to offer the Associative hyper candidate...?
Note: edited example to be a runnable MRE per #raiph's comment ... I have thus left in the minimum requirements for the 3 roles in play per docs.raku.org
After some experimentation (and new directions to consider from the very helpful comments to this SO along the way), I think I have found a solution:
drop the does Associative role from the class declaration like this:
class Series does Positional does Iterable {...}
BUT
leave the Associative role support methods in the body of the class:
# Associative role support
# viz. https://docs.raku.org/type/Associative
method keyof {
Str(Any)
}
method AT-KEY( $k ) {
for |$!index -> $p {
return $p.value if $p.key ~~ $k
}
}
method EXISTS-KEY( $k ) {
for |$!index -> $p {
return True if $p.key ~~ $k
}
}
This gives me the Positional and Associative accessors, and functional hyper math operators:
my $s = Series.new(data => [rand xx 5], index => [<a b c d e>]);
say ~$s; #([a => 0.6137271559776396 b => 0.7942959887386045 c => 0.5768018697817604 d => 0.8964323560788711 e => 0.025740150933493577] , dtype: Num)
say $s[2]; #0.7942959887386045
say $s<b>; #0.5768018697817604
say $s >>+>> 2; #(2.6137271559776396 2.7942959887386047 2.5768018697817605 2.896432356078871 2.0257401509334936)
While this feels a bit thin (and probably lacks the full set of Associative functions) I am fairly confident that the basic methods will give me slimmed down access like a hash from a key capability that I seek. And it no longer creates the ambiguous call.
This solution may be cheating a bit in that I know the level of compromise that I will accept ;-).
Take #1
First, an MRE with an emphasis on the M1:
class foo does Positional does Associative { method of {} }
sub infix:<baz> (\l,\r) { say 'baz' }
foo.new >>baz>> 42;
yields:
Ambiguous call to 'infix(Hyper: foo, Int)'; these signatures all match:
(Hyper: Associative:D \left, \right, *%_)
(Hyper: Positional:D \left, \right, *%_)
in block <unit> at ./synopsis-dan.raku line 63
The error message shows it's A) a call to a method named infix with an invocant matching Hyper, and B) there are two methods that potentially match that call.
Given that there's no class Hyper in your MRE, these methods and the Hyper class must be either built-ins or internal details that are leaking out.
A search of the doc finds no such class. So Hyper is undocumented Given that the doc has fairly broad coverage these days, this suggests Hyper is an internal detail. But regardless, it looks like you can't solve your problem using official/documented features.
Hopefully this bad news is still better than none.2
Take #2
Where's the fun in letting little details like "not an official feature" stop us doing what we want to do?
There's a core.c module named Hyper.pm6 in the Rakudo source repo.
A few seconds browsing that, and clicks on its History and Blame, and I can instantly see it really is time for me to conclude this SO answer, with a recommendation for your next move.
To wit, I suggest you start another SO, using this answer as its heart (but reversing my presentation order, ie starting by mentioning Hyper, and that it's not doc'd), and namechecking Liz (per Hyper's History/Blame), with a link back to your Q here as its background. I'm pretty sure that will get you a good answer, or at least an authoritative one.
Footnotes
1 I also tried this:
class foo does Positional does Associative { method of {} }
sub postfix:<bar>(\arg) { say 'bar' }
foo.new>>bar;
but that worked (displayed bar).
2 If you didn't get to my Take #1 conclusion yourself, perhaps that was was because your MRE wasn't very M? If you did arrive at the same point (cf "solution attempt #1 does NOT get called" in your MRE) then please read and, for future SOs, take to heart, the wisdom of "Explain ... any difficulties that have prevented you from solving it yourself".

(Identifier) terms vs. constants vs. null signature routines

Identifier terms are defined in the documentation alongside constants, with pretty much the same use case, although terms compute their value in run time while constants get it in compile time. Potentially, that could make terms use global variables, but that's action at a distance and ugly, so I guess that's not their use case.
OTOH, they could be simply routines with null signature:
sub term:<þor> { "Is mighty" }
sub Þor { "Is mighty" }
say þor, Þor;
But you can already define routines with null signature. You can sabe, however, the error when you write:
say Þor ~ Þor;
Which would produce a many positionals passed; expected 0 arguments but got 1, unlike the term. That seems however a bit farfetched and you can save the trouble by just adding () at the end.
Another possible use case is defying the rules of normal identifiers
sub term:<✔> { True }
say ✔; # True
Are there any other use cases I have missed?
Making zero-argument subs work as terms will break the possibility to post-declare subs, since finding a sub after having parsed usages of it would require re-parsing of earlier code (which the perl 6 language refuses to do, "one-pass parsing" and all that) if the sub takes no arguments.
Terms are useful in combination with the ternary operator:
$ perl6 -e 'sub a() { "foo" }; say 42 ?? a !! 666'
===SORRY!=== Error while compiling -e
Your !! was gobbled by the expression in the middle; please parenthesize
$ perl6 -e 'sub term:<a> { "foo" }; say 42 ?? a !! 666'
foo
Constants are basically terms. So of course they are grouped together.
constant foo = 12;
say foo;
constant term:<bar> = 36;
say bar;
There is a slight difference because term:<…> works by modifying the parser. So it takes precedence.
constant fubar = 38;
constant term:<fubar> = 45;
say fubar; # 45
The above will print 45 regardless of which constant definition comes first.
Since term:<…> takes precedence the only way to get at the other value is to use ::<fubar> to directly access the symbol table.
say ::<fubar>; # 38
say ::<term:<fubar>>; # 45
There are two main use-cases for term:<…>.
One is to get a subroutine to be parsed similarly to a constant or sigilless variable.
sub fubar () { 'fubar'.comb.roll }
# say( fubar( prefix:<~>( 4 ) ) );
say fubar ~ 4; # ERROR
sub term:<fubar> () { 'fubar'.comb.roll }
# say( infix:<~>( fubar, 4 ) );
say fubar ~ 4;
The other is to have a constant or sigiless variable be something other than an a normal identifier.
my \✔ = True; # ERROR: Malformed my
my \term:<✔> = True;
say ✔;
Of course both use-cases can be combined.
sub term:<✔> () { True }
Perl 5 allows subroutines to have an empty prototype (different than a signature) which will alter how it gets parsed. The main purpose of prototypes in Perl 5 is to alter how the code gets parsed.
use v5;
sub fubar () { ord [split('','fubar')]->[rand 5] }
# say( fubar() + 4 );
say fubar + 4; # infix +
use v5;
sub fubar { ord [split('','fubar')]->[rand 5] }
# say( fubar( +4 ) );
say fubar + 4; # prefix +
Perl 6 doesn't use signatures the way Perl 5 uses prototypes. The main way to alter how Perl 6 parses code is by using the namespace.
use v6;
sub fubar ( $_ ) { .comb.roll }
sub term:<fubar> () { 'fubar'.comb.roll }
say fubar( 'zoo' ); # `z` or `o` (`o` is twice as likely)
say fubar; # `f` or `u` or `b` or `a` or `r`
sub prefix:<✔> ( $_ ) { "checked $_" }
say ✔ 'under the bed'; # checked under the bed
Note that Perl 5 doesn't really have constants, they are just subroutines with an empty prototype.
use v5;
use constant foo => 12;
use v5;
sub foo () { 12 } # ditto
(This became less true after 5.16)
As far as I know all of the other uses of prototypes have been superseded by design decisions in Perl 6.
use v5;
sub foo (&$) { $_[0]->($_[1]) }
say foo { 100 + $_[0] } 5; # 105;
That block is seen as a sub lambda because of the prototype of the foo subroutine.
use v6;
# sub foo ( &f, $v ) { f $v }
sub foo { #_[0].( #_[1] ) }
say foo { 100 + #_[0] }, 5; # 105
In Perl 6 a block is seen as a lambda if a term is expected. So there is no need to alter the parser with a feature like a prototype.
You are asking for exactly one use of prototypes to be brought back even though there is already a feature that covers that use-case.
Doing so would be a special-case. Part of the design ethos of Perl 6 is to limit the number of special-cases.
Other versions of Perl had a wide variety of special-cases, and it isn't always easy to remember them all.
Don't get me wrong; the special-cases in Perl 5 are useful, but Perl 6 has for the most part made them general-cases.

Why does a Perl 6 Str do the Positional role, and how can I change []?

I'm playing around with a positional interface for strings. I'm aware of How can I slice a string like Python does in Perl 6?, but I was curious if I could make this thing work just for giggles.
I came up with this example. Reading positions is fine, but I don't know how to set up the multi to handle an assignment:
multi postcircumfix:<[ ]> ( Str:D $s, Int:D $n --> Str ) {
$s.substr: $n, 1
}
multi postcircumfix:<[ ]> ( Str:D $s, Range:D $r --> Str ) {
$s.substr: $r.min, $r.max - $r.min + 1
}
multi postcircumfix:<[ ]> ( Str:D $s, List:D $i --> List ) {
map( { $s.substr: $_, 1 }, #$i ).list
}
multi postcircumfix:<[ ]> ( Str:D $s, Int:D $n, *#a --> Str ) is rw {
put "Calling rw version";
}
my $string = 'The quick, purple butterfly';
{ # Works
my $single = $string[0];
say $single;
}
{ # Works
my $substring = $string[5..9];
say $substring;
}
{ # Works
my $substring = $string[1,3,5,7];
say $substring;
}
{ # NOPE!
$string[2] = 'Perl';
say $string;
}
The last one doesn't work:
T
uick,
(h u c)
Index out of range. Is: 2, should be in 0..0
in block <unit> at substring.p6 line 36
Actually thrown at:
in block <unit> at substring.p6 line 36
I didn't think it would work, though. I don't know what signature or traits it should have to do what I want to do.
Why does the [] operator work on a Str?
$ perl6
> "some string"[0]
some string
The docs mostly imply that the [] works on things that do the Positional roles and that those things are in list like things. From the [] docs in operators:
Universal interface for positional access to zero or more elements of a #container, a.k.a. "array indexing operator".
But a Str surprisingly does the necessary role even though it's not an #container (as far as I know):
> "some string".does( 'Positional' )
True
Is there a way to test that something is an #container?
Is there a way to get something to list all of its roles?
Now, knowing that a string can respond to the [], how can I figure out what signature will match that? I want to know the right signature to use to define my own version to write to this string through [].
One way to achieve this, is by augmenting the Str class, since you really only need to override the AT-POS method (which Str normally inherits from Any):
use MONKEY;
augment class Str {
method AT-POS($a) {
self.substr($a,1);
}
}
say "abcde"[3]; # d
say "abcde"[^3]; # (a b c)
More information can be found here: https://docs.raku.org/language/subscripts#Methods_to_implement_for_positional_subscripting
To make your rw version work correctly, you first need to make the Str which might get mutated also rw, and it needs to return something which in turn is also rw. For the specific case of strings, you could simply do:
multi postcircumfix:<[ ]> ( Str:D $s is rw, Int:D $i --> Str ) is rw {
return $s.substr-rw: $i, 1;
}
Quite often, you'll want an rw subroutine to return an instance of Proxy:
multi postcircumfix:<[ ]> ( Str:D $s is rw, Int:D $i --> Str ) is rw {
Proxy.new: FETCH => sub { $s.substr: $i },
STORE => sub -> $newval { $s.substr-rw( $i, 1 ) = $newval }
}
Although I haven't yet seen production code which uses it, there is also a return-rw operator, which you'll occasionally need instead of return.
sub identity( $x is rw ) is rw { return-rw $x }
identity( my $y ) = 42; # Works, $y is 42.
sub identity-fail( $x is rw ) is rw { return $x }
identity-fail( my $z ) = 42; # Fails: "Cannot assign to a readonly variable or a value"
If a function reaches the end without executing a return, return-rw or throwing an exception, the value of the last statement is returned, and (at present), this behaves as if it were preceded return-rw.
sub identity2( $x is rw ) is rw { $x }
identity2( my $w ) = 42; # Works, $w is 42.
There's a module that aims to let you do this:
https://github.com/zoffixznet/perl6-Pythonic-Str
However:
This module does not provide Str.AT-POS or make Str type do Positional or Iterable roles. The latter causes all sorts of fallout with core and non-core code due to inherent assumptions that Str type does not do those roles. What this means in plain English is you can only index your strings with [...] postcircumfix operator and can't willy-nilly treat them as lists of characters—simply call .comb if you need that.`

How can I use a non-caching infinite lazy list in Perl 6

Infinite lazy lists are awesome!
> my #fibo = 0, 1, *+* ... *;
> say #fibo[1000];
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875
They automatically cache their values, which is handy ... most of the time.
But when working with huge Fibonacci numbers (example), this can cause memory issues.
Unfortunately, I can't figure out how to create a non-caching Fibonacci sequence. Anyone?
One major problem is you are storing it in an array, which of course keeps all of its values.
The next problem is a little more subtle, the dotty sequence generator syntax LIST, CODE ... END doesn't know how many of the previous values the CODE part is going to ask for, so it keeps all of them.
( It could look at the arity/count of the CODE, but it doesn't currently seem to from experiments at the REPL )
Then there is the problem that using &postcircumfix:<[ ]> on a Seq calls .cache on the assumption that you are going to ask for another value at some point.
( From looking at the source for Seq.AT-POS )
It's possible that a future implementation could be better at each of these drawbacks.
You could create the sequence using a different feature to get around the current limitations of the dotty sequence generator syntax.
sub fibonacci-seq (){
gather {
take my $a = 0;
take my $b = 1;
loop {
take my $c = $a + $b;
$a = $b;
$b = $c;
}
}.lazy
}
If you are just iterating through the values you can just use it as is.
my $v;
for fibonacci-seq() {
if $_ > 1000 {
$v = $_;
last;
}
}
say $v;
my $count = 100000;
for fibonacci-seq() {
if $count-- <= 0 {
$v = $_;
last;
}
}
say chars $v; # 20899
You could also use the Iterator directly. Though this isn't necessary in most circumstances.
sub fibonacci ( UInt $n ) {
# have to get a new iterator each time this is called
my \iterator = fibonacci-seq().iterator;
for ^$n {
return Nil if iterator.pull-one =:= IterationEnd;
}
my \result = iterator.pull-one;
result =:= IterationEnd ?? Nil !! result
}
If you have a recent enough version of Rakudo you can use skip-at-least-pull-one.
sub fibonacci ( UInt $n ) {
# have to get a new iterator each time this is called
my \result = fibonacci-seq().iterator.skip-at-least-pull-one($n);
result =:= IterationEnd ?? Nil !! result
}
You can also implement the Iterator class directly, wrapping it in a Seq.
( this is largely how methods that return sequences are written in the Rakudo core )
sub fibonacci-seq2 () {
Seq.new:
class :: does Iterator {
has Int $!a = 0;
has Int $!b = 1;
method pull-one {
my $current = $!a;
my $c = $!a + $!b;
$!a = $!b;
$!b = $c;
$current;
}
# indicate that this should never be eagerly iterated
# which is recommended on infinite generators
method is-lazy ( --> True ) {}
}.new
}
Apparently, a noob cannot comment.
When defining a lazy iterator such as sub fibonacci-seq2, one should mark the iterator as lazy by adding a "is-lazy" method that returns True, e.g.:
method is-lazy(--> True) { }
This will allow the system to detect possible infiniloops better.