I can easily use token signatures by using token name directly:
my token t ( $x ) { $x };
'axb' ~~ / 'a' <t: 'x'> 'b' /; # match
'axb' ~~ / 'a' <t( 'x' )> 'b' /; # match
However, I haven't found a way to do this when the token is stored in a variable:
my $t = token ( $x ) { $x };
'axb' ~~ / 'a' <$t: 'x'> 'b' /;
'axb' ~~ / 'a' <$t( 'x' )> 'b' /;
Both give:
===SORRY!=== Error while compiling ...
Unable to parse expression in metachar:sym<assert>; couldn't find final '>'
What is the magic syntax to do that?
BTW: I've even browsed the Raku test suite, and it does not include such a case in roast/S05-grammar/signatures.t.
Place an & before the variable:
my $t = token ( $x ) { $x };
say 'axb' ~~ / 'a' <&$t: 'x'> 'b' /;
say 'axb' ~~ / 'a' <&$t( 'x' )> 'b' /;
The parser looks for the &, and then delegates to the Raku variable parse rule, which will happily parse a contextualizer like this.
Either use the solution in jnthn's answer, which lets Raku know explicitly that you wish to use your $-sigiled token variable as a Callable, or declare the variable as Callable in the first place (with the & sigil) and make the corresponding change in the call:
my &t = token ( $x ) { $x };
say 'axb' ~~ / 'a' <&t: 'x'> 'b' /; # 「axb」
say 'axb' ~~ / 'a' <&t( 'x' )> 'b' /; # 「axb」
How do we flatten or stringify a Match object (or similar) into a string data type, especially when there are many of them, i.e. as array elements? For example:
'foobar' ~~ m{ (foo) };
say $0.WHAT;
my $foo = $0;
say $foo.WHAT
(Match)
(Match)
How to end up with (Str)?
~ is the Str contextualizer:
'foobar' ~~ m{ (foo) };
say ~$0
will directly coerce it to a Str. The same works when you have many matches, for example:
'foobar' ~~ m{ (f)(o)(o) };
say $/.map: ~*; # (f o o)
Just treat the objects as if they were strings.
If you apply a string operation to a value/object Raku will almost always just automatically coerce it to a string.
String operations include functions such as print and put, operators such as infix eq and ~ concatenation, methods such as .starts-with or .chop, interpolation such as "A string containing a $variable", and dedicated coercers such as .Str and Str(...).
A Match object contains an overall match. Any "children" (sub-matches) are just captures of substrings of that overall match. So there's no need to flatten anything because you can just deal with the single overall match.
A list of Match objects is a list. And a list is itself an object. If you apply a string operation to a list, you get the elements of the list stringified with a space between each element.
So:
'foobar' ~~ m{ (f) (o) (o) };
put $/; # foo
put $/ eq 'foo'; # True
put $/ ~ 'bar'; # foobar
put $/ .chop; # fo
put "[$/]"; # [foo]
put $/ .Str; # foo
my Str() $foo = $/;
say $foo.WHAT; # (Str)
put 'foofoo' ~~ m:g{ (f) (o) (o) }; # foo foo
The .Str coercion method is available on any Cool value, including a regex Match object.
'foobar' ~~ m{ (foo) };
say $0.WHAT; # (Match)
say $0.Str.WHAT; # (Str)
This should be very simple, but I can't get it to work.
I want to match exactly the same number of a's as b's. So, the following
my $input = 'aaabbbb';
$input ~~ m:ex/ ... /;
should produce:
aaabbb
aabb
ab
UPD: The following variants don't work, perhaps because of the :ex bug mentioned in @smls's answer (but more likely because I made some mistakes?):
> my $input = "aaabbbb";
> .put for $input ~~ m:ex/ (a) * (b) * <?{ +$0 == +$1 }> /;
Nil
> .put for $input ~~ m:ex/ (a) + (b) + <?{+$0 == +$1}> /;
Nil
This one, with :ov and ?, works:
> my $input = "aaabbbb";
> .put for $input ~~ m:ov/ (a)+ (b)+? <?{+$0 == +$1}> /;
aaabbb
aabb
ab
UPD2: The following solution works with :ex as well, but I had to do it without a <?{...}> assertion.
> $input = 'aaabbbb'
> $input ~~ m:ex/ (a) + (b) + { put $/ if +$0 == +$1 } /;
aaabbb
aabb
ab
my $input = "aaabbbb";
say .Str for $input ~~ m:ov/ (a)+ b ** {+$0} /;
Output:
aaabbb
aabb
ab
It's supposed to work with :ex instead of :ov, too - but Rakudo bug #130711 currently prevents that.
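To make the counting step explicit, here is a minimal sketch of the same pattern without any adverb, so only the leftmost match is returned. The variable name $first is made up for illustration. The idea is that (a)+ captures each a separately, so $0 is a list and +$0 is its length, which b ** {+$0} then uses as the repeat count.

```raku
my $input = 'aaabbbb';

# (a)+ captures every 'a' on its own, so $0 holds a list of Match objects.
# +$0 is the number of captured 'a' characters, and
# b ** {+$0} requires exactly that many 'b' characters to follow.
my $first = $input ~~ / (a)+ b ** {+$0} /;

say ~$first;          # the leftmost match as a string
say $first[0].elems;  # how many 'a' characters were captured
```

With the input above this should give the first of the three results listed for the :ov version.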
my $input = "aaabbbb";
say .Str for $input ~~ m:ov/ a <~~>? b /;
It works with :ex too:
my $input = "aaabbbb";
say .Str for $input ~~ m:ex/ a <~~>? b /;
Upd: explanation
<~~> means "call myself recursively"; see Extensible metasyntax. (It is not yet fully implemented.)
The following (longer, but maybe clearer) example works too:
my $input = "aaabbbb";
my token anbn { a <&anbn>? b}
say .Str for $input ~~ m:ex/ <&anbn> /;
Here I make a regex manually from Regex elements of an array.
my Regex @reg =
/ foo /,
/ bar /,
/ baz /,
/ pun /
;
my $r0 = @reg[0];
my $r1 = @reg[1];
my Regex $r = / 0 $r0 | 1 $r1 /;
"0foo_1barz" ~~ m:g/<$r>/;
say $/; # (「0foo」 「1bar」)
How can I do it with for @reg {...}?
If a variable contains a regex, you can use it without further ado inside another regex.
The second trick is to use an array variable inside a regex, which is equivalent to the disjunction of the array elements:
my @reg =
/foo/,
/bar/,
/baz/,
/pun/
;
my @transformed = @reg.kv.map(-> $i, $rx { rx/ $i $rx /});
my @match = "0foo_1barz" ~~ m:g/ @transformed /;
.say for @match;
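The array interpolation described above can also be seen in isolation. This is a minimal sketch using plain strings instead of regexes; the names @words and $m are made up for illustration.

```raku
# An array interpolated into a regex behaves like an alternation
# of its elements, so this matches any one of the three words.
my @words = <foo bar baz>;

my $m = 'xbarx' ~~ / @words /;

say ~$m;   # bar
```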
my @reg =
/foo/,
/bar/,
/baz/,
/pun/
;
my $i = 0;
my $reg = @reg
.map({ $_ = .perl; $_.substr(1, $_.chars - 2); })
.map({ "{$i++}{$_}" })
.join('|');
my @match = "foo", "0foo_1barz" ~~ m:g/(<{$reg}>) /;
say @match[1][0].Str;
say @match[1][1].Str;
# 0foo
# 2baz
See the docs
Edit: Actually read the docs myself. Changed implicit eval to $() construct.
Edit: Rewrote answer to something that actually works
Edit: Changed answer to a terrible, terrible hack
When I make a regex variable with capturing groups, the whole match is OK, but capturing groups are Nil.
my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;
$str ~~ / $rgx / ;
say ~$/; # 12abc34
say $0; # Nil
say $1; # Nil
If I modify the program to avoid $rgx, everything works as expected:
my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;
$str ~~ / ($atom) \w+ ($atom) /;
say ~$/; # 12abc34
say $0; # 「12」
say $1; # 「34」
With your code, the compiler gives the following warning:
Regex object coerced to string (please use .gist or .perl to do that)
That tells us something is wrong: regexes shouldn't be treated as strings. There are two more proper ways to nest regexes. First, you can include sub-regexes within assertions (<>):
my $str = 'nn12abc34efg';
my Regex $atom = / \d ** 2 /;
my Regex $rgx = / (<$atom>) \w+ (<$atom>) /;
$str ~~ $rgx;
Note that I'm not matching / $rgx /. That is putting one regex inside another. Just match $rgx.
The nicer way is to use named regexes. Defining atom and the regex as follows will let you access the match groups as $<atom>[0] and $<atom>[1]:
my regex atom { \d ** 2 };
my $rgx = / <atom> \w+ <atom> /;
$str ~~ $rgx;
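Put together as one self-contained script, the named-regex approach might look like the sketch below. It reuses the $str value from the question and only adds the say lines that read the captures back out.

```raku
my $str = 'nn12abc34efg';

# A lexically scoped named regex; <atom> inside another regex calls it
# and stores each result under the name 'atom'.
my regex atom { \d ** 2 }
my $rgx = / <atom> \w+ <atom> /;

if $str ~~ $rgx {
    say ~$/;           # 12abc34
    say ~$<atom>[0];   # 12
    say ~$<atom>[1];   # 34
}
```

Because <atom> appears twice, $<atom> is a list, which is why the two captures are indexed with [0] and [1].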
The key observation is that $str ~~ / $rgx /; is a "regex inside of a regex". $rgx matched as it should and set $0 and $1 within its own Match object, but there was nowhere within the surrounding Match object to store that information, so you couldn't see it. It may be clearer with an example. Try this:
my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;
$str ~~ / $0=$rgx /;
say $/;
Note the contents of $0. Or as another example, let's give it a proper name:
my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;
$str ~~ / $<bits-n-pieces>=$rgx /;
say $/;
The language I want to parse contains statements like
public var a, b = 42, c;
I.e. the .g file looks something like:
statements
: (introduction | expression ';'! | ... )+
;
introduction
: head single+ -> ^(head single)+
;
single
: Name ('='^ expression)?
;
head
: modifiers* v='var' -> ^(VARIABLE[$v] modifiers*)
;
Generating a tree like that would be easy, but mostly useless (for me):
         ----------statements----------
        /              |               \
   variable         variable         variable
   /     \          /     \          /     \
'public' 'a'   'public'   '='   'public'  'c'
                         /   \
                       'b'   expr
I would like to have the '=' on top of the middle node:
         ----------statements----------
        /              |               \
   variable           '='            variable
   /     \           /   \           /     \
'public' 'a'    variable  expr  'public'  'c'
                /     \
          'public'    'b'
but I can't find the rewrite rule to do that.
That is not easily done with the way you've set up your rules.
Here's one way to do it:
grammar T;
options {
output=AST;
ASTLabelType=CommonTree;
}
tokens {
STATEMENTS;
VARIABLE;
DEFAULT_MODIFIER;
}
declaration
: modifier 'var' name[$modifier.tree] (',' name[$modifier.tree])* ';' -> ^(STATEMENTS name+)
;
modifier
: 'public'
| 'private'
| /* nothing */ -> DEFAULT_MODIFIER
;
name [CommonTree mod]
: ID '=' expression -> ^('=' ^(VARIABLE {new CommonTree(mod)} ID) expression)
| ID -> ^(VARIABLE {new CommonTree(mod)} ID)
;
// other parser & lexer rules
which produces an AST (image not preserved) for the input:
public var a, b = 42, c;
and another AST (image not preserved) for the input:
var a, b = 42, c;