Combining regexes using a loop in Perl 6 - raku

Here I make a regex manually from Regex elements of an array.
my Regex @reg =
/ foo /,
/ bar /,
/ baz /,
/ pun /
;
my $r0 = @reg[0];
my $r1 = @reg[1];
my Regex $r = / 0 $r0 | 1 $r1 /;
"0foo_1barz" ~~ m:g/<$r>/;
say $/; # (「0foo」 「1bar」)
How can I do this with for @reg {...}?

If a variable contains a regex, you can use it without further ado inside another regex.
The second trick is to use an array variable inside a regex, which is equivalent to the disjunction of the array elements:
my @reg =
/foo/,
/bar/,
/baz/,
/pun/
;
my @transformed = @reg.kv.map(-> $i, $rx { rx/ $i $rx /});
my @match = "0foo_1barz" ~~ m:g/ @transformed /;
.say for @match;
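Since the question asks specifically about for, the same list can also be built with an explicit loop. This is only a minimal sketch: it assumes the same array and input string as above, and it relies on the same array interpolation as the map version.

```raku
my @reg = /foo/, /bar/, /baz/, /pun/;

# Build one regex per element, prefixing each with its index.
my @transformed;
for @reg.kv -> $i, $rx {
    # $i is interpolated as a literal, $rx as a regex.
    @transformed.push: rx/ $i $rx /;
}

# An array inside a regex acts as an alternation of its elements.
my @match = "0foo_1barz" ~~ m:g/ @transformed /;
.say for @match;
```

The loop body is the same block the map call uses, so the choice between the two is mostly a matter of taste.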

my @reg =
/foo/,
/bar/,
/baz/,
/pun/
;
my $i = 0;
my $reg = @reg
.map({ $_ = .perl; $_.substr(1, $_.chars - 2); })
.map({ "{$i++}{$_}" })
.join('|');
my @match = "foo", "0foo_1barz" ~~ m:g/(<{$reg}>) /;
say @match[1][0].Str;
say @match[1][1].Str;
# 0foo
# 1bar
See the docs
Edit: Actually read the docs myself. Changed implicit eval to $() construct.
Edit: Rewrote answer to something that actually works
Edit: Changed answer to a terrible, terrible hack

Related

How to match the same number of different atoms in Perl 6 regex?

Should be very simple, but I can't cope with it.
I want to match exactly the same number of a's as b's. So, the following
my $input = 'aaabbbb';
$input ~~ m:ex/ ... /;
should produce:
aaabbb
aabb
ab
UPD: The following variants don't work, perhaps because of the :ex bug mentioned in @smls's answer (but more likely because I made some mistakes?):
> my $input = "aaabbbb";
> .put for $input ~~ m:ex/ (a) * (b) * <?{ +$0 == +$1 }> /;
Nil
> .put for $input ~~ m:ex/ (a) + (b) + <?{+$0 == +$1}> /;
Nil
This one, with :ov and ?, works:
> my $input = "aaabbbb";
> .put for $input ~~ m:ov/ (a)+ (b)+? <?{+$0 == +$1}> /;
aaabbb
aabb
ab
UPD2: The following solution works with :ex as well, but I had to do it without <?...> assertion.
> $input = 'aaabbbb'
> $input ~~ m:ex/ (a) + (b) + { put $/ if +$0 == +$1 } /;
aaabbb
aabb
ab
my $input = "aaabbbb";
say .Str for $input ~~ m:ov/ (a)+ b ** {+$0} /;
Output:
aaabbb
aabb
ab
It's supposed to work with :ex instead of :ov, too - but Rakudo bug #130711 currently prevents that.
my $input = "aaabbbb";
say .Str for $input ~~ m:ov/ a <~~>? b /;
It works with :ex too:
my $input = "aaabbbb";
say .Str for $input ~~ m:ex/ a <~~>? b /;
Upd: explanation
<~~> means "call myself recursively"; see Extensible metasyntax. (It is not yet fully implemented.)
Following (longer, but maybe clearer) example works too:
my $input = "aaabbbb";
my token anbn { a <&anbn>? b}
say .Str for $input ~~ m:ex/ <&anbn> /;

How to interpolate variables into Perl 6 regex character class?

I want to make all the consonants in a word uppercase:
> my $word = 'camelia'
camelia
> $word ~~ s:g/<-[aeiou]>/{$/.uc}/
(「c」 「m」 「l」)
> $word
CaMeLia
To make the code more general, I store the list of all the vowels in a string variable
my $vowels = 'aeiou';
or in an array
my @vowels = $vowels.comb;
How can I solve the original problem with the $vowels or @vowels variables?
Maybe the trans method would be more appropriate than the subst sub or operator.
Try this:
my $word = "camelia";
my @consonants = keys ("a".."z") (-) <a e i o u>;
say $word.trans(@consonants => @consonants>>.uc);
# => CaMeLia
With the help of moritz's explanation, here is the solution:
my constant $vowels = 'aeiou';
my regex consonants {
<{
"<-[$vowels]>"
}>
}
my $word = 'camelia';
$word ~~ s:g/<consonants>/{$/.uc}/;
say $word; # CaMeLia
You can use <!before …> along with <{…}>, and . to actually capture the character.
my $vowels = 'aeiou'; # the vowel string from the question
my $word = 'camelia';
$word ~~ s:g/
  <!before         # negated lookahead
    <{             # use result as Regex code
      $vowels.comb # the vowels as individual characters
    }>
  >
  .                # any character (that doesn't match the lookahead)
/{$/.uc}/;
say $word; # CaMeLia
You can do away with the <{…}> with @vowels
I think it is also important to realize you can use .subst
my $word = 'camelia';
say $word.subst( :g, /<!before @vowels>./, *.uc ); # CaMeLia
say $word; # camelia
I would recommend storing the regex in a variable instead.
my $word = 'camelia';
my $vowel-regex = /<-[aeiou]>/;
say $word.subst( :g, $vowel-regex, *.uc ); # CaMeLia
$word ~~ s:g/<$vowel-regex>/{$/.uc}/;
say $word # CaMeLia

Scope of nested regexes in Perl 6

Is it possible to define nested regexes in arbitrary sequence?
The following program works as expected:
my regex letter { <[a b]> }
my regex word { <letter> + }
my $string = 'abab';
$string ~~ &word;
put $/; # abab
If I swap the first two lines, the compiler produces an error.
Is there a way to override this restriction (without using grammars)?
You can put the regex in a variable you declare up front but later set:
my $letter;
my regex word { <$letter> + }
$letter = regex { <[a b]> }
my $string = 'abab';
$string ~~ &word;
put $/; # abab

Perl 6 regex variable and capturing groups

When I make a regex variable with capturing groups, the whole match is OK, but capturing groups are Nil.
my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;
$str ~~ / $rgx / ;
say ~$/; # 12abc34
say $0; # Nil
say $1; # Nil
If I modify the program to avoid $rgx, everything works as expected:
my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;
$str ~~ / ($atom) \w+ ($atom) /;
say ~$/; # 12abc34
say $0; # 「12」
say $1; # 「34」
With your code, the compiler gives the following warning:
Regex object coerced to string (please use .gist or .perl to do that)
That tells us something is wrong: regexes shouldn't be treated as strings. There are two more proper ways to nest regexes. First, you can include sub-regexes within assertions (<>):
my $str = 'nn12abc34efg';
my Regex $atom = / \d ** 2 /;
my Regex $rgx = / (<$atom>) \w+ (<$atom>) /;
$str ~~ $rgx;
Note that I'm not matching / $rgx /. That is putting one regex inside another. Just match $rgx.
The nicer way is to use named regexes. Defining atom and the regex as follows will let you access the match groups as $<atom>[0] and $<atom>[1]:
my regex atom { \d ** 2 };
my $rgx = / <atom> \w+ <atom> /;
$str ~~ $rgx;
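A minimal sketch of reading those named captures back after the match, assuming the same $str as in the question:

```raku
my $str = 'nn12abc34efg';
my regex atom { \d ** 2 }
my $rgx = / <atom> \w+ <atom> /;

if $str ~~ $rgx {
    # <atom> appears twice, so $<atom> holds a list of two Match objects.
    say ~$<atom>[0];   # 12
    say ~$<atom>[1];   # 34
}
```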
The key observation is that $str ~~ / $rgx /; is a "regex inside of a regex". $rgx matched as it should and set $0 and $1 within its own Match object, but there was nowhere within the surrounding Match object to store that information, so you couldn't see it. Maybe an example makes it clearer. Try this:
my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;
$str ~~ / $0=$rgx /;
say $/;
Note the contents of $0. Or as another example, let's give it a proper name:
my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;
$str ~~ / $<bits-n-pieces>=$rgx /;
say $/;

Perl replace with variable

I'm trying to replace a word in a string. The word is stored in a variable so naturally I do this:
$sentence = "hi this is me";
$foo=~ m/is (.*)/;
$foo = $1;
$sentence =~ s/$foo/you/;
print $newsentence;
But this doesn't work.
Any idea on how to solve this? Why this happens?
Perl lets you interpolate a string into a regular expression, as many of the answers have already shown. After that string interpolation, the result has to be a valid regex.
In your original try, you used the match operator, m//, which immediately tries to perform a match. You could have used the regular expression quoting operator in its place:
$foo = qr/me/;
You can either bind to it directly or interpolate it:
$string =~ $foo;
$string =~ s/$foo/replacement/;
You can read more about qr// in Regexp Quote-Like Operators in perlop.
You have to replace the same variable, otherwise $newsentence is not set and Perl doesn't know what to replace:
$sentence = "hi this is me";
$foo = "me";
$sentence =~ s/$foo/you/;
print $sentence;
If you want to keep $sentence with its previous value, you can copy $sentence into $newsentence and perform the substitution on the copy, so that the result is saved in $newsentence:
$sentence = "hi this is me";
$foo = "me";
$newsentence = $sentence;
$newsentence =~ s/$foo/you/;
print $newsentence;
You first need to copy $sentence to $newsentence.
$sentence = "hi this is me";
$foo = "me";
$newsentence = $sentence;
$newsentence =~ s/$foo/you/;
print $newsentence;
Even for small scripts, please 'use strict' and 'use warnings'. Your code snippet uses $foo and $newsentence without initialising them, and 'strict' would have caught this. Remember that '=~' is for matching and substitution, not assignment. Also be aware that regexes in Perl aren't word-bounded by default, so the example expression you've got will set $1 to 'is me', the 'is' having matched the tail of 'this'.
Assuming you're trying to turn the string from 'hi this is me' to 'hi this is you', you'll need something like this:
my $sentence = "hi this is me";
$sentence =~ s/\bme$/you/;
print $sentence, "\n";
In the regex, '\b' is a word boundary, and '$' is end-of-line. Just doing 's/me/you/' will also work in your example, but would probably have unintended effects if you had a string like 'this is merry old me', which would become 'this is yourry old me'.