Is Perl 6's uncuddled else a special case for statement separation? - raku

From the syntax doc:
A closing curly brace followed by a newline character implies a statement separator, which is why you don't need to write a semicolon after an if statement block.
if True {
say "Hello";
}
say "world";
That's fine, and it explains what was going on in the question "Why is this Perl 6 feed operator a 'bogus statement'?".
However, how does this rule work for an uncuddled else? Is this a special case?
if True {
say "Hello";
}
else {
say "Something else";
}
say "world";
Or, how about the with-orwith example:
my $s = "abc";
with $s.index("a") { say "Found a at $_" }
orwith $s.index("b") { say "Found b at $_" }
orwith $s.index("c") { say "Found c at $_" }
else { say "Didn't find a, b or c" }

The documentation you found was not completely correct. The documentation has been updated and is now correct. It now reads:
Complete statements ending in bare blocks can omit the trailing semicolon, if no additional statements on the same line follow the block's closing curly brace }.
...
For a series of blocks that are part of the same if/elsif/else (or similar) construct, the implied separator rule only applies at the end of the last block of that series.
Original answer:
Looking at the grammar for if in nqp and Rakudo, it seems that an if/elsif/else set of blocks gets parsed out together as one control statement.
Rule for if in nqp
rule statement_control:sym<if> {
<sym>\s
<xblock>
[ 'elsif'\s <xblock> ]*
[ 'else'\s <else=.pblock> ]?
}
(https://github.com/perl6/nqp/blob/master/src/NQP/Grammar.nqp#L243, as of August 5, 2017)
Rule for if in Rakudo
rule statement_control:sym<if> {
$<sym>=[if|with]<.kok> {}
<xblock(so ~$<sym>[0] ~~ /with/)>
[
[
| 'else'\h*'if' <.typed_panic: 'X::Syntax::Malformed::Elsif'>
| 'elif' { $/.typed_panic('X::Syntax::Malformed::Elsif', what => "elif") }
| $<sym>='elsif' <xblock>
| $<sym>='orwith' <xblock(1)>
]
]*
{}
[ 'else' <else=.pblock(so ~$<sym>[-1] ~~ /with/)> ]?
}
(https://github.com/rakudo/rakudo/blob/nom/src/Perl6/Grammar.nqp#L1450 as of August 5, 2017)

Related

Tcl : Guaranteed evaluation sequence of a boolean expression?

Let's say I have a conditional Tcl expression that is a boolean combination of steps.
Will the expression always be evaluated left to right (excluding parentheses)?
If the expression becomes true will the rest of the evaluation stop?
I have this piece of code that parses a file and conditionally replaces stuff in the lines.
set fp [ open "file" ]
set data [ read $fp ]
close $fp
foreach line [ split $data \n ] {
if { $enable_patch && [ regsub {<some_pattern>} $line {<some_other_pattern>} line ]} {
puts $outfp $line
<do_some_more_stuff>
}
}
So my issue here is that unless enable_patch is true, I don't want the line to be modified. Now my test shows that the code is deterministic in Tcl 8.5 on Linux. But I am wondering if this would break under other conditions/ versions/ OSes.
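Yes, the || and && operators are "short-circuiting" operators in Tcl. That means you can rely on them being evaluated left-to-right, and that evaluation will stop as soon as the value of the expression is known. This holds as long as the expression is enclosed in braces, as it is in your if; in an unbraced expression the Tcl parser performs the command substitutions before expr ever sees them, so regsub would run regardless.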

Scope of nested regexes in Perl 6

Is it possible to define nested regexes in arbitrary sequence?
The following program works as expected:
my regex letter { <[a b]> }
my regex word { <letter> + }
my $string = 'abab';
$string ~~ &word;
put $/; # abab
If I swap the first two lines, the compiler produces an error.
Is there a way to override this restriction (without using grammars)?
You can put the regex in a variable you declare up front but later set:
my $letter;
my regex word { <$letter> + }
$letter = regex { <[a b]> }
my $string = 'abab';
$string ~~ &word;
put $/; # abab

Perl6 grammars: match full line

I've just started exploring perl6 grammars. How can I make up a token "line" that matches everything between the beginning of a line and its end? I've tried the following without success:
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
grammar sample {
token TOP {
<line>
}
token line {
^^.*$$
}
}
my $match = sample.parse($txt);
say $match<line>[0];
I can see two problems in your grammar. The first is in the token line: ^^ and $$ anchor to the start and end of a line, but . also matches newlines, so .* can run across several lines. To illustrate, let's use a plain regex, without a grammar, first:
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
if $txt ~~ m/^^.*$$/ {
say "match";
say $/;
}
Running that, the output is:
match
「row 1
row 2
row 3」
You see that the regex matches more than what is desired. However, that is not the first problem you hit: because of ratcheting, matching with a token will not work at all:
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
my regex r {^^.*$$};
if $txt ~~ &r {
say "match regex";
say $/;
} else {
say "does not match regex";
}
my token t {^^.*$$};
if $txt ~~ &t {
say "match token";
say $/;
} else {
say "does not match token";
}
Running that, the output is:
match regex
「row 1
row 2
row 3」
does not match token
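The likely reason is that a token does not backtrack: .* consumes the whole string, including the final newline, and $$ cannot match after a trailing newline at the very end of the string, so the match fails. A regex can backtrack and give that newline back, which is why it succeeds. What you want instead is to match everything except a newline, which is \N*
The following grammar mostly solves your issue: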
grammar sample {
token TOP {<line>}
token line {\N+}
}
However, line now matches only the first line, and because .parse has to consume the whole string, TOP must allow for more than one line. What you probably want is a line followed by optional vertical whitespace, repeated several times. (In your case there is a newline at the end of the string, but I guess you would like to capture the last line even if there is no trailing newline.)
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
grammar sample {
token TOP {[<line>\v?]*}
token line {\N+}
}
my $match = sample.parse($txt);
for $match<line> -> $l {
say $l;
}
The output of that script is:
「row 1」
「row 2」
「row 3」
Also, to help you use and debug grammars, there are two really useful modules: Grammar::Tracer and Grammar::Debugger. Just include them at the beginning of the script. Tracer shows a colorful tree of the matching done by your grammar. Debugger lets you see it matching step by step in real time.
Your original approach can be made to work via
grammar sample {
token TOP { <line>+ %% \n }
token line { ^^ .*? $$ }
}
Personally, I would not try to anchor line and use \N instead as already suggested.
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
grammar sample {
token TOP {
<line>+
}
token line {
\N+ \n
}
}
my $match = sample.parse($txt);
say $match<line>[0];
Or if you can be specific about the line:
grammar sample {
token TOP {
<line>+
}
rule line {
\w+ \d
}
}
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
grammar sample {
token TOP { <line> }
token line { .* }
}
for $txt.lines -> $line {
## A single line of text....
say $line;
## Parse line of text to find match obj...
my $match = sample.parse($line);
say $match<line>;
}

awk: "default" action if no pattern was matched?

I have an awk script which checks for a lot of possible patterns, doing something for each pattern. I want something to be done in case none of the patterns was matched. i.e. something like this:
/pattern 1/ {action 1}
/pattern 2/ {action 2}
...
/pattern n/ {action n}
DEFAULT {default action}
Where of course, the "DEFAULT" line is not awk syntax; I wish to know whether there is such a syntax (like there usually is in switch/case statements in many programming languages).
Of course, I can always add a "next" command after each action, but this is tedious in case I have many actions, and more importantly, it prevents me from matching the line to two or more patterns.
You could invert the match using the negation operator ! so something like:
!/pattern 1|pattern 2|pattern n/{default action}
But that's pretty nasty for n>2. Alternatively you could use a flag:
{f=0}
/pattern 1/ {action 1;f=1}
/pattern 2/ {action 2;f=1}
...
/pattern n/ {action n;f=1}
f==0{default action}
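To make the flag idea concrete, here is a minimal runnable sketch. The sample input, the patterns /a/ and /c/, and the function name flag_default are made up for illustration. It assumes a POSIX shell and any standard awk. The line "ac" matches both patterns, so both actions fire, and the default rule runs only for the line that matches neither.

```shell
# Hypothetical sample input: the line "ac" matches both patterns.
input='a
ac
b'

# Reset the flag f for every record, set it in each matching action,
# and run the last rule only when no action fired for that record.
flag_default() {
  printf '%s\n' "$input" | awk '
    { f = 0 }
    /a/ { print "found a in " $0; f = 1 }
    /c/ { print "found c in " $0; f = 1 }
    f == 0 { print "default for " $0 }
  '
}

flag_default
```

Unlike the next-based approaches, nothing stops a record from being handled by several rules, which is the behaviour the question asked to keep.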
GNU awk has switch statements:
$ cat tst1.awk
{
switch($0)
{
case /a/:
print "found a"
break
case /c/:
print "found c"
break
default:
print "hit the default"
break
}
}
$ cat file
a
b
c
d
$ gawk -f tst1.awk file
found a
hit the default
found c
hit the default
Alternatively with any awk:
$ cat tst2.awk
/a/ {
print "found a"
next
}
/c/ {
print "found c"
next
}
{
print "hit the default"
}
$ awk -f tst2.awk file
found a
hit the default
found c
hit the default
Use the "break" or "next" as/when you want to, just like in other programming languages.
Or, if you like using a flag:
$ cat tst3.awk
{ DEFAULT = 1 }
/a/ {
print "found a"
DEFAULT = 0
}
/c/ {
print "found c"
DEFAULT = 0
}
DEFAULT {
print "hit the default"
}
$ gawk -f tst3.awk file
found a
hit the default
found c
hit the default
It's not exactly the same semantics as a true "default" though, so its usage like that could be misleading. I wouldn't normally advocate using all-upper-case variable names, but lower-case "default" would clash with the gawk keyword, so the script wouldn't be portable to gawk in future.
As mentioned above by tue, my understanding of the standard approach in Awk is to put next at each alternative and then have a final action without a pattern.
/pattern1/ { action1; next }
/pattern2/ { action2; next }
{ default-action }
The next statement will guarantee that no more patterns are considered for the line in question. And the default-action will always happen if the previous ones don't happen (thanks to all the next statements).
There is no "maintenance-free" solution for a DEFAULT branch in awk.
The first possibility I would suggest is to end each pattern-matching branch with a 'next' statement, which then acts like a break statement. Add a final action at the end that matches everything; that is the DEFAULT branch.
The other possibility would be:
Reset a flag at the start of every record with a first rule such as { NONDEFAULT=0 }.
Set the flag in each branch that has a pattern match (i.e. your non-default branches), e.g. start your actions with NONDEFAULT=1;
Add a last action at the end (the default branch) and give it the condition NONDEFAULT==0 instead of a regular expression match.
A fairly clean, portable workaround is using an if statement:
Instead of:
pattern1 { action1 }
pattern2 { action2 }
...
one could use the following:
{
if ( pattern1 ) { action1 }
else if ( pattern2 ) { action2 }
else { here is your default action }
}
As mentioned above, GNU awk has switch statements, but other awk implementations don't, so using switch would not be portable.
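As a rough sketch of the if/else form with real patterns, the snippet below reuses the a/b/c/d input from the earlier answers. The function name if_chain_default is made up for illustration, and a POSIX shell with any standard awk is assumed.

```shell
# Feed the four sample lines a, b, c, d through one unconditional rule
# whose body is an if / else if / else chain.
if_chain_default() {
  printf 'a\nb\nc\nd\n' | awk '
    {
      if ($0 ~ /a/)      { print "found a" }
      else if ($0 ~ /c/) { print "found c" }
      else               { print "hit the default" }
    }
  '
}

if_chain_default
```

Note that the chain is exclusive: a line matching both /a/ and /c/ would only trigger the first branch, so this form does not cover the case where one line should be handled by two or more patterns.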

Lex : line with one character but spaces

I have sentences like:
" a"
"a "
" a "
I would like to catch all these examples (with lex), but I don't know how to specify the beginning of the line.
I'm not totally sure what exactly you're looking for, but the regex symbol to specify matching the beginning of a line in a lex definition is the caret:
^
If I understand correctly, you're trying to pull the "a" out as the token, but you don't want to grab any of the whitespace? If this is the case, then you just need something like the following:
[\n\t\r ]+ {
// do nothing
}
"a" {
assignYYText( yylval );
return aToken;
}