How can I make Perl6 (MoarVM / Rakudo) warn about all missing semicolons? - raku

In Perl 5, it's best to use
use strict;
use warnings;
to ask the compiler to complain about missing semicolons, undeclared variables, etc.
I have been informed by citizens of the Perl community here on SO that Perl 6 uses strict by default, and this seems after testing to be the case.
Semicolons aren't required for the last statement in a block, but if I extend the block later, I'll be chagrinned when my code doesn't work because it's the same block (and also I want semicolons everywhere because it's, like, consistent and stuff).
My assumption is that Perl 6 doesn't even look at semicolons for the last statement in a block, but I'm still curious: is there a way to make it stricter yet?

Rather than enforce the extra semi-colon, Rakudo does try to give you a good error/hint if you do add to your block and forget to separate statements.
Typically I get "Two terms in a row across lines (missing semicolon or comma?)" when this happens.

Related

How to add a small bit of context in a grammar?

I am tasked to parse (and transform) a code of a computer language, that has a slight quirk in its rules, at least I see it this way. To be exact, the compiler treats new lines (as well as semicolons) as statement separators, but other than that (e.g. inside the statement) it treats them as spacers (whitespace).
As an example, this code:
try
local x = 5 / 0
catch (i)
print(i + "\n")
is proved to be equivalent to this:
try local x = 5 / 0 catch (i) print(i + "\n")
I don't see how I can express such a rule in EBNF, or specifically in Lark EBNF dialect. I mean in a sensible way. I probably could define all possible newline positions inside all statements, but it would be cumbersome and error-prone.
I wish to find a way to treat newlines contextually. Is there a proven method for this, preferably within Python/Lark domain? If I have to modify the parser for that purpose, then where should I start?
Or if I misunderstood something in this language in particular or in machine language parsing in general, or my statement of the problem is wrong, I'd also be happy to get educated.
(As you may guess, the language in question has a well proven implementation, but no officially defined grammar. Also, it is Squirrel, for all that it matters.)
The relevant quote from the "specification" is this:
A squirrel program is a simple sequence of statements.:
stats := stat [';'|'\n'] stats
[...] Statements can be separated with a new line or ‘;’ (or with the keywords case or default if inside a switch/case statement), both symbols are not required if the statement is followed by ‘}’.
These are relatively complex rules and in their totality not context free if newlines can also be ignored everywhere else. Note however that in my understanding the text implies that ; or \n are required when no of the other cases apply. That would make your example illegal. That probably means that the BNF as written is correct, e.g. both ; and \n are optionally everywhere. In that case you can (for lark) just put an %ignore "\n" statement and it should work fine.
Also, lark should not complain if you both ignore the \n and use it in a rule: Where useful it will match it in a rule, otherwise it will just ignore it. Note however that this breaks if you use a Terminal that includes the \n (e.g. WS or /\s/). Just have \n as an extra case.
(For the future: You will probably get faster response for lark questions if you ask over on gitter or at least put a link to SO there.)

What is Nim's approach to distinguish between commands?

I'm trying to understand what kind of approach is used by Nim to distinguish between commands.
There's the "separatist approach" where a semicolon just separates commands (used in Pascal for example), the "terminist approach" where a semicolon completely terminates the command (used in C, C++, Java, etc.) and the "liberal approach" where the programmer can decide whether or not to use a semicolon.
My thoughts are that Nim belongs to the liberal approach, but that would mean that semicolons could be added at the end of commands and Nim doesn't support that.
Any other thoughts?
I'm trying to understand what kind of approach is used by Nim to
distinguish between commands.
Why? This doesn't help in any way ... Nim has a complex syntax that doesn't readily fit into such boxes.
Your question is confused in several ways. First, what is a "command"? Semicolons separate statements or expressions. The difference between your categories matter mostly in expression languages--it determines whether the value of a block ending with a semicolon is the bottom value, or the value of the previous expression. "separatist" languages are confusing, error-prone, bad design, and obsolete--the mistakes of Algol are ancient history. Second, the categories don't make a lot of sense in languages like Nim where end-of-line is syntactically significant--a "missing" semicolon before a newline isn't really missing because the newline serves the same function. Thirdly, Nim most certainly does allow semicolons at the ends of expressions or statements (but it doesn't allow empty statements or expressions, so ;; is disallowed).
Consider:
proc a: int = 5 # returns 5
proc b: int = 5; # syntax error
proc c: int = # returns 5
5
proc d: int = # returns 5
5;
proc e: int = # syntax error
5;;
Since the ; that differentiates c and d makes no semantic difference, one could say that it's closer to "liberal" than to "separatist" or "terminist", but it isn't very liberal ... you can't just put semicolons anywhere.
Nim, like Python, is a whitespace-aware language. It uses newlines as statement separators and indentation to produce block structures.
Not all languages have visible statement separators, although some allow a visible statement separator in some circumstances. (For example, in Python simple statements can be separated by semicolons, but not compound statements.)
"There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy" (Hamlet I.5:159–167)

What's the difference between parenthesis $() and curly bracket ${} syntax in Makefile?

Is there any differences in invoking variables with syntax ${var} and $(var)? For instance, in the way the variable will be expanded or anything?
There's no difference – they mean exactly the same (in GNU Make and in POSIX make).
I think that $(round brackets) look tidier, but that's just personal preference.
(Other answers point to the relevant sections of the GNU Make documentation, and note that you shouldn't mix the syntaxes within a single expression)
The Basics of Variable References section from the GNU make documentation state no differences:
To substitute a variable's value, write a dollar sign followed by the
name of the variable in parentheses or braces: either $(foo) or
${foo} is a valid reference to the variable foo.
As already correctly pointed out, there is no difference but be be wary not to mix the two kind of delimiters as it can lead to cryptic errors like in the GNU make example by unomadh.
From the GNU make manual on the Function Call Syntax (emphasis mine):
[…] If the arguments themselves contain other function calls or variable references, it is wisest to use the same kind of delimiters for all the references; write $(subst a,b,$(x)), not $(subst a,b,${x}). This is because it is clearer, and because only one type of delimiter is matched to find the end of the reference.
The ${} style lets you test the make rules in the shell, if you have the corresponding environment variables set, since that is compatible with bash.
Actually, it seems to be fairly different:
, = ,
list = a,b,c
$(info $(subst $(,),-,$(list))_EOL)
$(info $(subst ${,},-,$(list))_EOL)
outputs
a-b-c_EOL
md/init-profile.md:4: *** unterminated variable reference. Stop.
But so far I only found this difference when the variable name into ${...} contains itself a comma. I first thought ${...} was expanding the comma not as part as the value, but it turns out i'm not able to hack it this way. I still don't understand this... If anyone had an explanation, I'd be happy to know !
It makes a difference if the expression contains unbalanced brackets:
${info ${subst ),(,:-)}}
$(info $(subst ),(,:-)))
->
:-(
*** insufficient number of arguments (1) to function 'subst'. Stop.
For variable references, this makes a difference for functions, or for variable names that contain brackets (bad idea)

In awk, is there a functional difference between putting conditions outside the brackets or inside?

As far as I can tell, this:
$2 > 50{print}
functions exactly the same way as this:
{if ($2 > 50) print}
When should I use one vs the other, or is it a matter of style?
2nd one is much more readable in my opinion (especially for non awk gurus who may have to maintain you code).
It's mostly style. The first example is canonical AWK. There are, of course, times when an if statement is required.
Consider that:
$2 > 50 {foo; print}
takes 20 characters versus 27 for the following:
{if ($2 > 50) {foo; print}}
If you remove spaces, it would be 16 versus 22 characters.
So the former is more desirable if you are interested in economy (both in terms of actual characters and visually).
As potong points out, the if form requires curly braces if there are more than one statement (as I have shown in my second example above). If there is only one statement, they are optional. This is the source of subtle bugs if it's written without and additional statements are later added without adding braces. Also, for the reader of a script, the intent is much clearer if the braces are included.
Yes, they are the same. As with most syntax alternatives, there are trade offs.
I mostly agree with John3136 that 2nd form has a higher likelyhood of being understood by someone with background in c, c++, C#, java, and others.
And as Dennis Williamson points out, there are times you have to use the if (...) form.
For one-lines, I think the first form is useful, acceptable, and helps reinforce the primary design of awk, i.e. [pattern]-[action] (being optional or providing a default behaviour).
If you look at comp.lang.awk, you'll see that the long-timers mostly prefer style 1.
Large blocks of code in style 1, to my eye, are difficult to read, but do have the look of lex code.
So.... depends ;-) .... are you writing code that will ultimately be maintained by likely non-awk experts? Then stick w style 2. Writing for yourself? Use both so you don't get rusty!
I hope this helps.
Yes they are the same.
On the first form, you can omit { print } because it's the default action. So, that works too:
$2 > 50

Lack of block comments in VB .NET?

Just a question of interest: Does anyone know why there's no block comment capability in VB .NET? (Unless there really is - but I've never yet come across it.)
It is a side-effect of the Visual Basic syntax, a new-line terminates a statement. That makes a multi-line comment pretty incompatible with the basic way the compiler parses the language. Not an issue in the curly brace languages, new-lines are just white space.
It has never been a real problem, Visual Basic has had strong IDE support for a very long time. Commenting out multiple lines is an IDE feature, Edit + Advanced + Comment Selection.
Totally abusing compiler directives here... but:
#If False Then
Comments
go
here
#End If
You don't get the benefits of proper code coloration (it doesn't show in green when using the default color scheme) and the implicit line-continuation system automatically indents lines in a paragraph starting at the second line. But the compiler will ignore the text.
As can be read in “Comments in Code“ there isn't any other way:
If your comment requires more than one line, use the comment symbol on each line, as the following example illustrates.
' This comment is too long to fit on a single line, so we break
' it into two lines. Some comments might need three or more lines.
Similarly, the help on the REM statement states:
Note:
You cannot continue a REM statement by using a line-continuation sequence (_). Once a comment begins, the compiler does not examine the characters for special meaning. For a multiple-line comment, use another REM statement or a comment symbol (') on each line.
Depending on how many lines are to be ignored, one can use compiler directives instead. It may not be technically equivalent to comments (you don't get the syntax coloring of comments, for example), but it gets the job done without commenting many lines individually. So you just add 3 more lines of code.
#Const COMMENT = "C"
'basically a false statement
#If COMMENT = "Y" Then
'code to be commented goes between #If and #End If
MsgBox('Commenting failed!')
#End If
This is assuming the purpose is for ignoring blocks of code instead of adding documentation (what "comments" are actually used for, but I also wouldn't mind using compiler directives for that).
The effort required however, makes this method inconvenient when there are just around 10 lines to comment.
Reference: http://msdn.microsoft.com/en-us/library/tx6yas69.aspx