What is Nim's approach to distinguishing between commands?

I'm trying to understand what kind of approach is used by Nim to distinguish between commands.
There's the "separatist approach", where a semicolon just separates commands (used in Pascal, for example); the "terminist approach", where a semicolon completely terminates the command (used in C, C++, Java, etc.); and the "liberal approach", where the programmer can decide whether or not to use a semicolon.
My thought is that Nim takes the liberal approach, but that would mean semicolons could be added at the end of commands, and Nim doesn't support that.
Any other thoughts?

I'm trying to understand what kind of approach is used by Nim to distinguish between commands.
Why? This doesn't help in any way ... Nim has a complex syntax that doesn't readily fit into such boxes.
Your question is confused in several ways. First, what is a "command"? Semicolons separate statements or expressions. The difference between your categories matters mostly in expression languages: it determines whether the value of a block ending with a semicolon is the bottom value or the value of the previous expression. "Separatist" languages are confusing, error-prone, bad design, and obsolete; the mistakes of Algol are ancient history.
Second, the categories don't make a lot of sense in languages like Nim where end-of-line is syntactically significant: a "missing" semicolon before a newline isn't really missing, because the newline serves the same function.
Third, Nim most certainly does allow semicolons at the ends of expressions or statements (but it doesn't allow empty statements or expressions, so ;; is disallowed).
Consider:
proc a: int = 5 # returns 5
proc b: int = 5; # syntax error
proc c: int = # returns 5
  5
proc d: int = # returns 5
  5;
proc e: int = # syntax error
  5;;
Since the ; that differentiates c and d makes no semantic difference, one could say that it's closer to "liberal" than to "separatist" or "terminist", but it isn't very liberal ... you can't just put semicolons anywhere.

Nim, like Python, is a whitespace-aware language. It uses newlines as statement separators and indentation to produce block structures.
Not all languages have visible statement separators, although some allow a visible statement separator in some circumstances. (For example, in Python simple statements can be separated by semicolons, but not compound statements.)
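A quick Python illustration of that parenthetical:

x = 1; y = 2              # fine: simple statements separated by ';'
print(x); print(y)        # also fine

# x = 1; if x: print(x)   # SyntaxError: a compound statement can't follow ';'
# if x: y = 2; z = 3      # legal, but both assignments belong to the if suite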
"There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy" (Hamlet I.5:159–167)

Related

Brace Delimiters with qq Don't Interpolate Code in Raku

Sorry if this is documented somewhere, but I haven't been able to find it. When using brace delimiters with qq, code is not interpolated:
qq.raku
#!/usr/bin/env raku
say qq{"Two plus two": { 2 + 2 }};
say qq["Two plus two": { 2 + 2 }];
$ ./qq.raku
"Two plus two": { 2 + 2 }
"Two plus two": 4
Obviously, this isn't a big deal since I can use a different set of delimiters, but I ran across it and thought I'd ask.
Update
As @raiph pointed out, I forgot to put the actual question: is this the way it's supposed to work?
The quote language "nibbler" (the bit of the grammar that eats its way through a quoted string) looks like this:
[
    <!stopper>
    [
    || <starter> <nibbler> <stopper>
    || <escape>
    || .
    ]
]*
That is, until we see a stopper, eat whichever comes first of:
A starter (the opening { in your case), followed by some internal stuff, followed by a stopper (the }); this allows for nesting of the construct inside of the string
An escape (and closure interpolation is considered a kind of escape)
Any other character
This ordering in the grammar means that a nesting of the chosen quote starter/stopper will always win over an escape. This issue was discussed during the language design; we could, after all, have reordered the alternation in the grammar to have escapes win. On balance, however, it was felt that the choice of starter/stopper was the more local decision than the general properties of the quoting language, and so should take precedence. (This is also consistent with how quote languages are constructed: we take the base quoted string grammar and mix starter/stopper methods into it.)
Obviously, this isn't a big deal since I can use a different set of delimiters, but I ran across it and thought I'd ask.
You didn't ask anything. :)
Let's say you've got some text. And you want to use double quote processing to get interpolation, except you don't want braced text to be interpolated as code. You could write, say, qq:!c '...'. But don't you think it's a lot easier to remember, write, and read qq{ ... }?
Nice little touch, right?
Which is why it's the way it is -- it's a very nice touch.
And, perhaps, why it's not documented -- it's little, and, once you encounter it, obvious what you need to do.
That said, the Q lang escapes include ones to recursively re-enter the Q lang:
say qq{"Two plus two": \qq[{ 2 + 2 }] }; # "Two plus two": 4
Does that answer your question? :)

How to add a small bit of context in a grammar?

I am tasked with parsing (and transforming) code in a computer language that has a slight quirk in its rules, at least as I see it. To be exact, the compiler treats newlines (as well as semicolons) as statement separators, but otherwise (e.g. inside a statement) it treats them as spacers (whitespace).
As an example, this code:
try
    local x = 5 / 0
catch (i)
    print(i + "\n")
turns out to be equivalent to this:
try local x = 5 / 0 catch (i) print(i + "\n")
I don't see how I can express such a rule in EBNF, or specifically in Lark's EBNF dialect, in a sensible way. I could probably define all possible newline positions inside all statements, but that would be cumbersome and error-prone.
I wish to find a way to treat newlines contextually. Is there a proven method for this, preferably within Python/Lark domain? If I have to modify the parser for that purpose, then where should I start?
Or if I misunderstood something in this language in particular or in machine language parsing in general, or my statement of the problem is wrong, I'd also be happy to get educated.
(As you may guess, the language in question has a well-proven implementation but no officially defined grammar. It is Squirrel, for what it's worth.)
The relevant quote from the "specification" is this:
A squirrel program is a simple sequence of statements:
stats := stat [';'|'\n'] stats
[...] Statements can be separated with a new line or ‘;’ (or with the keywords case or default if inside a switch/case statement), both symbols are not required if the statement is followed by ‘}’.
These are relatively complex rules, and in their totality not context-free if newlines can also be ignored everywhere else. Note, however, that in my understanding the text implies that ; or \n is required when none of the other cases apply. That would make your example illegal. That probably means that the BNF as written is correct, i.e. both ; and \n are optional everywhere. In that case you can (for lark) just add an %ignore "\n" statement and it should work fine.
Also, lark should not complain if you both ignore \n and use it in a rule: where it is useful it will match it in a rule, otherwise it will just ignore it. Note, however, that this breaks if you use a terminal that includes the \n (e.g. WS or /\s/). Just have \n as a separate case.
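For instance, here is a minimal Lark sketch of that %ignore route. The grammar is a toy made up for illustration (it is not the real Squirrel grammar; the two statement forms and the names are placeholders), with ';' as an optional trailing separator and newlines simply swallowed along with other whitespace:

from lark import Lark

parser = Lark(r"""
    start: stat*
    stat: "local" NAME "=" NUMBER ";"?
        | "print" "(" ESCAPED_STRING ")" ";"?

    %import common.CNAME -> NAME
    %import common.NUMBER
    %import common.ESCAPED_STRING
    %import common.WS
    %ignore WS    // WS includes newlines; fine here, since no rule matches \n explicitly
""")

# Both layouts from the question parse to the same tree:
print(parser.parse('local x = 5\nprint("hi")').pretty())
print(parser.parse('local x = 5 print("hi")').pretty())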
(For the future: You will probably get faster response for lark questions if you ask over on gitter or at least put a link to SO there.)

How can I make Perl6 (MoarVM / Rakudo) warn about all missing semicolons?

In Perl 5, it's best to use
use strict;
use warnings;
to ask the compiler to complain about missing semicolons, undeclared variables, etc.
I have been informed by citizens of the Perl community here on SO that Perl 6 uses strict by default, and this seems after testing to be the case.
Semicolons aren't required for the last statement in a block, but if I extend the block later, I'll be chagrinned when my code doesn't work because it's the same block (and also I want semicolons everywhere because it's, like, consistent and stuff).
My assumption is that Perl 6 doesn't even look at semicolons for the last statement in a block, but I'm still curious: is there a way to make it stricter yet?
Rather than enforcing the extra semicolon, Rakudo does try to give you a good error/hint if you later add to your block and forget to separate the statements.
Typically I get "Two terms in a row across lines (missing semicolon or comma?)" when this happens.

Variable evaluation in LaTeX

I have the following piece of LaTeX code:
\def\a{1}
\def\b{2}
\def\c{\a+\b}
\def\d{\c/2}
I expected \d to have the value 1.5. But it did not. However, adding parentheses to the definition of \c, like
\def\c{(\a+\b)}
doesn't work either, because if I use \c somewhere, it complains about the parentheses. Is there a way to evaluate \c before dividing it by 2 in the definition of \d? Like:
\def\d{\eval{\c}/2}
(I made that \eval up to show what I mean)
You could use the calc package for arithmetic operations. The package fp works with real numbers.
For discussing LaTeX problems you're kindly invited to visit tex.stackexchange.com.
You need to remember that \def is about creating replacement text. It will always give you back what you put in, quite apart from not knowing anything about maths. If we assume you are using e-TeX (likely), then for integer expressions you might do
\def\a{1}
\def\b{2}
\edef\c{\number\numexpr \a + \b \relax}
\edef\d{\number\numexpr \c / 2 \relax}
This uses the e-TeX primitive \numexpr, which does integer mathematics. For real numbers, Stefan is right that the fp package is the best approach.

How can I extract field names from SQL with Perl?

I have a series of select statements in a text file and I need to extract the field names from each select query. This would be easy if some of the fields didn't use nested functions like to_char() etc.
Given select statement fields that could have several nested parentheses like:
ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name,
Or the simple case of just base_field_name as a field, what would the regex look like in Perl?
Don't try to write a regex parser (though Perl regexes can handle nested patterns like that); use SQL::Statement::Structure.
Why not ask the target database itself how it would interpret the queries?
In Perl, one can use the DBI to query the prepared representation of a SQL query. Sometimes this is database-specific: some drivers (under the Perl DBD:: namespace) support their RDBMS's idea of describing statements in ways analogous to the RDBMS's native C or C++ API.
It can be done generically, however, as the DBI will put the names of result columns in the statement handle attribute NAME. The following, for example, has a good chance of working on any DBI-supported RDBMS:
use strict;
use warnings;
use DBI;
use constant DSN => 'dbi:YouHaveNotToldUs:dbname=we_do_not_know';
my $dbh = DBI->connect(DSN, ..., { RaiseError => 1 });
my $sth;
while (<>) {
    next unless /^SELECT/i;   # SELECTs only, assume whole query on one line
    chomp;
    my $sql = /\bWHERE\b/i ? "$_ AND 1=0" : "$_ WHERE 1=0"; # XXX ugly!
    eval {
        $sth = $dbh->prepare($sql);   # some drivers don't know column names
        $sth->execute();              # until after a successful execute()
    };
    print $@, next if $@;             # oops, problem with that one
    print join(', ', @{$sth->{NAME}}), "\n";
}
The XXX ugly! bit there tries to append an always-false condition on the SELECT, so that the SQL engine doesn't have to do any real work when you execute(). It's a terribly naive approach -- that /\bWHERE\b/i test is no more correctly identifying a SQL WHERE clause than simple regexes correctly parse out SELECT field names -- but it is likely to work.
In a somewhat related problem at the office I used:
my @SqlKeyWordList = qw/select from where .../;   # (1)
my @Candidates = split(/\s/, $SqlSelectQuery);    # (2)
my %FieldHash;                                    # (3)
for my $Word (@Candidates) {
    next if grep { lc($Word) eq $_ } @SqlKeyWordList;
    $FieldHash{$Word}++;
}
Comments:
(1) @SqlKeyWordList contains all the SQL keywords that could potentially appear in the SQL statement (we use MySQL; there are many SQL dialects, and choosing/building this list is work, look at my comments below!). If someone decided to use a keyword as a field name, you will need a regex after all (better to refactor the code).
(2) Split the SQL statement into a list of words. This is the trickiest part and WILL REQUIRE tweaking. For now it uses Perl's notion of "space" (= not in a word) to split. Splitting the field list (select a,b,c) and the "from" portion of the SQL separately might be advisable here, depending on your SQL statements.
(3) %FieldHash will contain one entry per select field (and gunk, until you have validated your @SqlKeyWordList and the regex in (2)).
Beware
there is nothing in this code that could not be done in Python (see the rough Python sketch after the regex below).
your life would be much easier if you can influence the creation of said SQL statements (e.g. make sure each field is written to a comment).
there are so many things that can/will go wrong with this parsing approach that you really should sidestep the issue entirely by changing the process (it saves time in the long run).
this is the regex we use at the office:
my @Candidates = split(/[\s\(\)\+\,\*\/\-\n\=\r]+/, $SqlSelectQuery);
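For what it's worth, here is a rough Python sketch of the same split-and-filter idea (the keyword list and names are made up for illustration, and the same caveats about keywords used as field names apply):

import re
from collections import Counter

SQL_KEYWORDS = {"select", "from", "where", "and", "or", "as"}   # deliberately incomplete

def candidate_fields(sql_select_query):
    # Split on whitespace, parentheses, commas and operators, roughly the
    # same character class as the Perl split above.
    words = re.split(r"[\s(),+*/=\r\n-]+", sql_select_query)
    return Counter(w for w in words if w and w.lower() not in SQL_KEYWORDS)

print(candidate_fields(
    "select ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name from some_table"
))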
How about splitting each line into terms (replace every parenthesis, comma and space with a newline), then sorting:
perl -ne's/[(), ]/\n/g; print' < textfile | sort -u
You'll end up with a lot of content like:
fieldname1
fieldname2
formatstring
ltrim
rtrim
to_char