ANTLR global rule scope declaration vs @members declaration

Which one would you prefer for declaring a variable, and in which case: a global scope or a @members declaration? It seems to me that they can serve the same purpose.
UPDATE: here is a grammar to explain what I mean.
grammar GlobalVsScope;
scope global{
int i;
}
@lexer::header{package org.inanme.antlr;}
@parser::header{package org.inanme.antlr;}
@parser::members {
int j;
}
start
scope global;
@init{
System.out.println($global::i);
System.out.println(j);
}:R EOF;
R:'which one';

Note that besides global (ANTLR) scopes, you can also have local rule-scopes, like this:
grammar T;
options { backtrack=true; }
parse
scope { String x; }
: 'foo'? ID {$parse::x = "xyz";} rule*
| 'foo' ID
;
rule
: ID {System.out.println("x=" + $parse::x);}
;
The only time I'd consider using local rule-scopes is when there are a lot of predicates, or when global backtracking is enabled (which results in all rules having predicates in front of them). In that case, you could create a member variable String x (or define it in a global scope) and set it in the parse rule, but you might change this instance/scope variable, after which the parser could backtrack, and this backtracking will not reset the global variable to its original state! The locally scoped variable will also not be "unset", but that is likely less of a risk, since it is local to a single rule.
To summarize: yes, you're right, global scopes and member/instance variables are much alike. But I'd sooner opt for member variables because of the friendlier syntax.
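The backtracking hazard described above can be sketched in plain Java. This is a hand-written illustration of the general problem (rewinding the input does not rewind side effects), not a reproduction of the code ANTLR generates; the class BacktrackDemo, its token list and its helper methods are all hypothetical.

```java
import java.util.List;

// Hypothetical, hand-written stand-in for a backtracking parser; not ANTLR output.
class BacktrackDemo {
    // Plays the role of a variable declared in @parser::members.
    String x = "initial";

    private final List<String> tokens;
    private int pos = 0;

    BacktrackDemo(List<String> tokens) {
        this.tokens = tokens;
    }

    // First alternative: 'foo' ID ID. The assignment runs before we know the alternative matches.
    private boolean tryFirstAlternative() {
        int start = pos;
        if (match("foo") && match("ID")) {
            x = "xyz"; // side effect on the member variable
            if (match("ID")) {
                return true;
            }
        }
        pos = start; // backtracking rewinds the input position only, not x
        return false;
    }

    // Second alternative: 'foo' ID.
    private boolean trySecondAlternative() {
        int start = pos;
        if (match("foo") && match("ID")) {
            return true;
        }
        pos = start;
        return false;
    }

    boolean parse() {
        return tryFirstAlternative() || trySecondAlternative();
    }

    // Consumes the next token if it equals the expected one.
    private boolean match(String expected) {
        if (pos < tokens.size() && tokens.get(pos).equals(expected)) {
            pos++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        BacktrackDemo parser = new BacktrackDemo(List.of("foo", "ID"));
        System.out.println(parser.parse() + " x=" + parser.x);
    }
}
```

For the input foo ID, the second alternative is the one that matches, yet x still holds the value assigned while the first alternative was being tried.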

Related

Are variable types permanent in Statically Typed Languages?

My understanding is that variable types are "checked" before run-time for statically typed languages.
I take this to mean that a var of type int can't ever be type string? Does this mean variable type can't change (within the same scope) throughout the program (in a statically typed language)?
Somebody mentioned "variable shadowing" but I'm pretty sure that only applies in different scopes.
var i = 'hi';
function foo() {
var i = 1;
}
My understanding of var shadowing is that i in the global scope is a different variable than i in the foo function scope and therefore their types are permanent and unrelated (in a static language, which JS is not). Is that right?
Somebody mentioned "variable shadowing" but I'm pretty sure that only applies in different scopes.
It depends on your definition of "scope". Rust, for example, allows the kind of shadowing that you're talking about, even within a single block:
fn main() {
let a: &str = "hello";
let a: i32 = 3;
}
It could be argued that the declaration of a shadow variable implicitly ends the scope of the previous variable. But to quote from the Rust book:
Note that shadowing a name does not alter or destroy the value it was bound to, and the value will continue to exist until it goes out of scope, even if it is no longer accessible by any means.
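For comparison, here is a minimal sketch of the cross-scope case from the question in Java, which is statically typed. Java rejects redeclaring a local variable while another local of the same name is in scope, but a local may shadow a field, which is the closest analogue of the global i and function-local i above. The class and method names are hypothetical.

```java
// Hypothetical names; a sketch of shadowing across scopes in a statically typed language.
class ShadowDemo {
    // Outer variable: its static type is String for its whole lifetime.
    static String i = "hi";

    static int foo() {
        // A different variable that happens to share the name; its static type is int.
        int i = 1;
        return i + 1; // refers to the local int, not the field
    }

    static String outer() {
        return i; // no local i here, so this refers to the String field
    }

    public static void main(String[] args) {
        System.out.println(foo());
        System.out.println(outer());
    }
}
```

Each i keeps the type it was declared with; the name is simply resolved to whichever declaration is nearest in scope.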

What scope does ":my $foo" have and what is it used for?

With a regex, token or rule, it's possible to define a variable like so:
token directive {
:my $foo = "in command";
<command> <subject> <value>?
}
There is nothing about it in the language documentation here, and very little in S05 - Regexes and Rules, to quote:
Any grammar regex is really just a kind of method, and you may declare variables in such a routine using a colon followed by any scope declarator parsed by the Perl 6 grammar, including my, our, state, and constant. (As quasi declarators, temp and let are also recognized.) A single statement (up through a terminating semicolon or line-final closing brace) is parsed as normal Perl 6 code:
token prove-nondeterministic-parsing {
:my $threshold = rand;
'maybe' \s+ <it($threshold)>
}
I get that regexen within grammars are very similar to methods in classes; I get that you can start a block anywhere within a rule and if parsing successfully gets to that point, the block will be executed - but I don't understand what on earth this thing is for.
Can someone clearly define what its scope is, explain what need it fulfills, and give the typical use case?
What scope does :my $foo; have?
:my $foo ...; has the lexical scope of the rule/token/regex in which it appears.
(And :my $*foo ...; -- note the extra * signifying a dynamic variable -- has both the lexical and dynamic scope of the rule/token/regex in which it appears.)
What this is used for
Here's what happens without this construct:
regex scope-too-small { # Opening `{` opens a regex lexical scope.
{ my $foo = / bar / } # Block with its own inner lexical scope.
$foo # ERROR: Variable '$foo' is not declared
}
grammar scope-too-large { # Opening `{` opens lexical scope for grammar.
my $foo = / bar / ;
regex r1 { ... } # `$foo` is recognized inside `r1`...
...
regex r999 { ... } # ...but also inside r999
}
So the : ... ; syntax is used to get exactly the desired scope -- neither too broad nor too narrow.
Typical use cases
This feature is typically used in large or complex grammars to avoid lax scoping (which breeds bugs).
For a suitable example of precise lexical-only scoping, see the declaration and use of @extra_tweaks in token babble as defined in a current snapshot of Rakudo's Grammar.nqp source code.
P6 supports action objects. These are classes with methods corresponding one-to-one with the rules in a grammar. Whenever a rule matches, it calls its corresponding action method. Dynamic variables provide precisely the right scoping for declaring variables that are scoped to the block (method, rule, etc.) they're declared in both lexically and dynamically, which latter means they're available in the corresponding action method too. For an example of this, see the declaration of @*nibbles in Rakudo's Grammar module and its use in Rakudo's Actions module.

Objective-C Variable Declaration Confusion

I am confused as to why I am allowed to do this (the if statement is to just show scope):
int i = 0;
if(true)
{
float i = 1.1;
}
I have a C# background, and something like this is not allowed there. Basically, the programmer is redeclaring the variable 'i', thus giving 'i' a new meaning. Any insight would be appreciated.
Thanks!
In C (and by extension, in Objective C) it is allowed to declare local variables in the inner scope that would hide variables of the outer scope. You can get rid of if and write this:
int i = 0;
{
// Here, the outer i becomes inaccessible
float i = 1.1;
{
int i = 2;
printf("%d", i); // 2 is printed
}
}
The C# standard decided against that, probably because it has a high probability of being an error, but C/Objective-C does not have a problem with it.
Turn on "Hidden local variables" in your build settings to get a warning.
You're partially correct, yes, it gives i a new meaning, but it's not redeclaring the variable. It's another variable. But since the identifier is the same, the current scope will "hide" the previous, so any use of i inside that block refers to the float.
You're not redefining i, so much as shadowing i. This only works when the i's are declared at different levels of scope. C# allows shadowing, but not for if statements / switch statements, while C/C++/Objective-C allow such shadowing.
After the inner i goes out of scope, the identifier i will again refer to the int version of i. So it's not changing what the original i refers to. Shadowing a variable is generally not something you want to do (unless you're careful, shadowing is likely a mistake, especially for beginners).

ANTLR: Define new channel in grammar

I know it is possible to switch between the default and hidden token channels in an ANTLR grammar, but let's say I want a third channel. How can I define a new token channel in the grammar? For instance, let's say I want a channel named ALTERNATIVE.
They're just final ints in the Token class, so you could simply introduce an extra int in your lexer like this:
grammar T;
@lexer::members {
public static final int ALTERNATIVE = HIDDEN + 1;
}
// parser rules ...
FOO
: 'foo' {$channel=ALTERNATIVE;}
;
// other lexer rules ...
A related Q&A: How do I get an Antlr Parser rule to read from both default AND hidden channel
For the C target you can use
//This must be assigned somewhere
@lexer::context {
ANTLR3_UINT32 defaultChannel;
}
TOKEN : 'blah' {$channel=defaultChannel;};
This gets reset after every rule so if you want a channel assignment to persist across rules you may have to override nextTokenStr().
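Since a channel is just an integer tag carried by each token, consuming a custom channel amounts to filtering on that integer. The sketch below shows the idea with stand-in types only: ChannelDemo, Tok and onChannel are hypothetical and are not part of the ANTLR runtime, and the numeric values chosen for DEFAULT_CHANNEL and HIDDEN are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; this is not the ANTLR runtime API.
class ChannelDemo {
    static final int DEFAULT_CHANNEL = 0;      // assumed value, for illustration only
    static final int HIDDEN = 99;              // assumed value, for illustration only
    static final int ALTERNATIVE = HIDDEN + 1; // the extra channel, defined as in the grammar above

    // Minimal token: some text plus the channel it was emitted on.
    static final class Tok {
        final String text;
        final int channel;

        Tok(String text, int channel) {
            this.text = text;
            this.channel = channel;
        }
    }

    // Returns the text of every token emitted on the requested channel, in input order.
    static List<String> onChannel(List<Tok> tokens, int channel) {
        List<String> out = new ArrayList<>();
        for (Tok t : tokens) {
            if (t.channel == channel) {
                out.add(t.text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> tokens = List.of(
                new Tok("x", DEFAULT_CHANNEL),
                new Tok(" ", HIDDEN),
                new Tok("foo", ALTERNATIVE),
                new Tok("y", DEFAULT_CHANNEL));
        System.out.println(onChannel(tokens, DEFAULT_CHANNEL));
        System.out.println(onChannel(tokens, ALTERNATIVE));
    }
}
```

A parser reading only the default channel would never see foo here; anything that wants those tokens has to ask for the ALTERNATIVE channel explicitly.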

NullPointerException with ANTLR text attribute

I have a problem that I've been stuck on for a while and I would appreciate some help if possible.
I have a few rules in an ANTLR tree grammar:
block
: compoundstatement
| ^(VAR declarations) compoundstatement
;
declarations
: (^(t=type idlist))+
;
idlist
: IDENTIFIER+
;
type
: REAL
| i=INTEGER
;
I have written a Java class VarTable that I will insert all of my variables into as they are declared at the beginning of my source file. The table will also hold their variable types (i.e. real or integer). I'll also be able to use this variable table to check for undeclared variables, duplicate declarations, etc.
So basically I want to be able to send the variable type down from the 'declarations' rule to the 'idlist' rule and then loop through every identifier in the idlist rule, adding them to my variable table one by one.
The major problem is that I get a NullPointerException when I try to access the 'text' attribute of the $t variable in the 'declarations' rule (this is the one which refers to the type).
And yet if I try and access the 'text' attribute of the $i variable in the 'type' rule, there's no problem.
I have looked at the place in the Java file where the NullPointerException is being generated and it still makes no sense to me.
Is it a problem with the fact that there could be multiple types, because the rule is (^(type idlist))+ ?
I have the same issue when I get down to the idlist rule, because I'm unsure how I can write an action that will allow me to loop through all of the IDENTIFIER tokens found.
Grateful for any help or comments.
Cheers
You can't reference the attributes from production rules like you tried inside tree grammars, only in parser (or combined) grammars (they're different objects!). Note that INTEGER is not a production rule, just a "simple" token (terminal). That's why you can invoke its .text attribute.
So, if you want to get hold of the text of the type rule in your tree grammar and print it in your declarations rule, you could do something like this:
tree grammar T;
...
declarations
: (^(t=type idlist {System.out.println($t.returnValue);}))+
;
...
type returns [String returnValue]
: i=INTEGER {$returnValue = "[" + $i.text + "]";}
;
...
But if you really want to do it without specifying a return object, you could do something like this:
declarations
: (^(t=type idlist {System.out.println($t.start.getText());}))+
;
Note that type returns an instance of a TreeRuleReturnScope which has an attribute called start which in its turn is a CommonTree instance. You could then call getText() on that CommonTree instance.
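Once the type text is available in the declarations rule, it can be handed to the variable table together with the identifiers. The asker's VarTable class is not shown, so the following is only a hypothetical sketch of what such a table might look like; the method names declare, declareAll, isDeclared and typeOf are assumptions, and how the identifier texts are collected in idlist is left open.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a symbol table; the asker's real VarTable is not shown in the question.
class VarTable {
    // Maps each declared identifier to its type name, e.g. "integer" or "real".
    private final Map<String, String> types = new LinkedHashMap<>();

    // Records one variable; returns false (and keeps the first type) on a duplicate declaration.
    boolean declare(String name, String type) {
        if (types.containsKey(name)) {
            return false;
        }
        types.put(name, type);
        return true;
    }

    // Declares every identifier of one "type idlist" group; returns how many were duplicates.
    int declareAll(String type, List<String> names) {
        int duplicates = 0;
        for (String name : names) {
            if (!declare(name, type)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    boolean isDeclared(String name) {
        return types.containsKey(name);
    }

    // Returns the recorded type, or null for an undeclared variable.
    String typeOf(String name) {
        return types.get(name);
    }

    public static void main(String[] args) {
        VarTable table = new VarTable();
        table.declareAll("integer", List.of("a", "b"));
        int duplicates = table.declareAll("real", List.of("b", "c"));
        System.out.println(duplicates + " " + table.typeOf("b") + " " + table.isDeclared("z"));
    }
}
```

Returning a duplicate count instead of throwing keeps the table usable for reporting several declaration errors in one pass; throwing on the first duplicate would be an equally reasonable choice.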