How can I override ML_COMMENT in Xtext?

In Xtext 2.0, ML_COMMENT is defined in org.eclipse.xtext.common.Terminals as hidden.
I want to see comments in my grammar.
How can I undo this?

Just override the hidden clause inherited from the parent grammar, leaving ML_COMMENT out of the list so that it is no longer hidden:
grammar org.xtext.example.mydsl.MyDsl
with org.eclipse.xtext.common.Terminals
hidden(WS, SL_COMMENT) // <--- Override

Related

Is it possible to write NQP's precedence parser in Raku?

I'm trying to figure out how I can rewrite NQP's Precedence Parser in Raku:
The Precedence Parser is implemented here: https://github.com/Raku/nqp/blob/master/src/HLL/Grammar.nqp#L384
NQP should be a subset of Raku but the Grammar part seems to be specialized.
If I want to rewrite the Precedence Parser in EXPR() in Raku instead,
what would be the Raku Grammar primitives to use?
I.e., what would cursor_start_cur() translate to? Is there a cursor in a Raku Grammar? How can I set pos of a Raku Match object? What would $termcur.MATCH() translate to, etc.?
I am not searching for different ways of writing a Precedence Parser,
but rather want to know whether it can be done in Raku in the same way that NQP does it.
jnthn wrote in IRC:
rule EXPR { <termish> [<infix> <termish>]* }
token termish { <prefix>* <term> <postfix>* }
and then do the precedence sorting in an action method.
There is an example at https://github.com/Apress/perl-6-regexes-and-grammars/blob/master/chapter-13-case-studies/operator-precedence-parser-class.p6 from the book https://www.apress.com/us/book/9781484232279 that implements the same structure.
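To illustrate what "precedence sorting in an action method" could look like once the grammar has produced a flat term/infix/term sequence, here is a minimal sketch of a two-stack reduction. It is written in Java rather than Raku and is not taken from NQP or from the book; the class name, the operator table, and the string-based tree output are all assumptions made for illustration, and every operator is treated as left-associative.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

class PrecedenceSort {
    // Hypothetical operator table: a higher number binds tighter.
    static final Map<String, Integer> PREC = Map.of("+", 1, "-", 1, "*", 2, "/", 2);

    /** Turns a flat [term, infix, term, ...] list into a parenthesized tree string. */
    static String sort(List<String> flat) {
        Deque<String> terms = new ArrayDeque<>();
        Deque<String> ops = new ArrayDeque<>();
        terms.push(flat.get(0));
        for (int i = 1; i + 1 < flat.size(); i += 2) {
            String op = flat.get(i);
            // Reduce while the operator on the stack binds at least as tightly
            // as the incoming one (this makes every operator left-associative).
            while (!ops.isEmpty() && PREC.get(ops.peek()) >= PREC.get(op)) {
                reduce(terms, ops);
            }
            ops.push(op);
            terms.push(flat.get(i + 1));
        }
        while (!ops.isEmpty()) {
            reduce(terms, ops);
        }
        return terms.pop();
    }

    /** Pops one operator and two operands and pushes the combined node. */
    static void reduce(Deque<String> terms, Deque<String> ops) {
        String right = terms.pop();
        String left = terms.pop();
        terms.push("(" + left + " " + ops.pop() + " " + right + ")");
    }

    public static void main(String[] args) {
        // Prints ((1 + (2 * 3)) - 4)
        System.out.println(sort(List.of("1", "+", "2", "*", "3", "-", "4")));
    }
}
```

In a Raku action method the same loop would run over the matched termish and infix captures instead of a list of strings.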

Code indenter using ANTLR 4

I'm writing a code indenter using ANTLR4 and Java. I have successfully generated the lexer and the parser, and the approach I am using is to walk through the generated parse tree.
ParseTreeWalker mywalker = new ParseTreeWalker();
mywalker.walk(myListener, myTree);
The auto-generated *BaseListener has methods like below...
@Override public void enterEveryRule(ParserRuleContext ctx) { }
I'm very new to ANTLR, but as I understand it, I need to extend *BaseListener, override the relevant methods, and write code to indent. So my question is: which methods should I override to indent the input code file? Or if there is an alternative approach I should take, please let me know.
Thanks!
None. You don't need a parser for this task, and requiring one limits you to valid code (you cannot reformat code that has a syntax error). Instead, take the lexer and iterate over all tokens. Keep some state to know where you are (a block, a function, whatever) and indent according to that.
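A minimal sketch of that token-driven approach, assuming a brace-and-semicolon language. To keep it self-contained it takes a plain list of token texts; with ANTLR 4 those would come from the generated lexer instead (for example by calling nextToken() until EOF). The class name, the four-space indent, and the line-break rules are illustrative assumptions, and the spacing between tokens is deliberately naive.

```java
import java.util.List;

class TokenIndenter {
    /** Re-indents a flat token list, tracking brace depth as the only state. */
    static String indent(List<String> tokens) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        boolean lineStart = true;
        for (String tok : tokens) {
            if (tok.equals("}")) {
                depth = Math.max(0, depth - 1); // tolerate unbalanced input
                if (!lineStart) {               // a closing brace gets its own line
                    out.append('\n');
                    lineStart = true;
                }
            }
            if (lineStart) {
                out.append("    ".repeat(depth));
            } else if (!tok.equals(";")) {
                out.append(' ');
            }
            out.append(tok);
            lineStart = false;
            if (tok.equals("{")) {
                depth++;
            }
            // Break the line after block delimiters and statement terminators.
            if (tok.equals("{") || tok.equals("}") || tok.equals(";")) {
                out.append('\n');
                lineStart = true;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(indent(List.of("void", "f", "(", ")", "{", "x", "=", "1", ";", "}")));
    }
}
```

Because nothing here depends on the input parsing successfully, a stray closing brace or a missing one still produces output, which is the advantage over a listener-based design.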

my lexer token action is not invoked

I use ANTLR 4 with the JavaScript target.
Here is a sample grammar:
P : T ;
T : [a-z]+ {console.log(this.text);} ;
start: P ;
When I run the generated parser, nothing is printed, although the input is matched. If I move the action to the token P, then it gets invoked. Why is that?
Actions are ignored in referenced rules. This was the original behavior of ANTLR 4, back when the lexer only supported a single action per token (and that action must appear at the end of the token).
Several releases later the limitation of one-action-per-rule was lifted, allowing any number of actions to be executed for a token. However, we found that many existing users relied on the original behavior, and wrote their grammars assuming that actions in referenced rules were ignored. Many of these grammars used complicated logic in these rules, so changing the behavior would be a severe breaking change that would prevent people from using new versions of ANTLR 4.
Rather than break so many existing ANTLR 4 lexers, we decided to preserve the original behavior and only execute actions that appear in the same rule as the matched token. Newer versions do allow you to place multiple actions in each rule though.
tl;dr: We considered allowing actions in other rules to execute, but decided not to because it would break a lot of grammars already written and used by people.
I found that @init and @after actions will override this default behavior.
Change the example code to:
grammar Test;
ALPHA : [a-z]+;
p : t ;
t
@init {
console.log(this.text);
}
@after {
console.log(this.text);
}
: ALPHA;
start: p ;
I changed the parser rules to lowercase because my Eclipse tool was complaining about the syntax otherwise. I also had to introduce the ALPHA token for [a-z]+ for the same reason. The above text compiled, but I haven't tried running the generated parser. However, I am successfully working around this issue with @init/@after in my larger parser.
Hope this is helpful.

Composite grammars: accessing an imported grammar's scopes in an action

Let's suppose I have two grammars (and that there is a Lexer defined somewhere), ParserA and ParserB.
In ParserA I have the following code:
parser grammar ParserA;
classDeclaration
scope {
ST mList;
}
...
ParserB is something like:
parser grammar ParserB;
import ParserA;
methodDeclaration : something something { $classDeclaration::mList.add(...) };
The code in the action will fail to compile (by javac) since classDeclaration is in a different class (and file). Any tips on how to fix it?
No, there's (AFAIK) no ANTLR shortcut here: there's no communication possible between imported grammars (either by using scopes or by providing parameters to imported grammar rules).

Antlr undefined import in Antlrworks using composite grammars

I'm trying to use a composite grammar with Antlr 3.1 and Antlrworks 1.4.2. When I put the import statement in, it says 'undefined import'. I've tried a number of different combinations of lexer grammar and parser grammar but can't get it to generate the code. Am I missing something obvious? An example is below.
grammar Tokens;
TOKEN : 'token';
grammar Parser;
import Tokens;//gives undefined import error
rule : TOKEN+;
I'm referencing the documentation from
http://www.antlr.org/wiki/display/ANTLR3/Composite+Grammars
Thanks
When separating lexer- and parser grammars, you need to explicitly define what type of grammar it is.
Try:
parser grammar Parser;
import Tokens;//gives undefined import error
rule : TOKEN+;
and:
lexer grammar Tokens;
TOKEN : 'token';
Note that for a combined grammar file Foo.g, the generated lexer and parser get a Lexer and Parser suffix by default: FooLexer.java and FooParser.java respectively. But in "explicit" grammars, the name of the .java file is that of the grammar itself: Parser.java and Tokens.java in your case. You might want to avoid calling a class Parser, since that is the name of ANTLR's base parser class:
http://www.antlr.org/api/Java/classorg_1_1antlr_1_1runtime_1_1_parser.html
Also take care to place the import statement below the options { ... } section, but before any tokens { ... } you may have defined, otherwise you might get strange errors.
Argghh, it was something stupid. Antlrworks will underline the import and highlight all the tokens as undefined syntax errors but still allow you to generate the code if you try!
The reason it wasn't working the first time was that the import was above the options section, which Bart's suggestion about placement addresses.