How to represent multiple parents as rewrite rule? - antlr

Say I have the following ANTLR rule:
ROOT: 'r' ('0'..'9')*;
CHILD: 'c' ('0'..'9')*;
expression: ROOT ('.'^ CHILD)*;
For input such as r.c1.c2.c3, ANTLR would make the following tree:
.(.(.(r c1) c2) c3)
How can I represent the parent property of '.' without the ^ operator directly, i.e., in a rewrite rule?
expression: ROOT ('.' CHILD)* -> ?

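The trick is to reference the AST built so far for the expression rule inside the rewrite rule (the $expression part below). This is not a recursive rule call: $expression stands for the tree the rule has produced up to that point.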
expression : (ROOT -> ROOT) ('.' CHILD -> ^('.' $expression CHILD))*;
which is equivalent to:
expression: ROOT ('.'^ CHILD)*;
Yeah, I know, it's not pretty; there is no simple syntax like the one you may have hoped for:
expression: ROOT ('.' CHILD)* -> ^(...);
See: Parr's Definitive ANTLR Reference, chapter 7, paragraph "Referencing Previous Rule ASTs in Rewrite Rules", page 174.
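To see why that rewrite yields a left-nested tree, here is a rough Python model of it (not ANTLR code, just a sketch). The names Node and build_expression are made up for illustration. Each pass through the ('.' CHILD) loop wraps the tree built so far, which is what $expression refers to, in a new '.' node:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical minimal AST node: a token text plus child nodes."""
    text: str
    children: List["Node"] = field(default_factory=list)

    def to_string_tree(self) -> str:
        # Leaves print as their text; inner nodes as text(child child ...).
        if not self.children:
            return self.text
        inner = " ".join(c.to_string_tree() for c in self.children)
        return f"{self.text}({inner})"


def build_expression(root: str, children: List[str]) -> Node:
    """Model of: (ROOT -> ROOT) ('.' CHILD -> ^('.' $expression CHILD))*"""
    tree = Node(root)                      # ROOT -> ROOT
    for child in children:                 # one pass per ('.' CHILD)
        # $expression is the tree built so far; it becomes the first child.
        tree = Node(".", [tree, Node(child)])
    return tree


print(build_expression("r", ["c1", "c2", "c3"]).to_string_tree())
```

With a root r and children c1, c2, c3 this prints the same shape as the tree shown in the question.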


ANTLR: exclude (skip) tokens when building AST tree

Given the following grammar (in ANTLR v3):
test : value0 COMMA_KEYWORD value1 (COMMA_KEYWORD value2)*;
How can we exclude (skip) COMMA_KEYWORD from the AST built by ANTLR (and without using a rewrite rule)?
The alternative to using rewrite rules is to use tree construction operators:
https://theantlrguy.atlassian.net/wiki/spaces/ANTLR3/pages/2687090/Tree+construction
You can use ! operator to omit a token or subtree from AST:
test : value0 COMMA_KEYWORD! value1 (COMMA_KEYWORD! value2)*;

Explanation for the following grammar written in ANTLR 4

I have a sample grammar written in ANTLR 4
query : select from ';' !? EOF!
I understand how this part works:
query : select from ';'
What does !? EOF! mean in the grammar, and how does it work?
The exclamation mark is used in ANTLR v3 grammars to denote that a certain node should be omitted from the generated AST. Since ANTLR v4 does not build ASTs, this construct is no longer used.
In both v3 and v4, the ? denotes that the preceding element (a token, rule reference, or group) is optional, and EOF is the built-in end-of-file token.
To summarize, ';'!? means: optionally match a ';' and exclude it from the AST. And EOF! means: match the end-of-file and exclude this token from the AST.
So, the v3 parser rule:
query : select from ';'!? EOF!
should look like this in a v4 grammar:
query : select from ';'? EOF

caret prefix instead of postfix in antlr

I know what the caret postfix means in ANTLR (i.e., make root), but what about when the caret is a prefix, as in the following grammar I have been reading? (The grammar is brand new and was written by a team that is still learning ANTLR.)
selectClause
: SELECT resultList -> ^(SELECT_CLAUSE resultList)
;
fromClause
: FROM tableList -> ^(FROM_CLAUSE tableList)
;
Also, I know what => means but what about the -> ? What does -> imply?
thanks,
Dean
The ^ is used as an inline tree operator, indicating a certain token should become the root of the tree.
For example, the rule:
p : A B^ C;
creates the following AST:
  B
 / \
A   C
There's another way to create an AST which is using a rewrite rule. A rewrite rule is placed after (or at the right of) an alternative of a parser rule. You start a rewrite rule with an "arrow", ->, followed by the rules/tokens you want to be in the AST.
Take the previous rule:
p : A B C;
and suppose you want to reverse the tokens but keep the AST "flat" (no root node). This can be done using the following rewrite rule:
p : A B C -> C B A;
And if you want to create an AST similar to p : A B^ C;, you start your rewrite rule with ^( ... ), where the first token/rule inside the parentheses becomes the root node. So the rule:
p : A B C -> ^(B A C);
produces the same AST as p : A B^ C;.
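One way to think about a rewrite rule is as a template that is filled in with whatever the left-hand side matched, regardless of the order it was matched in. Below is a small Python sketch of that idea (not ANTLR code). The function apply_rewrite and its template encoding, a list for a flat sequence and a ("^", root, children...) tuple for ^( ... ), are invented for this illustration:

```python
def apply_rewrite(template, matched):
    """Fill a rewrite template with matched tokens.

    template is a token name (str), a list (flat sequence), or a tuple
    ("^", root, child, ...) modelling ^(root child ...).
    """
    if isinstance(template, str):
        return matched[template]
    if isinstance(template, tuple) and template[0] == "^":
        root, *kids = template[1:]
        return {
            "root": apply_rewrite(root, matched),
            "children": [apply_rewrite(k, matched) for k in kids],
        }
    # Anything else is treated as a flat sequence of sub-templates.
    return [apply_rewrite(t, matched) for t in template]


# Tokens matched by the left-hand side  p : A B C
matched = {"A": "a", "B": "b", "C": "c"}

flat = apply_rewrite(["C", "B", "A"], matched)          # -> C B A
rooted = apply_rewrite(("^", "B", "A", "C"), matched)   # -> ^(B A C)

print(flat)
print(rooted)
```

The flat result is just the tokens in the new order, while the rooted result has the B token as root with the A and C tokens as its children.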
Related:
Tree construction
How to output the AST built using ANTLR?

Match lowercase with ANTLR

I use ANTLRWorks for a simple grammar:
grammar boolean;
// [...]
lowercase_string
: ('a'..'z')+ ;
However, lowercase_string doesn't match foobar according to the Interpreter (MismatchedSetException(10!={})). Ideas?
You can't use the .. operator inside parser rules like that. To match the range 'a' to 'z', create a lexer rule for it (lexer rule names start with a capital letter).
Try it like this:
lowercase_string
: Lower+
;
Lower
: 'a'..'z'
;
or:
lowercase_string
: Lower
;
Lower
: ('a'..'z')+
;
Also see this previous Q&A: Practical difference between parser rules and lexer rules in ANTLR?
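The split matters because the lexer turns characters into tokens first, and the parser only ever sees tokens. Here is a rough Python sketch of the two stages for the first variant above (not ANTLR code). The functions tokenize and lowercase_string are hypothetical stand-ins for the generated lexer and parser:

```python
import re

LOWER = re.compile(r"[a-z]")   # models the lexer rule  Lower : 'a'..'z' ;


def tokenize(text):
    """Lexer stage: turn each character into a Lower token, or fail."""
    tokens = []
    for ch in text:
        if not LOWER.fullmatch(ch):
            raise ValueError(f"no lexer rule matches {ch!r}")
        tokens.append(("Lower", ch))
    return tokens


def lowercase_string(tokens):
    """Parser stage: models the parser rule  lowercase_string : Lower+ ;"""
    if not tokens or any(kind != "Lower" for kind, _ in tokens):
        raise ValueError("expected one or more Lower tokens")
    return "".join(text for _, text in tokens)


print(lowercase_string(tokenize("foobar")))
```

The parser function never looks at characters, only at token kinds, which is why a character range belongs in the lexer rule.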

What does ^ and ! stand for in ANTLR grammar

I was having difficulty figuring out what ^ and ! stand for in ANTLR grammar terminology.
Have a look at the ANTLR Cheat Sheet:
! don't include in AST
^ make AST root node
And ^ can also be used in rewrite rules: ... -> ^( ... ). For example, the following two parser rules are equivalent:
expression
: A '+'^ A ';'!
;
and:
expression
: A '+' A ';' -> ^('+' A A)
;
Both create the following AST:
  +
 / \
A   A
In other words: the + becomes the root, the two A's become its children, and the ; is omitted from the tree.
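For the inline operators, a simplified Python model (not ANTLR code, and only my approximation of what the v3 tree adaptor does) may help. The rule starts with an empty "nil" root. A plain token is added as a child of the current root, a token marked ^ becomes the new root and adopts what was built so far, and a token marked ! is skipped. Node and build are made-up names:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical AST node; text=None marks the nil root of a flat list."""
    text: Optional[str]
    children: List["Node"] = field(default_factory=list)

    def to_string_tree(self) -> str:
        kids = " ".join(c.to_string_tree() for c in self.children)
        if self.text is None:              # nil root: just the flat list
            return kids
        return f"{self.text}({kids})" if self.children else self.text


def build(elements: List[Tuple[str, str]]) -> Node:
    """elements are (token_text, op) pairs, where op is "", "^" or "!"."""
    root = Node(None)
    for text, op in elements:
        if op == "!":                      # ! : leave the token out of the AST
            continue
        node = Node(text)
        if op == "^":                      # ^ : this token becomes the root
            # A nil root hands over its children; a real root becomes a child.
            node.children = root.children if root.text is None else [root]
            root = node
        else:                              # plain token: child of current root
            root.children.append(node)
    return root


# expression : A '+'^ A ';'! ;
print(build([("A", ""), ("+", "^"), ("A", ""), (";", "!")]).to_string_tree())
```

Feeding it r, '.'^, c1, '.'^, c2 nests to the left in the same way, because each new '.' root adopts the previous '.' tree as its first child.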