Antlr4 No viable alternative at input symbols - intellij-idea

I'm implementing a simple program walker grammar and I get this common error in multiple lines. I think it is caused by same reason, but I'm new to antlr so I couldn't figure it out.
For example, in this following code snippet:
program
: (declaration)*
(statement)*
EOF!
;
I got error:
No viable alternative at input '!'
after EOF, and I got a similar error with:
declaration
: INT VARNUM '=' expression ';'
-> ^(DECL VARNUM expression)
;
I got the error:
No viable alternative at input '->'
After reading other questions, I know that matching one token with multiple definitions can cause this problem. But I haven't test it with any input yet, I got this error in intelliJ. How can I fix my problem?

This is ANTLR v3 syntax, you're trying to compile it with ANTLR v4, which won't work.
Either downgrade to ANTLR v3, or use v4 syntax. The difference comes from the fact that v4 doesn't support automatic AST generation, and you're trying to use AST construction operators, which were removed.
The first snippet only requires you to remove the !. Parentheses aren't necessary.
program
: declaration*
statement*
EOF
;
As for the second one, remove everything after the ->:
declaration
: INT VARNUM '=' expression ';'
;
If you need to build an AST with v4, see my answer here.

Related

Upgrading Grammar file to Antlr4

I am upgrading my Antlr grammar file to latest Antlr4.
I have converted most of the file but stuck in syntax difference that I can't figure out. The 3 such difference is:
equationset: equation* EOF!;
equation: variable ASSIGN expression -> ^(EQUATION variable expression)
;
orExpression
: andExpression ( OR^ andExpression )*
;
In first one, the error is due to !. I am not sure whether EOF and EOF! is same or not. Removing ! resolves the error, but I want to be sure that is the correct fix.
In 2nd rule, -> and ^ is giving error. I am not sure what is Antlr4 equivalent.
In 3rd rule, ^ is giving error. Removing it fixes the error, but I can't find any migration guide that explains what should be equivalent for this.
Can you please give me the Antrl4 equivalent of these 3 rules and give some brief explanation what is the difference? If you can refer to any other resource where I can find the answer is OK as well.
Thanks in advance.
Many of the ANTLR3 grammars contain syntax tree manipulations which are no longer supported with ANTLR4 (now we get a parse tree instead of a syntax tree). What you see here is exactly that.
EOF! means EOF should be matched but not appear in the AST. Since there is no AST anymore you cannot change that, so remove the exclamation mark.
The construct -> ^(EQUATION variable expression) rewrites the AST created by the equation rule. Since there is no AST anymore you cannot change that, so remove that part.
OR^ finally determines that the OR operator should become the root of the generated AST. Since there is no AST anymore ..., you got the point now :-)

Antlr: Unintended behavior

Why this simple grammar
grammar Test;
expr
: Int | expr '+' expr;
Int
: [0-9]+;
doesn't match the input 1+1 ? It says "No method for rule expr or it has arguments" but in my opition it should be matched.
It looks like I haven't used ANTLR for a while... ANTLRv3 did not support left-recursive rules, but ANTLRv4 does support immediate left recursion. It also supports the regex-like character class syntax you used in your post. I tested this version and it works in ANTLRWorks2 (running on ANTLR4):
grammar Test;
start : expr
;
expr : expr '+' expr
| INT
;
INT : [0-9]+
;
If you add the start rule then ANTLR is able to infer that EOF goes at the end of that rule. It doesn't seem to be able to infer EOF for more complex rules like expr and expr2 since they're recursive...
There are a lot of comments below, so here is (co-author of ANTLR4) Sam Harwell's response (emphasis added):
You still want to include an explicit EOF in the start rule. The problem the OP faced with using expr directly is ANTLR 4 internally rewrote it to be expr[int _p] (it does so for all left recursive rules), and the included TestRig is not able to directly execute rules with parameters. Adding a start rule resolves the problem because TestRig is able to execute that rule. :)
I've posted a follow-up question with regard to EOF: When is EOF needed in ANTLR 4?
If your command looks like this:
grun MYGRAMMAR xxx -tokens
And this exception is thrown:
No method for rule xxx or it has arguments
Then this exception will get thrown with the rule you specified in the command above. It means the rule probably doesn't exist.
System.err.println("No method for rule "+startRuleName+" or it has arguments");
So startRuleName here, should print xxx if it's not the first (start) rule in the grammar. Put xxx as the first rule in your grammar to prevent this.

ANTLR error(99) grammar has no rules

I previously posted about my first attempt at using ANTLR when I was having issues with left recursion.
Now that I have resolved those issues, I am getting the following error when I try to use org.antlr.v4.Tool to generate the code:
error(99): C:test.g4::: grammar 'test' has no rules
What are the possible reasons for this error? Using ANTLRWorks I can certainly see rules in the Parse Tree so why can't it see them? Is it because it cannot find a suitable START rule?
I think Antlr expects the first rule name to be in small case. I was getting the same error with my grammar
grammar ee;
Condition : LogicalExpression ;
LogicalExpression : BooleanLiteral ;
BooleanLiteral : True ;
True : 'true' ;
By changing the first production rule in the grammar to lower case it solved the issue i.e. the below grammar worked for me.
grammar ee;
condition : LogicalExpression ;
LogicalExpression : BooleanLiteral ;
BooleanLiteral : True ;
True : 'true' ;
Note: It is my personal interpretation, I couldn't find this reasoning in the online documentation.
Edit: The production rules should begin with lower case letters as specified in the latest docs [1]
[1] https://github.com/antlr/antlr4/blob/master/doc/lexicon.md#identifiers
I'm not sure if you've found the solution for this, but I had the same problem and fixed it by changing my start symbol to 'prog'. So for example, the first two lines of your .g4 file would be:
grammar test;
prog : <...> ;
Where <...> will be your first derivation.
I just got that error as well (antlworks 2.1).
switching from
RULE : THIS | THAT ; to rule : this | that ; for parser rules (i.e. from uppercase to lowercase) solved the problem!
EDIT
The above correction holds only for RULE , what follows after the : can be any combination of lexer/parser rules
The most likely cause is just what the error message suggests. And the most likely reason for that is that you have not saved your grammar to the file--or if you're using ANTLRWorks2--ANTLRWorks hasn't saved your work to the file. I have no idea why ANTLRWorks doesn't save reliably.
I also got the same error but could not fix it.
I downloaded antlrworks-1.4.jar and it's working perfectly.
Download >> antlrworks-1.4.jar
Changing the first rule to start with a lower case character worked for me.
I had the same problem, and this means that your grammar has no Syntactic rules. So in order to avoid this error, you need to write at least one Syntactic rule.

Mismatched double token

In ANTLR, I have a MismatchedTokenException with the following definition:
type : IDENTIFIER ('<' (type (',' type)*) '>')?;
And the following test:
A<B,C<D>>
The exception occurs when parsing the first >. ANTLR tries parsing both '>>' at once, and fails.
With a silent whitespace channel, the following test does work:
A<B,C<D> >
In which ANTLR is clearly instructed to treat each token separately.
How can I fix that?
I could not reproduce that. The parser generated by:
grammar T;
type : IDENTIFIER ('<' (type (',' type)*) '>')?;
IDENTIFIER : 'A'..'Z';
parses the input A<B,C<D>> (without spaces) into the following parse tree:
You'll need to provide the grammar that causes this input to produce a MismatchedTokenException.
Perhaps you're using ANTLRWorks' interpreter (or Eclipse's ANTLR-IDE, which uses the same interpreter)? In that case, that is probably the problem: it's notoriously buggy. Don't use it, but use ANTLRWorks' debugger: it's great (the image posted above comes from the debugger).
Lazlo Bonin wrote:
Got it. I had a << token defined. Quickly, is there a way to priorize token recognition over another?
No, the lexer simply tries to match as much as possible. So if it can create a token matching << (or >>), it will do so in favor of two single < (or >) tokens. Only when two (or more) lexer rules match the same amount of characters, a prioritization is made: the rule defined first will then "win" over the one(s) defined later in the grammar.

Yacc "rule useless due to conflicts"

i need some help with yacc.
i'm working on a infix/postfix translator, the infix to postfix part was really easy but i'm having some issue with the postfix to infix translation.
here's an example on what i was going to do (just to translate an easy ab+c- or an abc+-)
exp: num {printf("+ ");} exp '+'
| num {printf("- ");} exp '-'
| exp {printf("+ ");} num '+'
| exp {printf("- ");} num '-'
|/* empty*/
;
num: number {printf("%d ", $1);}
;
obiously it doesn't work because i'm asking an action (with the printfs) before the actual body so, while compiling, I get many
warning: rule useless in parser due to conflict
the problem is that the printfs are exactly where I need them (or my output wont be an infix expression). is there a way to keep the print actions right there and let yacc identify which one it needs to use?
Basically, no there isn't. The problem is that to resolve what you've got, yacc would have to have an unbounded amount of lookahead. This is… problematic given that yacc is a fairly simple-minded tool, so instead it takes a (bad) guess and throws out some of your rules with a warning. You need to change your grammar so yacc can decide what to do with a token with only a very small amount of lookahead (a single token IIRC). The usual way to do this is to attach the interpretations of the values to the tokens and either use a post-action or, more practically, build a tree which you traverse as a separate step (doing print out of an infix expression from its syntax tree is trivial).
Note that when you've got warnings coming out of yacc, that typically means that your grammar is wrong and that the resulting parser will do very unexpected things. Refine it until you get no warnings from that stage at all. That is, treat grammar warnings as errors; anything else and you'll be sorry.