ANTLR TestRig ClassCastException - antlr

Very basic question. I've been trying to follow some examples for ANTLR4 to generate a lexer and parser grammar for the XML language. Both XMLLexer.g4 and XMLParser.g4 are in the same directory.
I'm getting this error when trying to run TestRig:
Exception in thread "main" java.lang.ClassCastException: class XMLParser
at java.base/java.lang.Class.asSubclass(Class.java:3640)
at org.antlr.v4.gui.TestRig.process(TestRig.java:135)
at org.antlr.v4.gui.TestRig.main(TestRig.java:119)
(I didn't get any error when executing org.antlr.v4.Tool or javac)
What's strange is that if I combine the lexer and the parser grammars in a single file, I can successfully run TestRig.
Any hints as to what's going on would be much appreciated.
Thanks!
Jay

Related

Throw Custom Exception From ANTLR4 Grammar File

I have a grammar file that parses a specific file type, and I need one simple thing.
I have a parser rule, and when the input tokens don't satisfy that rule I need to throw my own custom exception.
That is, my input file produces an extraneous-input error because the parser expects something that the input file doesn't have. I want to throw an exception in that scenario.
Is this possible?
If yes, how?
If not, is there a workaround?
I'm a beginner with ANTLR.
grammar Test
exampleParserRule : [a-z]+ ;
My input file contains 12345. In that case I need to throw a custom exception.
For parsing issues such as this, ANTLR will, internally, throw certain exceptions, but they are caught and handled by an ErrorListener. By default, ANTLR will hook up a ConsoleErrorListener that just formats and writes error messages to the console. But it will then continue processing, attempting to recover from the error, and sync back up with your input. This is something you want in a parser. It’s not very useful to have a parser just report the first problem it encounters and then exception out.
You can implement your own ErrorListener (there’s a BaseErrorListener class you can subclass). There you can handle the reported error yourself (the method you override provides a lot of detailed information about the error) and produce whatever message you’d like. (You can also do things like collect all the errors in a list, keep track of error levels, etc.)
In short, you probably don’t want a different exception, you want a different error message.
Sometimes, depending on how difficult it is to sort out your particular situation for a custom message, it’s better to look for it in a Listener that processes the parse tree that ANTLR hands back. (Note: a very common path beginners take is to try to get everything into the grammar. It’s going to be hard to get really nice error messages if you do this. ANTLR is pretty good with error messages, but it is a generalized tool, so you can usually produce more meaningful messages yourself.)
Just try to get ANTLR to produce a parse tree that accurately reflects the structure of your input. Then you can walk the ParseTree with a validation Listener of your own code, producing your own messages.
Another “trick” that doesn’t occur to many ANTLR devs (early on), is that the grammar doesn’t HAVE to ONLY include rules for valid input. If there’s some particular input you want to give a more helpful error message for, you can add a rule to match that (invalid) input, and when you encounter that Context in your validation listener, generate an error message specific to that construct.
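As a rough sketch of this trick, assuming an ANTLR 4 grammar, a rule can carry an extra labeled alternative that deliberately matches input you consider invalid. The rule, token, and label names below are made up for illustration.

```antlr
grammar Test;

exampleParserRule
    : WORD EOF      # validWord
    | NUMBER EOF    # unexpectedNumber   // matches invalid input on purpose
    ;

WORD   : [a-z]+ ;
NUMBER : [0-9]+ ;
WS     : [ \t\r\n]+ -> skip ;
```

With labels like these, ANTLR 4 generates a separate context class per alternative, so a validation listener can report a targeted message when it enters the UnexpectedNumber context instead of relying on the generic syntax error.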
BTW… [a-z]+ would almost always be a Lexer rule. If you don’t yet understand the difference between lexer and parser rules, or the processing pipeline ANTLR uses to tokenize an input stream, and then parse a token stream, do yourself a favor and get a firm understanding of those basics. ANTLR is going to be very confusing without that basic understanding. It’s pretty simple, but very important to understand.
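To make the lexer/parser split concrete, here is a minimal sketch of how the question's grammar might look in ANTLR 4 with the character class moved into a lexer rule. The token names WORD and WS are illustrative, not taken from the question.

```antlr
grammar Test;

// Parser rules start with a lowercase letter and match tokens.
exampleParserRule : WORD EOF ;

// Lexer rules start with an uppercase letter and match characters.
WORD : [a-z]+ ;
WS   : [ \t\r\n]+ -> skip ;
```

With the input 12345, no lexer rule matches the digits, so the lexer reports token recognition errors for them and the parser never sees a WORD token.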
You can do this in your grammar:
grammar Test;
@header {
package $your.$package;
import $your.$package.$yourExceptionClass;
}
exampleParserRule : [a-z]+ ;
catch [RecognitionException re] {
    reportError(re);
    recover(input, re);
    retval.tree = (CommonTree)adaptor.errorNode(input, retval.start, input.LT(-1), re);
    String msg = getErrorMessage(re, this.getTokenNames());
    throw new $yourExceptionClass(msg, re);
}
It's up to you whether you really want to call reportError (which logs to the console), recover, etc., but these are the defaults, so it may be good to keep them.
Also, you may want to generate a more human-readable error message (use getErrorMessage).
If you do more complex work, follow @mike-cargal's advice.
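Names such as retval.tree and adaptor in the snippet above come from ANTLR 3 generated code. For ANTLR 4, a rule-level catch clause might look roughly like the sketch below. MyParseException and its package are hypothetical, and the exception would need to be unchecked (extend RuntimeException), because the generated rule method only declares RecognitionException. The WORD token is also an assumption, standing in for the [a-z]+ of the question.

```antlr
grammar Test;

@header {
// Hypothetical exception class; replace with your own unchecked exception.
import com.example.MyParseException;
}

exampleParserRule : WORD EOF ;
    catch [RecognitionException re] {
        // Still notify the registered error listeners (the console listener by default).
        _errHandler.reportError(this, re);
        // Abort the parse with the custom exception instead of recovering.
        throw new MyParseException("Unexpected input at " + re.getOffendingToken(), re);
    }

WORD : [a-z]+ ;
WS   : [ \t\r\n]+ -> skip ;
```

Throwing here stops the parse at the first error in this rule, which gives up the recovery behavior described in the other answer.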

Parser grammar recognized by ANTLR 4.4 produces lexer syntax errors with ANTLR 4.6 and newer ANTLR versions

I have a scannerless security markings conversion grammar that generates code correctly and runs fine using antlr-4.4-complete.jar. But when I upgrade to antlr4-4.6-complete.jar or newer, code generation fails with "error(50): <.g4 file path>::: syntax error: mismatched character ':' expecting '{'" and other errors.
What changed in ANTLR v4.6 (or possibly v4.5 as I haven't tried that version) that would cause its lexer to err on grammars recognized by v4.4?
Sorry I can't provide a grammar snippet here, but access to the code is restricted.
It turns out that newer versions of ANTLR (v4.5 and beyond) choke when lexing a user-defined rule named channels that contains a semantic predicate. ANTLR v4.4 was perfectly happy to lex, parse, and generate valid Java code for the same grammar. I changed my rule name to channelz, and the grammar now produces code with all ANTLR versions through the 4.9.3 snapshot. Unfortunately, the parser code generated by ANTLR v4.7 and beyond contains numerous other errors that are still to be addressed.
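One likely explanation, offered here as an assumption rather than something the poster confirmed: since ANTLR 4.5, lexer grammars accept a channels { ... } block for declaring channel names, so the tool would expect a '{' right after the word channels, which matches the reported "expecting '{'" message. A sketch of that construct, with illustrative grammar and channel names:

```antlr
lexer grammar ExampleLexer;

// Declares custom channel names (supported since ANTLR 4.5).
channels { COMMENTS_CHANNEL, WHITESPACE_CHANNEL }

LINE_COMMENT : '//' ~[\r\n]* -> channel(COMMENTS_CHANNEL) ;
WS           : [ \t\r\n]+    -> channel(WHITESPACE_CHANNEL) ;
ID           : [a-zA-Z]+ ;
```

If that is the cause, any rule name other than channels (such as the poster's channelz) would avoid the clash.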
You can view the changes by opening the page https://github.com/antlr/antlr4/releases/tag/VERSION, where VERSION is the version number you're interested in.
So for 4.5 that'd be: https://github.com/antlr/antlr4/releases/tag/4.5

Difference between syntax error and parse error

I am wondering whether there is any difference between a parse error and a syntax error. If so, can anyone please tell me what the difference is?
Thanks.
The way I understand it, a parse error happens because of a syntax error. You (the developer) write code that contains a 'syntax error'. When that code is compiled, the compiler tries to parse your code but cannot, which results in a parse error.
If you are dealing with an interpreted language (PHP, ASP, etc.), the parse error happens when the code is run, because it is compiled and run at the same time.

Antlr generates broken C# code, missing variable name in declaration

I have a grammar for a template language.
I created it for ANTLR 3.2 with the CSharp2 target and have it working.
Now I am trying to move to ANTLR 3.4 and the CSharp3 target (I have tried CSharp2 as well), and I get a strange error in the parser, in a synpred function.
Several variable declarations are missing the variable name:
IToken = default(IToken)
Some also have the wrong type:
void = default(void);
should be
AstParserRuleReturnScope<CommonTree, IToken> = default(AstParserRuleReturnScope<CommonTree, IToken>);
Has anyone seen this before, and what could be causing it?
The grammar is the same one that was working before.
Unfortunately I cannot share the grammar, and I have not had time to create a test grammar that causes the same error.
I can of course fix the errors manually, and then the code works, but it's tiresome to have to go through the code after every generation to fix them.
I was able to resolve this problem by using the native .NET version of the ANTLR code generation tool (Antlr3.exe) instead of the Java version. Specifically, antlr-dotnet-tool-3.4.1.9004.7z worked for me, whereas antlr-3.4-complete-no-antlrv2.jar did not.

ANTLR ignoring AST operators

I am using the ^ and ! operators to set the root node and to exclude tokens from the AST, respectively. However, they make no difference to the tree that ANTLRWorks generates, so I am not sure whether my grammar is incorrect or ANTLRWorks just isn't creating a correct tree.
Here is a snippet of my grammar:
expr
: '('! logExpr ')'!;
These parentheses should not be included in the AST.
addExpr
: multExpr ( (PLUS|MINUS)^ multExpr )*;
The PLUS or MINUS should be the root node in the AST.
However, neither of these things is happening the way I expect. It makes no difference whether I remove the operators or put them back. I'm using ANTLRWorks 1.4.3 and ANTLR 3.4.
ANTLRWorks' interpreter shows the parse tree, even though you've put output=AST; in the grammar's options-block, as you already found yourself.
However, ANTLRWorks' debugger does show the AST. To activate the debugger, press Ctrl+D or select Run->Debug from the menu bar.
The debugger is able to visualize the parse tree and AST. Note that it will not handle embedded code other than Java.
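For trying this out in the debugger, a small self-contained ANTLR 3 grammar along the following lines may help. It is only a sketch: the rule and token names other than addExpr, multExpr, PLUS, and MINUS are assumptions, since the question shows just two rules.

```antlr
grammar Expr;

options {
    output = AST;               // build an AST while parsing
    ASTLabelType = CommonTree;
}

prog     : addExpr EOF! ;
addExpr  : multExpr ( (PLUS | MINUS)^ multExpr )* ;   // operator becomes the subtree root
multExpr : atom ( (MULT | DIV)^ atom )* ;
atom     : INT
         | '('! addExpr ')'!                          // parentheses dropped from the AST
         ;

PLUS  : '+' ;
MINUS : '-' ;
MULT  : '*' ;
DIV   : '/' ;
INT   : '0'..'9'+ ;
WS    : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

For the input 1+2*3, the AST shown by the debugger should have + at the root, with 1 and the subtree (* 2 3) as its children, while the interpreter's parse tree would still list every rule and token.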
I found the answer. ANTLRWorks shows the parse tree, not the AST. It ignores all rewrites. This led to my second question which was, how do I visualize the AST if ANTLRWorks doesn't do it for me? The answer for that is found here.