Parser grammar recognized by ANTLR 4.4 produces lexer syntax errors with ANTLR 4.6 and newer ANTLR versions - syntax-error

I have a scannerless security markings conversion grammar that generates code correctly and runs fine using antlr-4.4-complete.jar. But when I upgrade to antlr4-4.6-complete.jar or newer, code generation fails with "error(50): <.g4 file path>::: syntax error: mismatched character ':' expecting '{'" and other errors.
What changed in ANTLR v4.6 (or possibly v4.5 as I haven't tried that version) that would cause its lexer to err on grammars recognized by v4.4?
Sorry, I can't provide a grammar snippet here; access to the code is restricted.

It turns out that newer versions of ANTLR (v4.5 and later) choke when lexing a user-defined rule named channels containing a semantic predicate. ANTLR v4.4 was perfectly happy to lex, parse, and generate valid Java code for the same grammar. I changed my rule name to channelz, and the grammar now produces code with all ANTLR versions through the 4.9.3 snapshot. Unfortunately, the parser code generated by ANTLR v4.7 and later contains numerous other errors which are still to be addressed.
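A likely explanation is that ANTLR 4.5 added a channels { ... } block for lexer grammars, so the tool's own lexer now expects '{' after that word, which matches the "expecting '{'" message in the question. The sketch below is a hypothetical pre-upgrade check, not part of ANTLR: it scans grammar text for rule definitions whose name is one of the block-introducing words. The class name, the keyword set, and the regex heuristic for "rule start" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: flags rule names that may collide with block keywords. */
final class BlockKeywordRuleCheck {
    // Assumption: words that introduce a "{ ... }" block in ANTLR 4.5+ grammars.
    static final Set<String> BLOCK_KEYWORDS = Set.of("options", "tokens", "channels");

    // Heuristic: a rule definition is an identifier at the start of a line,
    // followed by optional whitespace and a ':'.
    static final Pattern RULE_START =
            Pattern.compile("(?m)^\\s*([A-Za-z_][A-Za-z0-9_]*)\\s*:");

    /** Returns every rule name in the grammar text that is also a block keyword. */
    static List<String> suspiciousRuleNames(String grammarText) {
        List<String> hits = new ArrayList<>();
        Matcher m = RULE_START.matcher(grammarText);
        while (m.find()) {
            if (BLOCK_KEYWORDS.contains(m.group(1))) {
                hits.add(m.group(1));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up grammar text; the real grammar in the question is not available.
        String grammar = "grammar Demo;\nchannels : item+ ;\nitem : ID ;\n";
        System.out.println(suspiciousRuleNames(grammar)); // prints [channels]
    }
}
```

Renaming any reported rule (as with channelz above) is the workaround described in this answer; the check itself only approximates rule boundaries and can miss or over-report in unusual layouts.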

You can view the changes by opening the page https://github.com/antlr/antlr4/releases/tag/VERSION, where VERSION is the version number you're interested in.
So for 4.5 that'd be: https://github.com/antlr/antlr4/releases/tag/4.5

Related

What do LFNO and LF stand for?

I am parsing Java with ANTLR, using the ready-made Java9 grammar from the antlr-grammars repo.
In the ANTLR contexts I often see methods with an "lfno" or "lf" suffix in the name, for instance:
classInstanceCreationExpression_lfno_primary
arrayAccess_lfno_primary
or even PrimaryNoNewArray_lfno_primary_lf_arrayAccess_lfno_primaryContext
I wonder what that means, because I couldn't find any information on it.
Any ideas would be appreciated.

ANTLR v4 plugin for Intellij IDEA: have to restart IJ after changing lexer grammar

I am trying to use the ANTLR plugin for IJ, but there is an annoying problem. I don't know what I'm doing wrong, but after changing something in the lexer grammar I have to (often, but not always) restart IJ, in addition to regenerating the ANTLR recognizer, to see the correct parse tree. I have already tried "Save all" and "Synchronize" before testing the parser, but nothing helps. Has anyone encountered this problem?
Thank you in advance.
As glytching suggested in their comment, this is the problem described in this GitHub issue: https://github.com/antlr/intellij-plugin-v4/issues/242
The solution seems to be to hit Save. Another user also mentions touching the file from the terminal.
This puzzled me, as I'm using PyCharm and the way it seems to be set up by default is to auto-save as you work so I basically never interact with Save explicitly in any way. However, in this case hitting Ctrl+S does seem to make a difference (compared with just letting it auto-save) and it solves the issue for me.
For clarity, my situation is:
I have a grammar broken up into several parts (mixture of lexer, parser and combined grammars) which are imported into the 'main' grammar.
I am working interactively with the ANTLR Preview window (OP mentioned generating a recognizer, but I think this issue is completely independent of running the Antlr generator).
If I make a change in one of the imported grammars and switch back to the main grammar to re-run my start rule there, it doesn't always pick up the change from the imported grammar.
Hitting Ctrl+S after making the change in the imported grammar, before switching back to my main grammar, fixes the problem.

Using ANTLR4 lexing for Code Completion in Netbeans Platform

I am using ANTLR4 to parse code in my Netbeans Platform application. I have successfully implemented syntax highlighting using ANTLR4 and Netbeans mechanisms.
I have also implemented a simple code completion for two of my tokens. At the moment I am using a simple implementation from a tutorial, which searches for a whitespace and starts the completion process from there. This works, but it requires the user to type a whitespace before starting code completion.
My question: is it possible or even contemplated using ANTLR's lexer to determine which tokens are currently read from the input to determine the correct completion item?
I would appreciate every pointer in the right direction to improve this behaviour.
Not really an answer, but I do not have enough reputation points to post comments.
is it possible or even contemplated using ANTLR's lexer to determine which tokens are currently read from the input to determine the correct completion item?
Have a look here: http://www.antlr3.org/pipermail/antlr-interest/2008-November/031576.html
and here: https://groups.google.com/forum/#!topic/antlr-discussion/DbJ-2qBmNk0
Bear in mind that the first post was written in 2008 and the current ANTLR v4 is very different from the version available at the time, which is why Sam’s opinion on this topic appears to have evolved.
My personal experience - most of what you are asking is probably doable with antlr, but you would have to know antlr very well. A more straightforward option is to use antlr to gather information about the context and use your own heuristics to decide what needs to be shown in this context.
The ANTLRv3 grammar https://sourceware.org/git/?p=frysk.git;a=blob_plain;f=frysk-core/frysk/expr/CExpr.g;hb=HEAD implements context sensitive completion of C expressions (no macros).
For instance, if fed the string:
a_struct->a<tab>
it would list just the fields of "a_struct" starting with "a" (the tab could, technically, be any character or marker).
The technique it used was to:
modify a C grammar to recognize both IDENT and IDENT_TAB tokens
for IDENT_TAB, capture the partial expression AST and "TOKEN_TAB" and throw them back to 'main' (there are hacks to help capture the AST)
'main' then performs a type-eval on the partial expression (computing the expression's type, not its value) and uses that to expand TOKEN_TAB
The same technique, while not exactly ideal, can certainly be used in ANTLRv4.
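The three steps can be illustrated without ANTLR at all. The following is a minimal sketch, not the frysk implementation: a hand-written scan stands in for the IDENT_TAB token, and a lookup table stands in for the type evaluation of the partial expression. The class name, the method name, and the field table are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of marker-based completion for "struct->prefix<tab>". */
final class TabCompletionSketch {
    // Marker standing in for the <tab> keypress; any otherwise-unused character would do.
    static final char TAB = '\t';

    static boolean isIdentChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    /** Returns the fields of the struct left of "->" that start with the typed prefix. */
    static List<String> complete(String input, Map<String, List<String>> fieldsByStruct) {
        int tab = input.indexOf(TAB);
        if (tab < 0) {
            return List.of(); // no marker, so nothing to complete
        }
        // Step 1: find the partial identifier that ends at the marker (the "IDENT_TAB").
        int start = tab;
        while (start > 0 && isIdentChar(input.charAt(start - 1))) {
            start--;
        }
        String prefix = input.substring(start, tab);

        // Step 2: capture the partial expression to the left of the member access.
        String before = input.substring(0, start).trim();
        if (!before.endsWith("->")) {
            return List.of();
        }
        String owner = before.substring(0, before.length() - 2).trim();

        // Step 3: "type-eval" the owner; here a table lookup replaces real type evaluation.
        List<String> matches = new ArrayList<>();
        for (String field : fieldsByStruct.getOrDefault(owner, List.of())) {
            if (field.startsWith(prefix)) {
                matches.add(field);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up field table for the a_struct example.
        Map<String, List<String>> fields = Map.of("a_struct", List.of("alpha", "age", "beta"));
        System.out.println(complete("a_struct->a" + TAB, fields)); // prints [alpha, age]
    }
}
```

In a grammar-based version, steps 1 and 2 would be done by the lexer and parser rather than by string scanning, and only step 3 would remain application code.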

Antlr generates broken C# code, missing variable name in declaration

I have a grammar for a template language.
I created it for ANTLR 3.2 and the CSharp2 target and have it working.
Now I am trying to move to ANTLR 3.4 and the CSharp3 target (I have tried CSharp2 as well), and I get a strange error in the parser in a synpred function.
Several variable declarations are missing the variable name:
IToken = default(IToken)
Some also have the wrong type:
void = default(void);
should be
AstParserRuleReturnScope<CommonTree, IToken> = default(AstParserRuleReturnScope<CommonTree, IToken>);
Has anyone seen this before, and what could be causing it?
The grammar is the same that was working before.
Unfortunately I cannot share the grammar and I have not had time to create a test grammar that causes the same error.
I can of course fix the errors manually and the code works but it's a bit tiresome to have to go through the code after generation fixing them.
I was able to resolve this problem by using the native .NET version of the ANTLR code generation tool (Antlr3.exe) instead of the Java version. Specifically, antlr-dotnet-tool-3.4.1.9004.7z worked for me, whereas antlr-3.4-complete-no-antlrv2.jar did not.

ANTLR ignoring AST operators

I am using the ^ and ! operators to set the root node and to exclude nodes from the AST, respectively. However, they make no difference in the tree that is generated by ANTLRWorks. So I am not sure if my grammar is incorrect or if ANTLRWorks just isn't creating a correct tree.
Here is a snippet of my grammar:
expr
: '('! logExpr ')'!;
These parentheses should not be included in the AST.
addExpr
: multExpr ( (PLUS|MINUS)^ multExpr )*;
The PLUS or MINUS should be the root node in the AST.
However, neither of these things is happening the way I expect. It makes no difference whether I remove the operators or put them back. I am using ANTLRWorks 1.4.3 and ANTLR 3.4.
ANTLRWorks' interpreter shows the parse tree, even though you've put output=AST; in the grammar's options-block, as you already found yourself.
However, ANTLRWorks' debugger does show the AST. To activate the debugger, press Ctrl+D or select Run->Debug from the menu bar.
The debugger is able to visualize the parse tree and AST. Note that it will not handle embedded code other than Java.
I found the answer. ANTLRWorks shows the parse tree, not the AST. It ignores all rewrites. This led to my second question which was, how do I visualize the AST if ANTLRWorks doesn't do it for me? The answer for that is found here.
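To make the difference between the two trees concrete, here is a small sketch of the AST shape that the ^ operator in addExpr is meant to produce. It does not use ANTLR; it only mimics the rule multExpr ( (PLUS|MINUS)^ multExpr )* on an already-tokenized input, and the class and method names are hypothetical.

```java
import java.util.List;

/** Hypothetical sketch of the tree shape produced by: operand ( op^ operand )* */
final class RootOperatorSketch {
    /** Builds a LISP-style tree string from alternating operands and operators. */
    static String buildTree(List<String> tokens) {
        String tree = tokens.get(0);
        for (int i = 1; i + 1 < tokens.size(); i += 2) {
            // op^ makes the operator the new root; the tree built so far
            // becomes its first child and the next operand its second child.
            tree = "(" + tokens.get(i) + " " + tree + " " + tokens.get(i + 1) + ")";
        }
        return tree;
    }

    public static void main(String[] args) {
        System.out.println(buildTree(List.of("1", "+", "2", "-", "3"))); // prints (- (+ 1 2) 3)
    }
}
```

A parse tree for the same input would instead have addExpr as the root with all five tokens hanging beneath it, which is what the ANTLRWorks interpreter displays regardless of the ^ and ! operators.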