Skip general rule enter/exit listener method in favor of more specific one? (ANTLR4)

I have generated a grammar in ANTLR4. A sample excerpt is shown below:
list : defunExpr # defun
| lambdaExpr # lambda
| condExpr # cond
...
| items # other
;
The rules are listed in order of priority and are called as appropriate when testing the grammar. All higher priority rules of #defun, #lambda, #cond, etc. would also match items (#other) if they did not match higher up (expected behavior of placing higher-priority rules before lower).
I then implemented a simple listener-based application in Java, which formats the parsed code and prints it back out to the console. I have overridden the appropriate enter/exit methods for #defun, #lambda, #cond, etc. I would like to implement a generalized catch-all for items which do not match the more specific rules. However, when I implement enter/exit methods for #other, they execute for every matched rule further up the priority list as well, effectively outputting formatted code twice for rules such as #defun, #lambda, #cond, etc.
Is there some way to achieve this behavior? I have a handful of specific rules I want to implement, and then have a general case catch the others. The grammar parses properly (test rig shows expected behavior over numerous test cases), but the catch-all method (enterOther) seems to act upon the specific rules as well.
EDIT: Wow, after all this time and posting this question, I now actually believe it is a grammar error. I will leave the question open until I verify, however.

Thanks for the interest, guys. I'm not evaluating anything, just echoing parsed input, so listeners work fine. The grammar was actually fine and non-ambiguous. The catch-all rule (it was a catch-all, despite my not showing enough of the grammar here) worked fine. My problem (embarrassingly) was that while I wanted to write enter/exit #other methods, I was actually writing enter/exit Expr methods the whole time, which was why all the specific rules were triggered as well (since they are Exprs). Embarrassing, but lesson learned. Thanks for the ideas and for taking the time. Cheers!

Related

Throw Custom Exception From ANTLR4 Grammar File

I have a grammar file which parses a specific file type. Now I need a simple thing.
I have a parser rule, and when the input tokens don't satisfy that parser rule I need to throw my own custom exception.
That is, when my input file produces an "extraneous input" error because the parser is expecting something the input file doesn't have, I want to throw an exception.
Is this possible?
If yes, how?
If no, is there any workaround?
I'm a beginner with ANTLR.
grammar Test;
exampleParserRule : [a-z]+ ;
My input file contains 12345. Now I need to throw a custom exception.
For parsing issues such as this, ANTLR will, internally, throw certain exceptions, but they are caught and handled by an ErrorListener. By default, ANTLR will hook up a ConsoleErrorListener that just formats and writes error messages to the console. But it will then continue processing, attempting to recover from the error, and sync back up with your input. This is something you want in a parser. It’s not very useful to have a parser just report the first problem it encounters and then exception out.
You can implement your own ErrorListener (there’s a BaseErrorListener class you can subclass). There you can handle the reported error yourself (the method you override provides a lot of detailed information about the error) and produce whatever message you’d like. (You can also do things like collect all the errors in a list, keep track of error levels, etc.)
In short, you probably don’t want a different exception, you want a different error message.
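To make the "collect all the errors in a list" idea concrete, here is a minimal sketch in Python. It only mirrors the shape of the runtime callback (syntaxError with recognizer, offending symbol, line, column, message, exception); in Java you would override the same method on BaseErrorListener. The names CollectingErrorListener, SyntaxProblem, InputRejected and raise_if_errors are made up for illustration, and the class is deliberately not tied to the ANTLR runtime so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SyntaxProblem:
    line: int
    column: int
    message: str


class CollectingErrorListener:
    """Collects syntax errors instead of printing them.

    Assumption: with the real runtime this would subclass the runtime's
    error listener base class; it is a plain class here so the sketch
    runs without ANTLR installed.
    """

    def __init__(self) -> None:
        self.problems: List[SyntaxProblem] = []

    # Same parameter order as the runtime's syntaxError callback.
    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        self.problems.append(SyntaxProblem(line, column, msg))

    def summary(self) -> str:
        return "\n".join(f"{p.line}:{p.column} {p.message}" for p in self.problems)


class InputRejected(Exception):
    """Hypothetical custom exception, raised once after parsing has finished."""


def raise_if_errors(listener: CollectingErrorListener) -> None:
    # Let the parser recover and report everything first, then fail once.
    if listener.problems:
        raise InputRejected(listener.summary())
```

In a real setup you would presumably call removeErrorListeners() on the parser, register the collecting listener with addErrorListener(...), run the parse, and only then call something like raise_if_errors. That way you still get your own exception type, but with every reported problem in it rather than just the first.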
Sometimes, depending on how difficult it is to sort out your particular situation for a custom message, it’s really better to look for it in a Listener that processes the parse tree that ANTLR hands back. (Note: a very common path beginners take is to try to get everything into the grammar. It’s going to be hard to get really nice error messages if you do this. ANTLR is pretty good with error messages, but, as a generalized tool, it’s just more likely you can produce more meaningful messages.)
Just try to get ANTLR to produce a parse tree that accurately reflects the structure of your input. Then you can walk the ParseTree with a validation Listener of your own code, producing your own messages.
Another “trick” that doesn’t occur to many ANTLR devs (early on), is that the grammar doesn’t HAVE to ONLY include rules for valid input. If there’s some particular input you want to give a more helpful error message for, you can add a rule to match that (invalid) input, and when you encounter that Context in your validation listener, generate an error message specific to that construct.
BTW… [a-z]+ would almost always be a Lexer rule. If you don’t yet understand the difference between lexer and parser rules, or the processing pipeline ANTLR uses to tokenize an input stream, and then parse a token stream, do yourself a favor and get a firm understanding of those basics. ANTLR is going to be very confusing without that basic understanding. It’s pretty simple, but very important to understand.
You can do this in your grammar:
grammar Test;
@header {
package $your.$package;
import $your.$package.$yourExceptionClass;
}
exampleParserRule : [a-z]+ ;
catch [RecognitionException re] {
reportError(re);
recover(input,re);
retval.tree = (CommonTree)adaptor.errorNode(input, retval.start, input.LT(-1), re);
String msg = getErrorMessage(re, this.getTokenNames());
throw new $yourExceptionClass(msg, re);
}
It's up to you whether you really want to call reportError (which logs to the console), recover, etc., but these are the defaults, so it may be good to use them.
Also, you may want to generate a more human-readable error message (use getErrorMessage).
If you do more complex work, follow @mike-cargal's advice.

Is Cmd.map the right way to split an Elm SPA into modules?

I'm building a single page application in Elm and was having difficulty deciding how to split my code in files.
I ended up splitting it using 1 module per page and have Main.elm convert the Html and Cmd emitted by each page using Cmd.map and Html.map.
My issue is that the documentation for both Cmd.map and Html.map says:
This is very rarely useful in well-structured Elm code, so definitely read the section on structure in the guide before reaching for this!
I checked the only 2 large apps I'm aware of:
elm-spa-example uses Cmd.map (https://github.com/rtfeldman/elm-spa-example/blob/cb32acd73c3d346d0064e7923049867d8ce67193/src/Main.elm#L279)
I was not able to figure out how https://github.com/elm/elm-lang.org deals with the issue.
Also, both answers to this Stack Overflow question suggest using Cmd.map without second thoughts.
Is Cmd.map the "right" way to split a single page application into modules?
I think sometimes you just have to do what's right for you. I used the Cmd.map/Sub.map/Html.map approach for an application I wrote that had 3 "pages" - Initializing, Editing and Reporting.
I wanted to make each of these pages its own module as they were relatively complicated, each had a fair number of messages that are only relevant to each page, and it's easier to reason about each page independently in its own context.
The downside is that the compiler won't prevent you from receiving the wrong message for a given page, which leaves you with an error case to handle at runtime. For example, if the application receives an Editing.Save while it is on the Reporting page, what is the correct behavior? In my implementation I just log it to the console and move on; this was good enough for me (and it never happened anyway). Other options I've considered include displaying a nasty error page to indicate that something horrible has happened (a BSOD, if you will), or simply resetting/reinitializing the entire application.
An alternative is to use the effect pattern as described extensively in this discourse post.
The core of this approach is that:
The extended Effect pattern used in this application consists in defining an Effect custom type that can represent all the effects that init and update functions want to produce.
And the main benefits:
All the effects are defined in a single Effect module, which acts as an internal API for the whole application that is guaranteed to list every possible effect.
Effects can be inspected and tested, unlike Cmd values. This allows testing all the application effects, including simulated HTTP requests.
Effects can represent a modification of top level model data, like the Session when logging in, or the current page when a URL change is wanted by a subpage update function.
All the update functions keep a clean and concise Msg -> Model -> ( Model, Effect Msg ) signature.
Because Effect values carry the minimum information required, some parameters like the Browser.Navigation.key are needed only in the effects perform function, which frees the developer from passing them to functions all over the application.
A single NoOp or Ignored String can be used for the whole application.

Using ANTLR4 lexing for Code Completion in Netbeans Platform

I am using ANTLR4 to parse code in my Netbeans Platform application. I have successfully implemented syntax highlighting using ANTLR4 and Netbeans mechanisms.
I have also implemented a simple code completion for two of my tokens. At the moment I am using a simple implementation from a tutorial, which searches for a whitespace and starts the completion process from there. This works, but it forces the user to type a whitespace before starting code completion.
My question: is it possible or even contemplated using ANTLR's lexer to determine which tokens are currently read from the input to determine the correct completion item?
I would appreciate every pointer in the right direction to improve this behaviour.
Not really an answer, but I do not have enough reputation points to post comments.
is it possible or even contemplated using ANTLR's lexer to determine which tokens are currently read from the input to determine the correct completion item?
Have a look here: http://www.antlr3.org/pipermail/antlr-interest/2008-November/031576.html
and here: https://groups.google.com/forum/#!topic/antlr-discussion/DbJ-2qBmNk0
Bear in mind that the first post was written in 2008, and the current ANTLR v4 is very different from the version available at the time, which is why Sam’s opinion on this topic appears to have evolved.
My personal experience: most of what you are asking is probably doable with ANTLR, but you would have to know ANTLR very well. A more straightforward option is to use ANTLR to gather information about the context and use your own heuristics to decide what needs to be shown in this context.
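As a rough illustration of the "gather context, then apply your own heuristics" option, here is a small Python sketch. It does not use ANTLR at all: a regex-based tokenizer stands in for the generated lexer, and the token names (IDENT, ARROW, and so on) and the function completion_context are assumptions made up for the example. The idea is to lex only the text up to the caret and look at the last token, so the user does not have to type a whitespace first.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    kind: str
    text: str
    start: int  # inclusive offset
    stop: int   # exclusive offset


# Hypothetical token set; a real application would reuse its own lexer here.
TOKEN_SPEC = [
    ("IDENT", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("NUMBER", r"\d+"),
    ("ARROW", r"->"),
    ("WS", r"\s+"),
    ("OTHER", r"."),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(text: str) -> List[Token]:
    return [Token(m.lastgroup, m.group(), m.start(), m.end()) for m in MASTER.finditer(text)]


def completion_context(text: str, caret: int) -> Tuple[str, Optional[str]]:
    """Return (partial identifier at the caret, kind of the token before it)."""
    tokens = tokenize(text[:caret])
    if tokens and tokens[-1].kind == "IDENT":
        prefix, before = tokens[-1].text, tokens[:-1]
    else:
        prefix, before = "", tokens
    significant = [t for t in before if t.kind != "WS"]
    previous = significant[-1].kind if significant else None
    return prefix, previous
```

For the input a_struct->a with the caret at the end, this yields the prefix "a" and the previous token kind ARROW, which is enough for a heuristic such as "after an arrow, offer field names starting with the prefix".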
The ANTLRv3 grammar https://sourceware.org/git/?p=frysk.git;a=blob_plain;f=frysk-core/frysk/expr/CExpr.g;hb=HEAD implements context sensitive completion of C expressions (no macros).
For instance, if fed the string:
a_struct->a<tab>
it would just list the fields of "a_struct" starting with "a" (the tab could, technically, be any character or marker).
The technique it used was to:
modify a C grammar to recognize both IDENT and IDENT_TAB tokens
for IDENT_TAB capture the partial expression AST and "TOKEN_TAB" and throw them back to 'main' (there are hacks to help capture the AST)
'main' then performs a type-eval on the partial expression (compute the expression's type not value) and use that to expand TOKEN_TAB
The same technique, while not exactly ideal, can certainly be used in ANTLRv4.

How would you effectively test command line software, with many switches and arguments

A command line utility/software could potentially consist of many different switches and arguments.
Let's say your software is called CLI and let's say CLI has the following features:
The general syntax of CLI is:
CLI <data structures> <operation> <required arguments> [optional arguments]
<data structures> could be 'matrix', 'complex numbers', 'int', 'floating point', 'log'
<operation> could be 'add', 'subtract', 'multiply', 'divide'
I can't think of any required and optional arguments, but let's say your software does support them.
Now you want to test this software, and you wish to test the interface itself, not the logic. Essentially, the interface must return the correct success codes and error codes.
A lot of real-world software still presents a command line interface with several options. I am curious whether there is any formal testing methodology established for this. One idea I had was to construct a grammar (like EBNF) describing the 'language' of the interface, but I have failed to push this idea further. What good is a grammar in this case? How does it enable the generation of many, many combinations?
I am curious to learn more about any theoretical models which could be applied to such a problem, or whether anyone here has actually done such testing with satisfying coverage.
There is a command-line tool as part of a product I maintain, and I have a situation that's very similar to what you describe. What I did was employ a unit testing framework, and encode each combination of arguments as a test method.
The program is implemented in C#/.NET, so I use Microsoft's testing framework that's built into Visual Studio, but the approach would work with any unit testing framework.
Each test invokes a utility function that starts the process, sends in the input, and collects the output. Then, each test is responsible for verifying that the output from the CLI matches what was expected. In some cases, there's a family of test cases that can be performed by a single test method with a for loop in it. The logic needs to run the CLI and check the output for each iteration.
The set of tests I have does not cover every permutation of arguments, but it covers the 80% cases, and I can add new tests if there are ever any defects.
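The same idea can be sketched in Python rather than C#. This is only an illustration: FAKE_CLI is a made-up stand-in for the real executable (a tiny script run through the current interpreter), and run_cli, CASES and check_all are hypothetical names. run_cli plays the role of the utility function that starts the process and collects the output, and the table replaces the for loop over a family of cases.

```python
import subprocess
import sys

# Stand-in for the real executable (hypothetical): exits 0 and prints the
# result for a known operation on ints, exits 2 on any usage error.
FAKE_CLI = r"""
import sys
args = sys.argv[1:]
ops = {"add": lambda a, b: a + b, "subtract": lambda a, b: a - b}
if len(args) != 4 or args[0] != "int" or args[1] not in ops:
    sys.stderr.write("usage error\n")
    sys.exit(2)
print(ops[args[1]](int(args[2]), int(args[3])))
"""


def run_cli(*args):
    """Start the process with the given arguments; return (exit code, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", FAKE_CLI, *args],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.returncode, proc.stdout.strip(), proc.stderr.strip()


# One row per argument combination: (arguments, expected exit code, expected stdout).
CASES = [
    (("int", "add", "2", "3"), 0, "5"),
    (("int", "subtract", "2", "3"), 0, "-1"),
    (("int", "modulo", "2", "3"), 2, ""),   # unknown operation
    (("matrix", "add", "2", "3"), 2, ""),   # unsupported by the stand-in
    (("int", "add", "2"), 2, ""),           # missing required argument
]


def check_all():
    """Run every case and return the ones whose exit code or output differ."""
    failures = []
    for args, want_code, want_out in CASES:
        code, out, _err = run_cli(*args)
        if (code, out) != (want_code, want_out):
            failures.append((args, code, out))
    return failures
```

Adding a newly found defect as a regression test then amounts to adding one more row to the table.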
Using a recursive grammar to generate switches is an interesting idea. If you were to try this, you would first need to write the grammar in such a way that all switches could be used, and then do a random walk of the grammar.
This provides an easy method of randomly walking a grammar and outputting the result.
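For what it's worth, a random walk over a grammar can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the GRAMMAR dictionary is a hypothetical encoding of the CLI syntax from the question, and the operand values and the --verbose / --precision options are invented, since the question does not name any arguments.

```python
import random

# Hypothetical grammar: each nonterminal maps to a list of alternatives,
# and each alternative is a list of symbols. Anything that is not a key
# is treated as a terminal and emitted as-is.
GRAMMAR = {
    "command": [["CLI", "structure", "operation", "args"]],
    "structure": [["matrix"], ["complex"], ["int"], ["float"], ["log"]],
    "operation": [["add"], ["subtract"], ["multiply"], ["divide"]],
    "args": [["operand", "operand"], ["operand", "operand", "option"]],
    "operand": [["1"], ["0"], ["-7"]],
    "option": [["--verbose"], ["--precision", "digits"]],
    "digits": [["2"], ["10"]],
}


def random_walk(symbol, rng):
    """Expand one symbol by picking a random alternative at every nonterminal."""
    if symbol not in GRAMMAR:
        return [symbol]
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(random_walk(part, rng))
    return words


def generate(count, seed=0):
    """Produce `count` random command lines; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [random_walk("command", rng) for _ in range(count)]
```

Each generated list of words can be fed to the process as an argument vector. This grammar is not recursive, so the walk always terminates; a genuinely recursive grammar would need a depth limit or weighted alternatives. The generator also only produces inputs: deciding the expected exit code for each one still needs an oracle of some kind.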

How can you get FxCop rule CA1726 to ignore a preferred term?

FxCop has a rule (CA1726) that checks for preferred terms. This looks for words like "Dont" and tells you to replace them with better words like "Do not". Generally this is fine, however one of the terms it objects to is "Flag". At our firm, the business deals with Flags meaning those cloth things at the end of flagpoles. Suppressing this rule each time is becoming a pain. Does anyone know a way to get this rule to work on everything except "Flag"?
Note: I know I can turn the rule off completely, but I don't want to do that. I just want to turn off part of the rule.
I have answered my own question.
It turns out that the list of preferred terms is listed in the CustomDictionary.xml file that is in the FxCop install directory (C:\Program Files\Microsoft FxCop 1.36\CustomDictionary.xml). There is a section <Dictionary><Words><Deprecated> that contains a number of <Term> elements. Simply removing the ones I don't want has done the trick.