How to create suggestion messages with ANTLR? - antlr

I want to create an interactive version of the ANTLR calculator example, which tells the user what to type next. For instance, in the beginning, the ID, INT, NEWLINE, and WS tokens are possible. Ignoring WS, a suggestion message could be:
Type an identifier, a number, or newline.
After parsing a number, the message should be
Type +, -, *, or newline.
and so on. How to do this?
Edit
What I have tried so far:
private void accept(String sentence) {
ANTLRInputStream is = new ANTLRInputStream(sentence);
OperationLexer l = new OperationLexer(is);
CommonTokenStream cts = new CommonTokenStream(l);
final OperationParser parser = new OperationParser(cts);
parser.addParseListener(new OperationBaseListener() {
#Override
public void enterEveryRule(ParserRuleContext ctx) {
ATNState state = parser.getATN().states.get(parser.getState());
System.out.print("RULE " + parser.ruleNames[state.ruleIndex] + " ");
IntervalSet following = parser.getATN().nextTokens(state, ctx);
for (Integer token : following.toList()) {
System.out.print(parser.tokenNames[token] + " ");
}
System.out.println();
}
});
parser.prog();
}
prints the right suggestion for the first token, but for all other tokens, it print the current token. I guess capturing the state at enterEveryRule() is too early.

Accurately gathering this information in an LL(k) parser, where k>1, requires a thorough understanding of the parser internals. Several years ago, I faced this problem with ANTLR 3, and found the only real solution was so complex that it resulted in me becoming a co-author of ANTLR 4 specifically so I could handle this issue.
ANTLR (including ANTLR 4) disambiguates the parse tree during the parsing phase, which means if your grammar is not LL(1) then performing this analysis in the parse tree means you have already lost information necessary to be accurate. You'll need to write your own version of ParserATNSimulator (or a custom interpreter which wraps it) which does not lose the information.

Related

Is there a way to edit nodes on an Antlr ParseTree?

I am recursively traversing an antlr parse tree and I want to edit the text of TerminalNodes in the tree. I want to be able to do this for any ParseTree and I don't want to write a specific Visitor for each ParseTree I may encounter.
I have looked through The Definitive ANTLR4 Reference and seen that antlr doesn't have any direct support for tree rewriting. I am looking for any possible workarounds or alternative solutions.
private void editTree(ParseTree tree){
for(int i = 0; i < tree.getChildCount();i++){
ParseTree child = tree.getChild(i);
if(child instanceof TerminalNode){
//Edit child's text
} else {
editTree(child);
}
}
}
TerminalNode has a member getSymbol(), which returns the lexed token. This is usually a CommonToken instance, which allows to set the text and other properties like line number, type etc. ParseTree.getText() does nothing else but asking the symbol to provide the text (which in turn is what you can set or what comes from the input stream).

After Antlr4 recognize error, how to ask the application to automatically fix it?

We know Antlr4 is using the sync-and-return recovery mechanism. For example, I have the following simple grammar:
grammar Hello;
r : prefix body ;
prefix: 'hello' ':';
body: INT ID ;
INT: [0-9]+ ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
I use the following listener to grab the input:
public class HelloLoader extends HelloBaseListener {
String input;
public void exitR(HelloParser.RContext ctx) {
input = ctx.getText();
}
}
The main method in my HelloRunner looks like this:
public static void main(String[] args) throws IOException {
CharStream input = CharStreams.fromStream(System.in);
HelloLexer lexer = new HelloLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
HelloParser parser = new HelloParser(tokens);
ParseTree tree = parser.r();
ParseTreeWalker walker = new ParseTreeWalker();
HelloLoader loader = new HelloLoader();
walker.walk(loader, tree);
System.out.println(loader.input);
}
Now if I enter a correct input "hello : 1 morning", I will get hello:1morning, as expected.
What if an incorrect input "hello ; 1 morning"? I will get the following output:
line 1:6 token recognition error at: ';'
line 1:8 missing ':' at '1'
hello<missing ':'>1morning
It seems that Antlr4 automatically recognized a wrong token ";" and delete it; however, it will not smartly add ":" in the corresponding place, but just claim <missing ':'>.
My question is: is there some way to solve this problem so that when Antlr found an error it will automatically fix it? How to achieve this coding? Do we need other tools?
Typically the input for a parser comes from some source file that contains some code or text that (supposedly) conforms to some grammar. A typical use scenario for syntax errors is to alert the user so that the source file can be corrected.
As the commented noted, you can insert your own error recovery system, but before trying to insert a single token into the token stream and recover, please consider that it would be a very limited solution. Why? Consider a much richer grammar where for a given token, many -- perhaps dozens or hundreds -- of other tokens can legally follow it. How would a single-token replacement strategy work then?
The hello.g4 example is the epitome of a trivial grammar, the "hello world" of ANTLR. But most of the time, for non-trivial grammars, the best we can do with imperfect syntax is to simply alert the user so the syntax can be corrected.

Getting a return value from Antlr Listener?

i have a Antlr generated Listener, and i call my tree walker to go through the tree from a parse function in another class. Looks like this:
public double calculate(){
ANTLRInputStream input = new ANTLRInputStream("5+2");
Lexer lexer = new Lexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
Parser parser = new Parser(tokens);
ParseTree tree = parser.calculate();
ParseTreeWalker walker = new ParseTreeWalker();
walker.walk(new Listener(), tree);
return 0;
}
So the listener works perfect with the enter() and quit() Functions and prints the correct value in the end:
public void exitParser(ParserContext ctx) {
result = stack.peek();
System.out.println(result);
}
But i wanna receive the final value in my calculate() function to return it there. Since exitParser(...) is void i dont know how to deal with it.
With the visitor i was able to do it like that:
public double calculate(){
// ...
String value = new WRBVisitor().visit(tree);
return Double.parseDouble(value);
}
Hope someone understands my problem and knows a solution for it.
Best regards
As mentioned in the comments: a visitor might be a better option in your case. A visitor's methods will always return a value, which is what you seem to be after. That could be a Double if your expressions always evaluate to a numeric value, or some sort of home-grown Value that could represent a Double, Boolean, etc.
Have a look at my demo expression evaluator (using a visitor) on GitHub: https://github.com/bkiers/Mu

How do I execute Dynamically (like Eval) in Dart?

Since getting started in Dart I've been watching for a way to execute Dart (Text) Source (that the same program may well be generating dynamically) as Code. Like the infamous "eval()" function.
Recently I have caught a few hints that the communication port between Isolates support some sort of "Spawn" that seems like it could allow this "trick". In Ruby there is also the possibility to load a module dynamically as a language feature, perhaps there is some way to do this in Dart?
Any clues or a simple example will be greatly appreciated.
Thanks in advance!
Ladislav Thon provided this answer on the Dart forum:
I believe it's very safe to say that Dart will never have eval. But it will have other, more structured ways of dynamically generating code (code name mirror builders). There is nothing like that right now, though.
There are two ways of spawning an isolate: spawnFunction, which runs an existing function from the existing code in a new isolate, so nothing you are looking for, and spawnUri, which downloads code from given URI and runs it in new isolate. That is essentially dynamic code loading -- but the dynamically loaded code is isolated from the existing code. It runs in a new isolate, so the only means of communicating with it is via message passing (through ports).
You can run a string as Dart code by first constructing a data URI from it and then passing it into Isolate.spawnUri.
import 'dart:isolate';
void main() async {
final uri = Uri.dataFromString(
'''
void main() {
print("Hellooooooo from the other side!");
}
''',
mimeType: 'application/dart',
);
await Isolate.spawnUri(uri, [], null);
}
Note that you can only do this in JIT mode, which means that the only place you might benefit from it is Dart VM command line apps / package:build scripts. It will not work in Flutter release builds.
To get a result back from it, you can use ports:
import 'dart:isolate';
void main() async {
final name = 'Eval Knievel';
final uri = Uri.dataFromString(
'''
import "dart:isolate";
void main(_, SendPort port) {
port.send("Nice to meet you, $name!");
}
''',
mimeType: 'application/dart',
);
final port = ReceivePort();
await Isolate.spawnUri(uri, [], port.sendPort);
final String response = await port.first;
print(response);
}
I wrote about it on my blog.
Eval(), in Ruby at least, can execute anything from a single statement (like an assignment) to complete involved programs. There is a substantial time penalty for executing many small snippets over most any other form of execution that is possible.
Looking at the problem closer, there are at least three different functions that were at the base of the various schemes where eval might be used. Dart handles at least 2 of these in at least minimal ways.
Dart does not, nor does it look like there is any plan to support "general" script execution.
However, the NoSuchMethod method can be used to effectively implement the dynamic "injection" of variables into your local class environment. It replaces an eval() with a string that would look like this: eval( "String text = 'your first name here';" );
The second function that Dart readily supports now is the invocation of a method, that would look like this: eval( "Map map = SomeClass.some_method()" );
After messing about with this it finally dawned on me that a single simple class can be used to store the information needed to invoke a method, for a class, as a string which seems to have general utility. I can replace a big maintenance prone switch statement that might otherwise be necessary to invoke a series of methods. In Ruby this was almost trivial, however in Dart there are some fairly less than intuitive calls so I wanted to get this "trick" in one place, which fits will with doing ordering and filtering on the strings such as you may need.
Here's the code to "accumulate" as many classes (a whole library?) into a map using reflection such that the class.methodName() can be called with nothing more than a key (as a string).
Note: I used a few "helper methods" to do Map & List functions, you will probably want to replace them with straight Dart. However this code is used and tested only using the functions..
Here's the code:
//The used "Helpers" here..
MAP_add(var map, var key, var value){ if(key != null){map[key] = value;}return(map);}
Object MAP_fetch(var map, var key, [var dflt = null]) {var value = map[key];if (value==null) {value = dflt;}return( value );}
class ClassMethodMapper {
Map _helperMirrorsMap, _methodMap;
void accum_class_map(Object myClass){
InstanceMirror helperMirror = reflect(myClass);
List methodsAr = helperMirror.type.methods.values;
String classNm = myClass.toString().split("'")[1]; ///#FRAGILE
MAP_add(_helperMirrorsMap, classNm, helperMirror);
methodsAr.forEach(( method) {
String key = method.simpleName;
if (key.charCodeAt(0) != 95) { //Ignore private methods
MAP_add(_methodMap, "${classNm}.${key}()", method);
}
});
}
Map invoker( String methodNm ) {
var method = MAP_fetch(_methodMap, methodNm, null);
if (method != null) {
String classNm = methodNm.split('.')[0];
InstanceMirror helperMirror = MAP_fetch(_helperMirrorsMap, classNm);
helperMirror.invoke(method.simpleName, []);
}
}
ClassMethodMapper() {
_methodMap = {};
_helperMirrorsMap = {};
}
}//END_OF_CLASS( ClassMethodMapper );
============
main() {
ClassMethodMapper cMM = new ClassMethodMapper();
cMM.accum_class_map(MyFirstExampleClass);
cMM.accum_class_map(MySecondExampleClass);
//Now you're ready to execute any method (not private as per a special line of code above)
//by simply doing this:
cMM.invoker( MyFirstExampleClass.my_example_method() );
}
Actually there some libraries in pub.dev/packages but has some limitations because are young versions, so that I can recommend you this library expressions to dart and flutter.
A library to parse and evaluate simple expressions.
This library can handle simple expressions, but no blocks of code, control flow statements and so on. It supports a syntax that is common to most programming languages.
There I create an example of code to evaluate arithmetic operations and comparations of data.
import 'package:expressions/expressions.dart';
import 'dart:math';
#override
Widget build(BuildContext context) {
final parsing = FormulaMath();
// Expression example
String condition = "(cos(x)*cos(x)+sin(x)*sin(x)==1) && respuesta_texto == 'si'";
Expression expression = Expression.parse(condition);
var context = {
"x": pi / 5,
"cos": cos,
"sin": sin,
"respuesta_texto" : 'si'
};
// Evaluate expression
final evaluator = const ExpressionEvaluator();
var r = evaluator.eval(expression, context);
print(r);
return Scaffold(
body: Container(
margin: EdgeInsets.only(top: 50.0),
child: Column(
children: [
Text(condition),
Text(r.toString())
],
),
),
);
}
I/flutter (27188): true

Lucene stop phrases filter

I'm trying to write a filter for Lucene, similar to StopWordsFilter (thus implementing TokenFilter), but I need to remove phrases (sequence of tokens) instead of words.
The "stop phrases" are represented themselves as a sequence of tokens: punctuation is not considered.
I think I need to do some kind of buffering of the tokens in the token stream, and when a full phrase is matched, I discard all tokens in the buffer.
What would be the best approach to implements a "stop phrases" filter given a stream of words like Lucene's TokenStream?
In this thread I was given a solution: use Lucene's CachingTokenFilter as a starting point:
That solution was actually the right way to go.
EDIT: I fixed the dead link. Here is a transcript of the thread.
MY QUESTION:
I'm trying to implement a "stop phrases filter" with the new TokenStream
API.
I would like to be able to peek into N tokens ahead, see if the current
token + N subsequent tokens match a "stop phrase" (the set of stop phrases
are saved in a HashSet), then discard all these tokens when they match a
stop phrase, or keep them all if they don't match.
For this purpose I would like to use captureState() and then restoreState()
to get back to the starting point of the stream.
I tried many combinations of these API. My last attempt is in the code
below, which doesn't work.
static private HashSet<String> m_stop_phrases = new HashSet<String>();
static private int m_max_stop_phrase_length = 0;
...
public final boolean incrementToken() throws IOException {
if (!input.incrementToken())
return false;
Stack<State> stateStack = new Stack<State>();
StringBuilder match_string_builder = new StringBuilder();
int skippedPositions = 0;
boolean is_next_token = true;
while (is_next_token && match_string_builder.length() < m_max_stop_phrase_length) {
if (match_string_builder.length() > 0)
match_string_builder.append(" ");
match_string_builder.append(termAtt.term());
skippedPositions += posIncrAtt.getPositionIncrement();
stateStack.push(captureState());
is_next_token = input.incrementToken();
if (m_stop_phrases.contains(match_string_builder.toString())) {
// Stop phrase is found: skip the number of tokens
// without restoring the state
posIncrAtt.setPositionIncrement(posIncrAtt.getPositionIncrement() + skippedPositions);
return is_next_token;
}
}
// No stop phrase found: restore the stream
while (!stateStack.empty())
restoreState(stateStack.pop());
return true;
}
Which is the correct direction I should look into to implement my "stop
phrases" filter?
CORRECT ANSWER:
restoreState only restores the token contents, not the complete stream. So
you cannot roll back the token stream (and this was also not possible with
the old API). The while loop at the end of you code is not working as you
exspect because of this. You may use CachingTokenFilter, which can be reset
and consumed again, as a source for further work.
You'll really have to write your own Analyzer, I should think, since whether or not some sequence of words is a "phrase" is dependent on cues, such as punctuation, that are not available after tokenization.