I've built my lexer and parser in ANTLR and they work really well, in the sense that when user code fails to parse, they output useful error messages to STDERR, showing the exact line number etc.
The problem is, I need to extract this information in order to display the error messages in my Eclipse editor at the correct positions, but it doesn't seem to be available anywhere except on STDERR. I'm basically looking for some kind of myParser.getErrorMessages().
Has anybody come across a solution to this?
I found the below link, however this only works if the user code partially parses (i.e. so we still get an AST). When it fails completely, you don't get a tree back.
http://tech.puredanger.com/2007/02/01/recovering-line-and-column-numbers-in-your-antlr-ast/
I also found this exact question in the official ANTLR FAQ... but I really don't understand his solution. Can anybody translate it for me? I'm not using any of the classes he refers to, and he's talking about v4 (which isn't released yet).
http://www.antlr.org/wiki/display/ANTLR3/Pattern+for+returning+errors+from+ANTLR+in+data+structures%2C+not+STDERR
My code looks as follows:
FileInputStream fis = new FileInputStream("UserCode.txt");
ANTLRInputStream input = new ANTLRInputStream(fis);
MyLexer lexer = new MyLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
MyParser parser = new MyParser(tokens);
CommonTree tree = (CommonTree)parser.flow().getTree();
MyAST ast = new MyAST(tree);
See: http://www.antlr.org/wiki/display/ANTLR3/Error+reporting+and+recovery (not sure if the examples are fully compatible with ANTLR v3.2/v3.3, but if not, there shouldn't be too many changes to get it working)
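As far as I know, the usual ANTLR 3 route is to override displayRecognitionError(String[] tokenNames, RecognitionException e) in the generated parser (through a @members block) and store the message instead of letting it be printed. Below is a minimal, library-free sketch of that collect-instead-of-print pattern. SyntaxProblem and ToyParser are hypothetical names used only for illustration, and ToyParser is a stand-in for a generated parser, not ANTLR code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record of one syntax error: position plus message.
class SyntaxProblem {
    final int line;
    final int column;
    final String message;

    SyntaxProblem(int line, int column, String message) {
        this.line = line;
        this.column = column;
        this.message = message;
    }

    @Override
    public String toString() {
        return "line " + line + ":" + column + " " + message;
    }
}

// Stand-in for a generated parser. It only checks that each line ends in ';'.
class ToyParser {
    private final List<SyntaxProblem> problems = new ArrayList<>();

    // Plays the role of the error hook: store the error, do not print it.
    protected void reportError(int line, int column, String message) {
        problems.add(new SyntaxProblem(line, column, message));
    }

    void parse(String source) {
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            String text = lines[i];
            if (!text.isEmpty() && !text.endsWith(";")) {
                // Lines are reported 1-based, columns 0-based.
                reportError(i + 1, text.length(), "missing ';'");
            }
        }
    }

    // The kind of accessor the question asks for.
    List<SyntaxProblem> getErrors() {
        return problems;
    }
}
```

After parsing, the editor code can loop over getErrors() and place a marker at each line and column, whether or not a tree came back.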
Related
Short: I am looking for a way to get the text of the script that was evaluated and caused a syntax error from within the context of window.onerror.
Long:
The full scenario involves a PhoneGap application and the PushNotifications plugin.
When a push message is sent to the device, a JavaScript error is caught by window.onerror
with the text "SyntaxError: Expected token '}'".
The reported line number is 1 (as it usually is when dealing with eval'd code).
The way the plugin executes its code is by using:
NSString * jsCallBack = [NSString stringWithFormat:@"%@(%@);", self.callback, jsonStr];
[self.webView stringByEvaluatingJavaScriptFromString:jsCallBack];
I believe, but am not 100% sure, that this is the code PhoneGap Build is pushing.
More code can be seen here: https://github.com/phonegap-build/PushPlugin/blob/master/src/ios/PushPlugin.m#L177
Here self.callback is a string that I pass to the plugin, and jsonStr is (supposed to be) an object describing the push message.
When I passed the string alert('a');// as the parameter that ends up being self.callback, I did get the alert and no syntax error. Now I am trying to understand what jsonStr evaluates to, so that I can either find a way around it or figure out whether it is somehow my fault (maybe because of the content I am sending in the push notification).
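To see what the web view is actually asked to evaluate, it can help to reproduce the string formatting off the device. The sketch below mirrors the stringWithFormat: call in Java. buildCall and onNotification are hypothetical names, and the sample payload is made up for illustration, not taken from a real push message.

```java
class EvalStringDemo {
    // Mirrors [NSString stringWithFormat:@"%@(%@);", callback, jsonStr].
    static String buildCall(String callback, String jsonStr) {
        return String.format("%s(%s);", callback, jsonStr);
    }

    public static void main(String[] args) {
        String payload = "{\"alert\":\"hi\"}"; // made-up payload
        // Normal case: the callback name followed by the payload as its argument.
        System.out.println(buildCall("onNotification", payload));
        // The trick from the question: the trailing // comments out the payload.
        System.out.println(buildCall("alert('a');//", payload));
    }
}
```

Since the // variant hides everything that comes from jsonStr and raises no syntax error, this suggests the error comes from whatever jsonStr expands to rather than from the callback name.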
I also tried looking at the last item of the $('script') collection of the document, hoping that stringByEvaluatingJavaScriptFromString might generate a new script block, but that does not seem to be the case.
Furthermore, in window.onerror I also tried to get the caller
using var c=window.onerror.caller||window.onerror.arguments.caller; but this returns undefined.
As I stated before, I am looking for ideas on how to determine exactly what is causing the syntax error, possibly by getting hold of the entire block of script being evaluated when the syntax error happened.
I know how to get relevant highlighted fragments together with some surrounding text using Lucene highlighter, namely, using
Highlighter highlighter = new Highlighter(scorer);
String[] fragments = highlighter.getBestFragments(stream, fieldContents, fragmentNumber);
But can I instead get pointers to these fragments in the original contents? In other words, I need to know where these fragments start and, if possible, end.
If you use the GetBestTextFragments method instead, you will get back an array of TextFragments. These have properties textStartPos and textEndPos.
(They are marked internal in Lucene.NET, which will require you to make some code changes to get access to them. I'm not sure about Java Lucene.)
I'm going through the ParseKit example and trying to modify it to suit my needs and running into this problem. As soon as I pass in the grammar file to parserFromGrammar:assembler, I get an error:
[__NSArrayM objectAtIndex:]: index 0 beyond bounds for empty array
I thought maybe it was because my grammar files had token names with underscores in them. Does ParseKit support underscores? What would the name of the callback method be? That is, would the token name "foo_bar" call a method didMatchFoo_bar?
I then took out all the underscored names and it still gives me that error. I'm using the example grammar file from the ParseKit website:
#start = sentence+;
sentence = adjectives 'beer' '.';
adjectives = cold adjective*;
adjective = cold | freezing;
cold = 'cold';
freezing = 'freezing';
Thanks
Developer of ParseKit here. 2 things:
To answer your first question, I believe the answer is YES.
I just tried out the grammar and it seems to work for me. However, I am using the latest version of ParseKit from Google Code (not GitHub; the GitHub copy is out of date, sorry).
So check out ParseKit from Google Code here:
https://parsekit.googlecode.com/svn/trunk
Then select the "DebugApp" target and the "DebugApp" executable, and run.
In the Xcode project, do a global search for "cold freezing beer". You'll see I've added your example as the default example run in DebugApp. It seems to work OK.
I'm new to yacc/lex and I'm working on a parser that was written by someone else. I notice that when an undefined token is found, the parser returns an error and stops. Is there a simple way to just make it ignore completely lines that it cannot parse and just move on to the next one?
Just add a rule that looks like
. {
// do nothing
}
at the bottom of all of your rules, and it will just ignore everything it comes across that doesn't fit any of the previous rules.
Edit: if you have multiple states, then a catch-all that works in any state would then look like:
<*>. {
}
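The effect of a trailing catch-all rule can be illustrated outside of lex. The sketch below is a hypothetical regex-based tokenizer in Java (not lex code): the alternatives are tried in order at the current position, and the final "." alternative picks up any single character that nothing else claimed, which is then dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CatchAllLexer {
    // Alternatives are tried left to right; the last one is the catch-all.
    // DOTALL makes "." match every character, including line terminators.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<num>[0-9]+)|(?<id>[A-Za-z_][A-Za-z0-9_]*)|(?<ws>\\s+)|(?<other>.)",
            Pattern.DOTALL);

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                // Without the catch-all alternative, unknown input would end up here.
                throw new IllegalStateException("no rule matches at offset " + pos);
            }
            if (m.group("num") != null) {
                tokens.add("NUM:" + m.group("num"));
            } else if (m.group("id") != null) {
                tokens.add("ID:" + m.group("id"));
            }
            // Whitespace and anything caught by the catch-all: do nothing.
            pos = m.end();
        }
        return tokens;
    }
}
```

One difference to keep in mind: lex prefers the longest match and only falls back to rule order on ties, while this sketch simply tries the alternatives in order. In both cases the catch-all belongs last.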
So, I am parsing Hayes modem AT commands. They are not read from a file, but passed as a char * (I am using C).
1) What happens if I get something that I totally don't recognize? How do I handle that?
2) what if I have something like
my_token: "cmd param=" ("value_1" | "value_2");
and receive an invalid value for "param"?
I see some advice to let the back-end program (in C) handle it, but that goes against the grain for me. Catch the problem as early as you can, is my motto.
Is there any way to catch "else" conditions in lexer/parser rules?
Thanks in advance ...
That's the thing: the whole point of your parser and lexer is to blow up if you get bad input, then you catch the blow up and present a pretty error message to the user.
I think you're looking for Custom Syntax Error Recovery to embed in your grammar.
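The "blow up, then catch it" idea can be sketched without a parser generator. The example below is a hand-written Java check for the "cmd param=" rule from the question. AtSyntaxException and parseCommand are hypothetical names, and a generated ANTLR parser would report such errors through its own mechanisms rather than through this class.

```java
// Hypothetical exception type carrying the offset of the offending input.
class AtSyntaxException extends RuntimeException {
    final int offset;

    AtSyntaxException(int offset, String message) {
        super(message);
        this.offset = offset;
    }
}

class AtCommandParser {
    private static final String PREFIX = "cmd param=";

    // Returns the parameter value, or throws with the offset of the bad input.
    static String parseCommand(String input) {
        if (!input.startsWith(PREFIX)) {
            throw new AtSyntaxException(0, "expected '" + PREFIX + "'");
        }
        String value = input.substring(PREFIX.length());
        if (!value.equals("value_1") && !value.equals("value_2")) {
            // The "else" branch: anything that is not an allowed value.
            throw new AtSyntaxException(PREFIX.length(),
                    "expected value_1 or value_2 but found '" + value + "'");
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            parseCommand("cmd param=value_3");
        } catch (AtSyntaxException e) {
            // Catch the failure and turn it into a readable message.
            System.out.println("error at offset " + e.offset + ": " + e.getMessage());
        }
    }
}
```

The invalid value is rejected at parse time, so the back-end code only ever sees commands that matched the rule.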
EDIT
I've no experience with ANTLR and C (or C alone for that matter), so follow this advice with caution! :)
Looking at the page http://www.antlr.org/api/C/using.html, perhaps the part at the bottom, "Implementing Customized Methods", is what you're after.
HTH