Systematic way to generate ANTLR tree grammar? - antlr

I have a fairly large ANTLR parser grammar file and want to write a tree grammar for it. As far as I know, a tree grammar can't be generated automatically, i.e., I have to create it manually by copying the parser grammar, removing unnecessary code, etc. I want to know if there is a systematic way to generate a tree grammar file from a parser grammar file.
P.S. I read an article that insists that 'Manual Tree Walking Is Better Than Tree Grammars'. Is this reliable information? If so, would it be better for me to make a manual tree walker than writing an ANTLR tree grammar file? And then, how do I make a manual tree walker with my ANTLR parser grammar file(it makes an AST using rewrite rules)?
Thanks in advance.

sky wrote:
I want to know if there is a systematic way to generate a tree grammar file from a parser grammar file
You've already described the systematic way to do this: copy the parser rules into the tree grammar and keep only their rewrite rules. This will probably handle the larger part of your rules, but for parser rules that use inline AST operators instead of rewrite rules, the tree grammar rule will look slightly different from a plain copy. Because of that, there is no automatic way to generate a tree grammar.
sky wrote:
P.S. I read an article that insists that 'Manual Tree Walking Is Better Than Tree Grammars'. Is this reliable information?
Yes, it is. Note that Terence Parr (creator of ANTLR) posted the article on the ANTLR wiki himself, which suggests that its author (Andy Tripp) raises valid points.
sky wrote:
If so, would it be better for me to make a manual tree walker than writing an ANTLR tree grammar file?
As Andy mentioned in his conclusion: "The decision about whether to use a "Tree Grammar" approach to translation vs. just "doing it by hand" is a matter of taste." So, if you think writing a tree grammar is too much of a hassle, go the manual way. It's up to you: there is no best way here.
sky wrote:
And then, how do I make a manual tree walker with my ANTLR parser grammar file(it makes an AST using rewrite rules)?
Your parser will create an AST, which by default is of type CommonTree. You can use that tree to get a node's children, its parent, the type of its token, etc.: everything you need to walk the tree manually.
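A minimal sketch of such a manual walk might look like the code below. To keep it runnable without ANTLR on the classpath, it uses a hypothetical Node class that only mirrors a few CommonTree accessors (getType(), getText(), getChildCount(), getChild(int)); the token types PLUS, MULT and INT are made up for the illustration and would normally come from your generated parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ANTLR 3's CommonTree, so the sketch runs without ANTLR.
class Node {
    private final int type;
    private final String text;
    private final List<Node> children = new ArrayList<>();

    Node(int type, String text, Node... kids) {
        this.type = type;
        this.text = text;
        for (Node k : kids) children.add(k);
    }
    int getType() { return type; }
    String getText() { return text; }
    int getChildCount() { return children.size(); }
    Node getChild(int i) { return children.get(i); }
}

class ManualWalker {
    // Hypothetical token types; a real parser defines these in its generated class.
    static final int PLUS = 1, MULT = 2, INT = 3;

    // Recursively evaluate an arithmetic AST by switching on the node type.
    static int eval(Node n) {
        switch (n.getType()) {
            case INT:  return Integer.parseInt(n.getText());
            case PLUS: return eval(n.getChild(0)) + eval(n.getChild(1));
            case MULT: return eval(n.getChild(0)) * eval(n.getChild(1));
            default:   throw new IllegalArgumentException("unexpected node: " + n.getText());
        }
    }

    public static void main(String[] args) {
        // AST for 1 + 2 * 3, i.e. ^('+' 1 ^('*' 2 3))
        Node ast = new Node(PLUS, "+",
                new Node(INT, "1"),
                new Node(MULT, "*", new Node(INT, "2"), new Node(INT, "3")));
        System.out.println(eval(ast));
    }
}
```

With a real CommonTree the shape of eval would stay the same; only the node class and the token type constants would change.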
EDIT
Note that in the next version of ANTLR (version 4) it will (most likely) be possible to automatically generate a tree walker from a combined grammar or a parser grammar.
See:
https://web.archive.org/web/20130620232750/http://www.antlr.org/wiki/display/~admin/ANTLR+v4+plans
https://web.archive.org/web/20130927174157/http://www.antlr.org/wiki/display/~admin/2011/09/05/Auto+tree+construction+and+visitors
https://web.archive.org/web/20130927175520/http://www.antlr.org/wiki/display/~admin/2011/09/08/Sample+v4+generated+visitor

Related

antlr - generate grammar from java source code

I am wondering if I can generate an ANTLR grammar from Java source code. I want to do some kind of research project, but I am just exploring different open-source tools to see which one is best.
For ANTLR, do I always have to write a grammar and pass it to ANTLR?
Is there a way to generate grammar from an existing Java source code?
Not easily. ANTLR generates a recursive descent parser from your grammar, encoding the tests into procedural code, along with lots of other bookkeeping.
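To illustrate what "encoding the tests into procedural code" means, here is a hand-written sketch in the style of a recursive descent recognizer for a hypothetical rule expr : INT ('+' INT)* ;. It is not ANTLR's actual output (which carries far more bookkeeping); the class and method names are invented for the example.

```java
// Hand-written sketch in the style of a recursive descent recognizer for the
// hypothetical rule:  expr : INT ('+' INT)* ;
// This is NOT ANTLR's generated code, only an illustration of the idea.
class ExprRecognizer {
    private final String input;
    private int pos = 0;

    ExprRecognizer(String input) { this.input = input; }

    // expr : INT ('+' INT)* ;
    void expr() {
        matchInt();
        // The (...)* loop becomes a while loop guarded by a lookahead test.
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;          // consume '+'
            matchInt();
        }
        if (pos != input.length()) {
            throw new IllegalStateException("unexpected input at " + pos);
        }
    }

    // INT : '0'..'9'+ ;
    private void matchInt() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        if (pos == start) throw new IllegalStateException("expected INT at " + pos);
    }

    // Returns true if the whole input matches the rule.
    static boolean accepts(String s) {
        try {
            new ExprRecognizer(s).expr();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("1+22+3"));
        System.out.println(accepts("1++2"));
    }
}
```

Notice how the grammar structure is smeared across loops, conditions and method calls; that is why recovering the grammar from such code is hard.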
Knowing how the code is generated, you might be able to take it apart but you'll have to reach into the middle of generated statements and that isn't easy without a full parser for the generated language. (Hint: regex won't work).
I don't see much point in this exercise. Why don't you just use the original grammar?

Generate source code from AST with Antlr4 and StringTemplates

If I have an AST and modify it, can I use StringTemplates to generate the source code for the modified AST?
I have successfully implemented my grammar for Antlr4. It generates the AST of a source code and I use the Visitor Class to perform the desired actions. I then modify something in the AST and I would like to generate the source code for that modified AST. (I believe it is called pretty-printing?).
Does Antlr's built in StringTemplates have all the functionality to do this? Where should one start (practical advice is very welcome)?
You can walk the tree and use string templates (or even plain string prints) to spit out text equivalents that to some extent reproduce the source text.
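As a rough sketch of the "plain string prints" variant, the code below walks a small expression tree and emits text. The AstNode class is hypothetical (it is not an ANTLR or StringTemplate type), and the sketch assumes every interior node is a binary operator.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical AST node; not an ANTLR or StringTemplate class.
class AstNode {
    final String text;            // operator symbol, or the leaf token as written in the source
    final List<AstNode> children;

    AstNode(String text, AstNode... children) {
        this.text = text;
        this.children = Arrays.asList(children);
    }
}

class Unparser {
    // Emit a fully parenthesized infix rendering of the tree.
    // Assumption: every interior node has exactly two children.
    static String unparse(AstNode n) {
        if (n.children.isEmpty()) {
            return n.text;        // leaf: reuse the original token text (keeps e.g. 0x1F as written)
        }
        return "(" + unparse(n.children.get(0)) + " " + n.text + " "
                   + unparse(n.children.get(1)) + ")";
    }

    public static void main(String[] args) {
        AstNode tree = new AstNode("+", new AstNode("0x1F"),
                new AstNode("*", new AstNode("x"), new AstNode("2")));
        System.out.println(unparse(tree));   // prints (0x1F + (x * 2))
    }
}
```

Even this toy shows the gap: the output is correct but over-parenthesized, and nothing of the original spacing survives.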
But you will find reproducing the source text in a realistic way harder to do than this suggests. If you want back code that the original programmer will not reject, you need to:
Preserve comments. I don't think ANTLR ASTs do this.
Generate layout that preserves the original indentation.
Preserve the radix, leading-zero count, and other "format" properties of literal values.
Regenerate strings with reasonable escapes.
Doing all of this well is tricky. See my SO answer "How to compile an AST back to source code" for more details. (Weirdly, the ANTLR guy suggests not using an AST at all; I'm guessing this is because string templates only work on ANTLR parse trees whose structure ANTLR understands, vs. ASTs which are whatever you home-rolled.)
If you get all of this right, what you are likely to discover is that modifying the parse tree/AST is harder than it looks. For almost any interesting task on complex languages, you need information that is not trivial to extract from the tree (e.g., what does this identifier mean? where is this variable used?). I call this the problem of Life After Parsing. My main point is that it takes a lot of machinery to modify ASTs and regenerate code; be aware of the size of your project.

How to modify an AST from YOSYS? And how to synthesis a modified AST to Verilog code?

We know that we can get an AST text file for Verilog code. Now I want to modify the AST to add some new features. Is ANTLR right for this job, or which software should I use? How should I go about it? Then I want to synthesize the modified AST back into Verilog code. Can YOSYS do this job? What should I do? Can you explain in detail?
Thanks for your help!
ANTLR parses, but is not particularly good at supporting modifications to the AST or regenerating the source code accurately.
Our DMS Software Reengineering Toolkit is designed to do these tasks. See our Verilog Front End for round-trip parsing and un-parsing, and DMS's support for modifying ASTs using source-to-source transformations.
With ANTLR 4, you can transform the tree by subclassing the generated visitor class and overriding its visit methods. Each visit method should return an AST node of your target type.
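A self-contained sketch of that pattern is shown below. Everything in it is hypothetical: NumCtx, AddCtx and ExprVisitor stand in for the context classes and visitor interface that ANTLR 4 would generate from a grammar, and Ast, Lit and Add are an invented target AST. Only the shape (one visit method per rule, each returning the target type) is the point.

```java
// Hypothetical parse-tree classes and visitor interface, standing in for what
// ANTLR 4 would generate from a grammar; the names are illustrative only.
interface ParseNode { <T> T accept(ExprVisitor<T> v); }

class NumCtx implements ParseNode {
    final String digits;
    NumCtx(String digits) { this.digits = digits; }
    public <T> T accept(ExprVisitor<T> v) { return v.visitNum(this); }
}

class AddCtx implements ParseNode {
    final ParseNode left, right;
    AddCtx(ParseNode left, ParseNode right) { this.left = left; this.right = right; }
    public <T> T accept(ExprVisitor<T> v) { return v.visitAdd(this); }
}

interface ExprVisitor<T> {
    T visitNum(NumCtx ctx);
    T visitAdd(AddCtx ctx);
}

// Target AST type chosen by the user (also hypothetical).
abstract class Ast { }

class Lit extends Ast {
    final int value;
    Lit(int value) { this.value = value; }
    @Override public String toString() { return Integer.toString(value); }
}

class Add extends Ast {
    final Ast left, right;
    Add(Ast left, Ast right) { this.left = left; this.right = right; }
    @Override public String toString() { return "Add(" + left + ", " + right + ")"; }
}

// Every visit method returns a node of the target AST type.
class AstBuilder implements ExprVisitor<Ast> {
    public Ast visitNum(NumCtx ctx) { return new Lit(Integer.parseInt(ctx.digits)); }
    public Ast visitAdd(AddCtx ctx) {
        // Visit the children first, then build the parent node from their results.
        return new Add(ctx.left.accept(this), ctx.right.accept(this));
    }
}

class VisitorDemo {
    public static void main(String[] args) {
        ParseNode tree = new AddCtx(new NumCtx("1"),
                new AddCtx(new NumCtx("2"), new NumCtx("3")));
        System.out.println(tree.accept(new AstBuilder()));   // prints Add(1, Add(2, 3))
    }
}
```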

Pretty print ANTLR grammar

Some tools output an Antlr grammar in a human-unreadable form, at least with ugly placing of parens and indentation. I'd like to transform the grammar into a more readable (standard?) form. The only reference I found is ANTLR pretty printer which is quite old, and looking at its source, it seems to be removing parts of a grammar rather than pretty print it.
How can I format/pretty print a grammar file?
I know of no tool that does this. The one you mentioned, prettyPrinter, is written in - and seems to handle only - ANTLR v2.x grammars, making it unsuitable for v3 grammars.
If you're going to write your own, I'd recommend using the grammar of ANTLR v3 itself to parse a .g grammar file and emit it in a readable form. Terence Parr has posted the grammar here: http://www.antlr.org/grammar/ANTLR
I just installed an Antlr plugin for Eclipse. It can do a lot more than syntax highlighting and code formatting...

Source for parsing C grammar using JavaCC

As a project assignment, I need to parse plain C from Java and generate AST output. As a starting point, I am using the file c.jj that I found among the grammar files at
http://java.net/projects/javacc/sources/svn/
but I found that it only has syntactic and lexical actions and no real semantics for parsing C source. Is there some other source that incorporates typedefs, variables, function constructs, and include files?
You could go looking for a complete grammar. Will you learn much this way?
You could ask your lecturer which would impress them more: implementing some small subset of the C grammar by writing your own rules, or searching Google for a complete set of rules written by someone else.
I trust that writing your own rules, and even your own hand-crafted parser, will be a more useful exercise, even if it's only parsing expressions.