ANTLR Comment Propagation - antlr

Is there a method to propagate a comment in ANTLR to the code generated?
For example, if I have the Subversion revision keyword ($Rev$) in a comment within the *.g4 file, is there a way to carry it into the generated code, so that I know which revision of the grammar the parser was generated from?
Cheers,
Adam

At this time, we are not copying the comments from the grammar into the generated code, although we should. Added https://github.com/antlr/antlr4/issues/375
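Until something like that is supported, one possible workaround is to stamp the generated files yourself as a post-processing step. The following is only a minimal sketch, not an ANTLR feature: it assumes Subversion has expanded the keyword to the form `$Rev: 1234 $`, and the function names, the header wording, and the `//` comment style (for Java output) are all hypothetical choices.

```python
import re
from typing import Optional

# Matches an expanded Subversion keyword such as "$Rev: 1234 $" or
# "$Revision: 1234 $". An unexpanded "$Rev$" does not match.
REV_PATTERN = re.compile(r"\$Rev(?:ision)?:\s*(\d+)\s*\$")


def extract_revision(grammar_text: str) -> Optional[str]:
    """Return the revision number found in the grammar text, or None."""
    match = REV_PATTERN.search(grammar_text)
    return match.group(1) if match else None


def stamp_generated_source(generated_text: str, grammar_name: str,
                           revision: Optional[str]) -> str:
    """Prepend a one-line comment recording the grammar revision.

    The header format is an assumption made for this sketch.
    """
    rev = revision if revision is not None else "unknown"
    header = f"// Generated from {grammar_name}, grammar revision: {rev}\n"
    return header + generated_text


if __name__ == "__main__":
    grammar = "// $Rev: 1234 $\ngrammar Expr;\n"
    parser_source = "public class ExprParser {}\n"
    print(stamp_generated_source(parser_source, "Expr.g4",
                                 extract_revision(grammar)))
```

In a build you would read the *.g4 file and each generated source file from disk, then write the stamped text back after ANTLR has run.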

Related

Xtext-based language within IntelliJ IDEA

I want to make a plugin for a language for the IntelliJ IDEA IDE. The language has been developed using Eclipse Xtext and is open source. A plugin already exists for Eclipse.
My goal is to port this language to IntelliJ IDEA. I want to be able to use IntelliJ to create source files, to have the specific syntax highlighting, and to be able to compile and run programs written in this language.
Is there a simple way to generate the IntelliJ IDEA plugin from the Xtext project?
If not, is there an efficient way to get the specific syntax highlighting in IntelliJ? (An automatic way if possible; I would prefer not to rewrite everything every time the Xtext project is updated.)
Short answer
Yes, with a bit of work.
Long Answer
Sadly, Xtext uses ANTLR in the background, while IntelliJ uses its own Grammar-Kit, which is based on Parsing Expression Grammars. As such, the parsing and editor code generated by Xtext, as you might have guessed, will not work.
In order to get your language working in IntelliJ you will need to:
1. Create the grammar *.bnf file
2. Generate the lexer *.flex file, possibly tweak it, and then run the JFlex generator
3. Implement helper classes to provide, among others, file recognition via file extension, syntax highlighting, color settings page, folding, etc.
The *.flex file is generated from the bnf. Luckily, most of the classes in step 3 follow a very similar structure so they can be easily generated (more on that later). So basically, if you manage to generate the *.bnf file, you are 80% there.
Although they come from different technologies, the syntax of bnf files is very similar to that of Xtext files. I recently migrated some ANTLR grammars to IntelliJ's bnf and had to make only very small changes. Thus, it should be possible to autogenerate the bnf files from your Xtext ones.
That brings me back to point 3. Using Xtend, Epsilon's EGL, or similar, it would be easy to generate all the boilerplate classes. I also did this as part of the migration I mentioned before. I am in the process of making the code public, so I will post it here when done and add some details.
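To give an idea of what template-based generation of one of those helper classes could look like, here is a minimal sketch. It is not the code mentioned above: it uses Python's `string.Template` as a stand-in for Xtend or EGL, the generated class shape follows the `Language` subclass from the IntelliJ Platform SDK custom language tutorial, and the package and language names are hypothetical.

```python
from string import Template

# Hypothetical template for the Language subclass of a plugin. The class
# shape follows the IntelliJ Platform SDK custom language tutorial; it is
# an illustration, not the output of any existing generator.
LANGUAGE_CLASS = Template("""\
package ${package};

import com.intellij.lang.Language;

public class ${name}Language extends Language {
    public static final ${name}Language INSTANCE = new ${name}Language();

    private ${name}Language() {
        super("${name}");
    }
}
""")


def render_language_class(package: str, name: str) -> str:
    """Fill the template with a package name and a language name."""
    return LANGUAGE_CLASS.substitute(package=package, name=name)


if __name__ == "__main__":
    # "org.example.mylang" and "MyLang" are placeholder values.
    print(render_language_class("org.example.mylang", "MyLang"))
```

The other helper classes (file type, syntax highlighter, and so on) could be produced the same way, with one template per class and the language name as the only input.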

Is there an Antlr Grammar available for LLVM IR?

I want to parse an LLVM IR file and perform certain operations accordingly. Is there an ANTLR grammar available for LLVM IR? That would make my job much simpler.
The largest repository of grammars I know of is here. I don't see your grammar in the list, sorry. Terence Parr mentions one in this article, and that's all I can find; it's quite old and seems to be based on ANTLR3.
I got this file from one of the github repositories:
https://github.com/rwl/JLLVM/blob/master/src/cn/edu/sjtu/jllvm/VMCore/Parser/LLVM.g
It is part of the JLLVM source code. (JLLVM is a Java version of LLVM Core. For more information about JLLVM, follow this link: http://tcloud.sjtu.edu.cn/wiki/index.php/User:Liuhaots:JLLVM)

antlr - generate grammar from java source code

I am wondering if I can generate an ANTLR grammar from Java source code. I want to do some kind of research project, but for now I am just exploring different open-source tools to see which one is best.
For ANTLR, do I always have to write a grammar and pass it to the ANTLR?
Is there a way to generate grammar from an existing Java source code?
Not easily. ANTLR generates a recursive descent parser from your grammar, encoding the tests into procedural code along with a lot of other bookkeeping.
Knowing how the code is generated, you might be able to take it apart, but you'll have to reach into the middle of generated statements, and that isn't easy without a full parser for the generated language. (Hint: regex won't work.)
I don't see much point in this exercise. Why don't you just use the original grammar?

ANTLR and content assist in Eclipse

I have a project in Eclipse where I have an editor for a custom language. I am using ANTLR to generate the compiler for it. What I need is to add content assist to the editor.
The input is source code in the custom language, plus the position of the character where the user requested content assist. The source code is incomplete most of the time, as the user can ask for content assist at any time. What I need is to calculate the list of tokens that are valid at the given position.
It is possible to write custom code to do the calculation, but that code would have to be kept in sync with the grammar manually. I figured the parser is doing something similar: it has to be able to determine which tokens are acceptable in a given context. Is it possible to "reuse" that? What is the best practice for creating content assist anyway?
Thanks,
Balint
Have a look at Xtext. Xtext uses ANTLR3 under the hood and provides content assist for ANTLR-based languages. Have a look especially at the package org.eclipse.xtext.ui.editor.contentassist.
You may consider redefining your grammar with Xtext, which would provide content assist out of the box. It is not possible to reuse the ANTLR grammar of a custom language.

Stopping doxygen searching for (and assuming) non-existent variables in source code

I'm using doxygen outside of its design, but well within its capability. I have a bunch of what are essentially text files, with some doxygen tags appended. I am successfully generating doxygen output. However, doxygen occasionally discovers what it assumes to be a variable and proceeds to document it using the surrounding text, which produces a lot of confusing documentation. I can't see any direct relationship between these anomalies, only that they reproduce the same output on each run and that at least some of them are next to a ';' or a '='.
I only want doxygen to document what I've manually tagged. I am hoping to remove any occurrence of these anomalies, however I cannot alter existing text. I can only add doxygen tags, or alter the configuration file. Any ideas?
Many thanks.
In my particular case I do not need any automatically generated documentation, only what I have tagged with doxygen tags, so setting
EXCLUDE_SYMBOLS = *
removes every instance of doxygen "finding" and documenting variables. It may also prevent doxygen from picking up class declarations, namespaces, or functions, but that is acceptable for me.