Purpose of antlr in xtext - antlr

I'm new to Xtext and wondering what the purpose of ANTLR is in Xtext. As I understand it so far, ANTLR generates a parser based on the grammar, and the parser then deals with the text models. Right?
And what about the other generated artifacts, like the editor or the Ecore model? Are there other components behind Xtext that generate them?

Xtext needs a parser generator to produce a parser for the language you define. They could have built one of their own. They chose to use ANTLR instead.
I don't know what other third party machinery they might have chosen to use.

I've been hacking on an Xtext-based plugin, and from what I saw I think it works like this:
Xtext has its own BNF-like syntax, which is very similar to ANTLR's. In fact, it's a subset of it.
Xtext takes your grammar and generates an ANTLR grammar (a .g file) from it. The generated ANTLR grammar adds specific actions to your BNF rules. The action code interacts with the Xtext runtime and (maybe) with Eclipse itself. The .g file is then processed by an older version of ANTLR to produce a .java file, which is then compiled.

Related

Xtext based language within Intellij Idea

I want to make a plugin for a language for the IntelliJ IDEA IDE. The language has been developed using Eclipse Xtext and is open source. A plugin already exists for Eclipse.
My goal is to port this language to IntelliJ IDEA. I want to be able to use IntelliJ to create source files, to have language-specific syntax highlighting, and to be able to compile and run programs written in this language.
Is there a simple way to generate the IntelliJ IDEA plugin from the Xtext project?
If not, is there an efficient way to get the language-specific syntax highlighting in IntelliJ? (An automatic way if possible; I would prefer not to rewrite everything every time the Xtext project is updated.)
Short answer
Yes, with a bit of work.
Long Answer
Sadly, Xtext uses ANTLR in the background, while IntelliJ uses its own Grammar-Kit, which is based on Parsing Expression Grammars. As such, the parsing and editor code generated by Xtext will, as you might have guessed, not work.
In order to get your language working in IntelliJ you will need to:
1. Create a grammar *.bnf file
2. Generate a lexer *.flex file, possibly tweak it, and then run the JFlex generator
3. Implement helper classes to provide, among others, file recognition via file extension, syntax highlighting, a color settings page, folding, etc.
The *.flex file is generated from the bnf. Luckily, most of the classes in step 3 follow a very similar structure, so they can easily be generated (more on that later). So basically, if you manage to generate the *.bnf file, you are 80% there.
Although they come from different technologies, the syntax of bnf files is very similar to that of Xtext files. I recently migrated some ANTLR grammars to IntelliJ's bnf and had to make only very small changes. Thus, it should be possible to autogenerate the bnf files from your Xtext ones.
That brings me back to point 3. Using Xtend, Epsilon's EGL, or similar, it would be easy to generate all the boilerplate classes. I also did this as part of the migration I mentioned before. I am in the process of making the code public, so I will post it here when done and add some details.

antlr - generate grammar from java source code

I am wondering if I can generate an ANTLR grammar from Java source code. I want to do some kind of research project, but for now I am just exploring different open-source tools to see which one is best.
For ANTLR, do I always have to write a grammar and pass it to ANTLR?
Is there a way to generate a grammar from existing Java source code?
Not easily. ANTLR generates a recursive descent parser from your grammar, encoding the tests into procedural code, along with lots of other bookkeeping.
Knowing how the code is generated, you might be able to take it apart, but you'll have to reach into the middle of generated statements, and that isn't easy without a full parser for the generated language. (Hint: regex won't work.)
I don't see much point in this exercise. Why don't you just use the original grammar?
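To illustrate what "encoding the tests into procedural code" means, here is a hand-written sketch of recursive-descent code in Java. It is not actual ANTLR output; the grammar in the comment and the class name TinyListParser are made up for this example, and a real generated parser carries far more bookkeeping.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration, NOT real ANTLR output.
// Hypothetical grammar:  list : NUMBER (',' NUMBER)* ;   NUMBER : '0'..'9'+ ;
final class TinyListParser {
    private final String input;
    private int pos = 0;

    TinyListParser(String input) {
        this.input = input;
    }

    // Rule "list": one method per grammar rule, as in recursive descent.
    List<Integer> list() {
        List<Integer> values = new ArrayList<>();
        values.add(number());
        // The grammar's (',' NUMBER)* has become a lookahead test plus a loop.
        while (pos < input.length() && input.charAt(pos) == ',') {
            pos++; // consume ','
            values.add(number());
        }
        if (pos != input.length()) {
            throw new IllegalArgumentException("unexpected input at offset " + pos);
        }
        return values;
    }

    // Token NUMBER: '0'..'9'+ has become a character-range test in a loop.
    private int number() {
        int start = pos;
        while (pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected a digit at offset " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }
}
```

Calling new TinyListParser("1,22,333").list() returns the three integers. The grammar's repetition operators survive only as loops and comparisons, which is why reading the grammar back out of such code is awkward.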

xText and ANTLR

My current project focuses on code generation from a DSL (i.e., high-level specifications). More specifically, developers write high-level specifications, which are parsed to generate Java and Android code.
For the parser I have used an ANTLR grammar, and for code generation I have used StringTemplate files.
However, developers currently write the high-level specifications in Notepad, so I cannot provide syntax highlighting, coloring, or error handling. To provide this support, I am thinking of using Xtext.
I am thinking of integrating Xtext as follows:
Developers will write high-level specifications in the editor provided by Xtext (basically, I will write an Xtext grammar and generate the editor support). Here, the Xtext editor will handle syntax coloring, syntax highlighting, and error handling.
I will take all these specifications as .txt inputs, parse them with ANTLR, and generate Java and Android code.
I need your suggestions on the following questions:
(1) How can I extract the files written in the Xtext editor and provide them as input to the ANTLR parser? OR
(2) Should I stick with Xtext and try to integrate the ANTLR parser with it? (Kindly advise how I could integrate Xtext and ANTLR, with a simple example.) OR
(3) Should I use only ANTLR and StringTemplate files and try to create an Eclipse plugin out of them?
Other alternative suggestions are also welcome.
You don't need to integrate Xtext and ANTLR; Xtext already uses ANTLR for the actual parsing.
Xtext is based on ANTLR, so there is no need to integrate ANTLR and Xtext.
I advise you to create an Xtext project in Eclipse and generate the artifacts using the .mwe2 file. Then, in the src-gen folder, you will find the ANTLR grammar generated from your Xtext grammar.
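If you want to locate the generated grammar programmatically rather than by browsing, a sketch along these lines might help. It assumes the generated grammar has a .g extension and sits somewhere below the directory you pass in; GeneratedGrammarFinder is a hypothetical helper, not part of Xtext.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists ANTLR grammar files below a src-gen directory.
final class GeneratedGrammarFinder {
    // Returns every regular file ending in ".g" below srcGen, sorted by path.
    static List<Path> findGrammars(Path srcGen) throws IOException {
        // Files.walk keeps directory handles open, so close the stream when done.
        try (Stream<Path> paths = Files.walk(srcGen)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".g"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Usage would be something like GeneratedGrammarFinder.findGrammars(Path.of("src-gen")), run from the project root.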
If you want to generate code from your Xtext grammar, you can use Xtend. It already provides everything you need. See: https://eclipse.org/Xtext/documentation/207_template.html.
Otherwise, if you already have an ANTLR grammar and a generator, you will need to (re)write it in Xtext.
For example:
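import java.io.IOException;
import java.io.InputStream;
import org.eclipse.core.commands.AbstractHandler;
import org.eclipse.core.commands.ExecutionEvent;
import org.eclipse.core.commands.ExecutionException;
import org.eclipse.core.resources.IFile;
import org.eclipse.core.runtime.CoreException;
import org.eclipse.jface.viewers.ISelection;
import org.eclipse.jface.viewers.IStructuredSelection;
import org.eclipse.ui.handlers.HandlerUtil;

public class CustomGenerator extends AbstractHandler {
    @Override
    public Object execute(ExecutionEvent event) throws ExecutionException {
        ISelection selection = HandlerUtil.getCurrentSelection(event);
        // Handle a selection made in the Project Explorer.
        if (selection instanceof IStructuredSelection) {
            Object element = ((IStructuredSelection) selection).getFirstElement();
            // Only proceed if the selected element is a workspace file.
            if (element instanceof IFile) {
                IFile file = (IFile) element;
                // getContents() throws CoreException; close the stream when done.
                try (InputStream contentOfYourFile = file.getContents()) {
                    // do your work here, e.g. pass the stream to your parser
                } catch (CoreException | IOException e) {
                    throw new ExecutionException("Could not read " + file.getName(), e);
                }
            }
        }
        return null;
    }
}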

Convert simple Antlr grammar to Xtext

I want to convert a very simple Antlr grammar to Xtext, so no syntactic predicates, no fancy features of Antlr not provided by Xtext. Consider this grammar
grammar simple; // Antlr3
foo: number+;
number: NUMBER;
NUMBER: '0'..'9'+;
and its Xtext counterpart
grammar Simple; // Xtext
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate Simple "http://www.example.org/Simple"
Foo: dummy=Number+;
Number: NUMBER_TOKEN;
terminal NUMBER_TOKEN: '0'..'9'+;
Xtext uses Antlr behind the scenes, but the two formats are not exactly the same. There are quite a few annoying (and partly understandable) things I have to modify, including:
Prefix terminals with the terminal keyword
Include import "http://www.eclipse.org/emf/2002/Ecore" as ecore to make terminals work
Add a feature to the top-level rule, e.g. foo: dummy=number+
Keep in mind that rule and terminal names have to be unique even when compared case-insensitively.
Optionally, capitalize the first letter of rule names to follow Java convention.
Is there a tool to make this conversion automatically at least for simple cases? If not, is there a more complete checklist of such required modifications?
It's basically not possible to do this conversion automatically, since the Antlr grammar lacks information that the Xtext grammar requires. The rule names in Xtext are used to create classes. Assignments in Xtext become getters and setters in those classes. However, assignments should not be used for every rule call, since there are special patterns in Xtext that reduce the noise in the resulting AST. Things like that make it hardly possible to do this transformation automatically. However, it's usually straightforward to copy the Antlr grammar into the Xtext editor and fix the issues manually.
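As a rough illustration of why the assignment names matter, consider the rule Foo: dummy=Number+; from the question. The sketch below is a simplified plain-Java stand-in for the kind of type derived from it. The real generated code is EMF-based and differs in detail, and FooSketch and FooSketchImpl are hypothetical names used only here.

```java
// Simplified stand-in for the type Xtext would derive from
//   Foo: dummy=Number+;
// Assumption: real generated code is EMF-based; this only mirrors its shape.
interface FooSketch {
    String getDummy();            // accessor named after the assignment "dummy="

    void setDummy(String value);  // matching mutator
}

// Trivial implementation so the sketch can be exercised.
final class FooSketchImpl implements FooSketch {
    private String dummy;

    @Override
    public String getDummy() {
        return dummy;
    }

    @Override
    public void setDummy(String value) {
        this.dummy = value;
    }
}
```

The Antlr rule foo: number+; carries no feature name at all, so a converter would have nothing to derive getDummy from. That is the kind of missing information described above.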

Is there a way to use one ANTLR grammar that targets multiple languages?

I am developing a language service in Visual Studio using an ANTLR grammar for a custom language. However, the grammar is filled with C++ code to handle preprocessor directives and for efficiency reasons for the compiler.
Language services for Visual Studio are a pain to write in C++, so I need a C# parser for the same language. That means I have to set language=CSharp2 and strip all the C++ code from the grammar.
I am thinking of writing a little exporter that strips away all the C++ code from the grammar, and converts simple statements like { $channel = HIDDEN; } to { $channel = TokenChannels.Hidden; }.
Is there a more clever method to do this? Like through templates, or little tricks to embed both languages in the grammar?
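For reference, the exporter idea above could start out as plain literal substitution, sketched here in Java. ActionRewriter is a hypothetical helper that knows nothing about ANTLR syntax, so it would also rewrite matches inside comments or string literals, and it does not strip arbitrary C++ code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Naive sketch of the proposed exporter: literal text substitution only.
// It does not parse the grammar, so it cannot tell actions from other text.
final class ActionRewriter {
    // Insertion order is kept so rewrites apply in the order they were added.
    private final Map<String, String> rewrites = new LinkedHashMap<>();

    ActionRewriter add(String from, String to) {
        rewrites.put(from, to);
        return this;
    }

    // Applies every registered literal replacement to the grammar text.
    String rewrite(String grammarText) {
        String result = grammarText;
        for (Map.Entry<String, String> e : rewrites.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

For example, new ActionRewriter().add("$channel = HIDDEN;", "$channel = TokenChannels.Hidden;") would handle the statement mentioned above.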
I'd break the problem up into two phases using an AST. Write your parser as a target-language-neutral grammar (one that produces an AST) and use the -target option to the Antlr Tool to generate the parser in the target language of your choice (C++, C#, Java, etc).
Then implement AST walkers in the target language with your actions. The benefit of this is that once you get one AST walker finished you can copy it and just change the actions for another target language.
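A minimal Java sketch of the second phase, assuming the parser has already produced a tree. AstNode and EvalWalker are hypothetical stand-ins, not ANTLR's own tree classes; the point is only that the target-specific behavior lives in the walker rather than in grammar actions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, target-neutral tree shape (not ANTLR's own tree classes).
final class AstNode {
    final String type;   // e.g. "PLUS", "INT"
    final String text;   // token text, e.g. "+", "3"
    final List<AstNode> children = new ArrayList<>();

    AstNode(String type, String text, AstNode... kids) {
        this.type = type;
        this.text = text;
        for (AstNode k : kids) {
            children.add(k);
        }
    }
}

// All language-specific actions live here instead of in the grammar.
final class EvalWalker {
    int walk(AstNode n) {
        switch (n.type) {
            case "INT":
                return Integer.parseInt(n.text);
            case "PLUS":
                return walk(n.children.get(0)) + walk(n.children.get(1));
            default:
                throw new IllegalArgumentException("unknown node type: " + n.type);
        }
    }
}
```

A C# or C++ walker would mirror EvalWalker's structure with its own actions, while the grammar itself stays free of embedded code.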