How to use other grammars in your grammar?

I would like to use Liquid grammar inside a specific block in my grammar:
https://github.com/Shopify/liquid-tm-grammar
but I can't figure out how to add it, or whether it's even possible.

Related

Is there a way to generate builder using the antlr4 grammar?

I understand that one can generate a lexer and parser from an antlr4 grammar, but is there a way to generate a builder from the same grammar? That way the client can use the builder to construct any structure the grammar allows, while the server uses the generated parser to parse that structure.
There is, yes. Such a sentence generator can walk the ATN and create sentences according to the grammar (see my antlr4-vscode extension for how this can be implemented). However, unless you have a very simple grammar with no recursion or iteration, you will probably not be able to generate a fixed set of sentences, since there are infinitely many possible combinations.
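As a rough illustration of the point about recursion, here is a minimal Python sketch of a bounded sentence generator. It does not walk ANTLR's ATN the way the answer describes; as a simplification it uses a hand-written toy grammar stored in a dict. The names GRAMMAR, generate and max_depth are hypothetical and exist only for this example.

```python
import random

# Hypothetical toy grammar (not read from any ANTLR file): each nonterminal
# maps to a list of alternatives, each alternative is a list of symbols.
# Any symbol that is not a key of the dict is treated as a terminal.
# By construction, the FIRST alternative of every rule is non-recursive.
GRAMMAR = {
    "expr": [["term"], ["term", "+", "expr"]],
    "term": [["NUMBER"], ["(", "expr", ")"]],
}

def generate(symbol, rng, depth=0, max_depth=4):
    """Return one random sentence (a list of terminals) derived from symbol."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    # Past the depth bound, force the non-recursive first alternative so the
    # derivation is guaranteed to terminate.
    if depth >= max_depth:
        chosen = alternatives[0]
    else:
        chosen = rng.choice(alternatives)
    sentence = []
    for part in chosen:
        sentence.extend(generate(part, rng, depth + 1, max_depth))
    return sentence

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(" ".join(generate("expr", rng)))
```

Raising max_depth lets the recursive alternatives fire at deeper levels, so the set of reachable sentences keeps growing. That is why a grammar with recursion has no fixed, finite list of sentences to enumerate, and a generator has to impose some bound of its own.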

antlr - generate grammar from java source code

I am wondering if I can generate an ANTLR grammar from Java source code. I want to do some kind of research project, but for now I am just exploring different open-source tools to see which one is best.
For ANTLR, do I always have to write a grammar and pass it to ANTLR?
Is there a way to generate a grammar from existing Java source code?
Not easily. ANTLR generates a recursive descent parser from your grammar, encoding the tests into procedural code, along with lots of other bookkeeping.
Knowing how the code is generated, you might be able to take it apart, but you'll have to reach into the middle of generated statements, and that isn't easy without a full parser for the generated language. (Hint: regex won't work.)
I don't see much point in this exercise. Why don't you just use the original grammar?

Generate only a Lexer from Antlr

I am attempting to use Antlr to tokenize and classify the tokens of an input stream. Does anyone know of a way to generate only a Lexer from Antlr using a grammar with only Lexer rules?
You can specify the type of grammar you want using the grammar title line.
grammar MyGrammar;
for combined grammars.
lexer grammar MyLexer;
for a lexer grammar (etc.). Of course in a pure lexer grammar you may only use lexer rules.
You can basically generate a parser and extend the listener class, then inside every exitMethod() push the tokens onto a stack.
You can't generate only a lexer. If you are not familiar with ANTLR 4 grammars or the steps required to generate a parser, I advise you to spend 10 minutes reading the book "The Definitive ANTLR 4 Reference".

Pretty print ANTLR grammar

Some tools output an Antlr grammar in a human-unreadable form, or at least with ugly placement of parens and indentation. I'd like to transform the grammar into a more readable (standard?) form. The only reference I found is ANTLR pretty printer, which is quite old, and looking at its source, it seems to remove parts of a grammar rather than pretty-print it.
How can I format/pretty print a grammar file?
I know of no tool that does this. The one you mentioned, prettyPrinter, is written in - and seems to handle only - ANTLR v2.x grammars, making it unsuitable for v3 grammars.
If you're going to write your own, I'd recommend using the grammar of ANTLR v3 itself to parse a .g grammar file and emit it in a readable form. Terence Parr has posted the grammar here: http://www.antlr.org/grammar/ANTLR
I just installed an Antlr plugin for Eclipse. It can do a lot more than syntax highlighting and code formatting...

Convert simple Antlr grammar to Xtext

I want to convert a very simple Antlr grammar to Xtext, so no syntactic predicates, no fancy features of Antlr not provided by Xtext. Consider this grammar
grammar simple; // Antlr3
foo: number+;
number: NUMBER;
NUMBER: '0'..'9'+;
and its Xtext counterpart
grammar Simple; // Xtext
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate Simple "http://www.example.org/Simple"
Foo: dummy=Number+;
Number: NUMBER_TOKEN;
terminal NUMBER_TOKEN: '0'..'9'+;
Xtext uses Antlr behind the scenes, but the two formats are not exactly the same. There are quite a few annoying (and partly understandable) things I have to modify, including:
Prefix terminals with the terminal keyword
Include import "http://www.eclipse.org/emf/2002/Ecore" as ecore to make terminals work
Add a feature to the top-level rule, e.g. foo: dummy=number+
Keep in mind that rule and terminal names have to be unique even when compared case-insensitively.
Optionally, capitalize the first letter of rule names to follow Java convention.
Is there a tool to make this conversion automatically at least for simple cases? If not, is there a more complete checklist of such required modifications?
It's basically not possible to do this conversion automatically, since the Antlr grammar lacks information that the Xtext grammar requires. Rule names in Xtext are used to create classes, and assignments in Xtext become getters and setters in those classes. However, assignments should not be used for every rule call, since there are special patterns in Xtext that reduce the noise in the resulting AST. Details like these make an automatic transformation barely feasible. It is, however, usually straightforward to copy the Antlr grammar into the Xtext editor and fix the issues manually.
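For what it's worth, the purely mechanical items from the question's checklist can be scripted for trivial grammars. The Python sketch below is only that: it assumes one rule per line, no actions or predicates, and "//" appearing only as a comment marker. The function name convert, the ns_uri parameter, the "_TOKEN" suffix rule and the placeholder feature name dummy are all assumptions made for this example (the last two are borrowed from the question's own sample). It does nothing about the modelling decisions described above; the dummy= assignment is exactly the kind of thing that still has to be fixed by hand.

```python
import re

# One rule per line: "name: body;" (assumption, see lead-in).
RULE = re.compile(r"^\s*(\w+)\s*:\s*(.*?);\s*$")
# Split a rule body into quoted literals, identifiers, whitespace, and
# single punctuation characters, so literals are never renamed.
TOKEN = re.compile(r"'(?:\\.|[^'\\])*'|\w+|\s+|.")

def convert(antlr_source, ns_uri="http://www.example.org/Simple"):
    """Apply the mechanical checklist items to a very simple ANTLR grammar."""
    # Naive comment stripping: assumes "//" never occurs inside a literal.
    lines = [l.split("//")[0].strip() for l in antlr_source.splitlines()]
    lines = [l for l in lines if l]
    name = re.match(r"grammar\s+(\w+)\s*;", lines[0]).group(1)
    name = name[0].upper() + name[1:]
    rules = [RULE.match(l).groups() for l in lines[1:]]

    parser_lower = {n.lower() for n, _ in rules if n[0].islower()}
    rename = {}
    for n, _ in rules:
        if n[0].islower():
            rename[n] = n[0].upper() + n[1:]   # capitalize parser rules
        elif n.lower() in parser_lower:
            rename[n] = n + "_TOKEN"           # avoid case-insensitive clash
        else:
            rename[n] = n

    out = [
        f"grammar {name};",
        'import "http://www.eclipse.org/emf/2002/Ecore" as ecore',
        f'generate {name} "{ns_uri}"',
    ]
    seen_top = False
    for n, body in rules:
        toks = TOKEN.findall(body)
        new = [rename.get(t, t) for t in toks]
        if n[0].islower() and not seen_top:
            # Add a placeholder feature to the first rule call of the top rule.
            idx = next((i for i, t in enumerate(toks) if t in rename), None)
            if idx is not None:
                new[idx] = "dummy=" + new[idx]
            seen_top = True
        prefix = "" if n[0].islower() else "terminal "
        out.append(f"{prefix}{rename[n]}: {''.join(new)};")
    return "\n".join(out)
```

Fed the four-line Antlr3 grammar from the question, this should reproduce the question's Xtext counterpart, minus the trailing comment on the first line. Anything beyond that (multi-line rules, fragments, sensible feature names) is outside what this sketch attempts.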