My grammar is given by:
Model:
'module' (mn=ID)?
(func+=Function)+
'end_module'
;
Function:
'function' name=ID '('')'
(vars+=ID)*
'end_function'
;
I can find tokens like 'function', '(' etc.
How can I force a new line after the token 'module' if the optional mn does not exist, and after mn if it does exist?
How can I indent the block between 'module' and 'end_module' as well as between 'function' and 'end_function'?
The formatting I am looking for:
module test
    function fdf ()
        str1
        str2
    end_function
    function ff ()
    end_function
end_module
So far I generate the formatter stub by using:
formatter = {
generateStub = true
}
That should be rather straightforward, e.g.:
@Inject extension MyDslGrammarAccess

def dispatch void format(Model model, extension IFormattableDocument document) {
    model.regionFor.keyword(modelAccess.end_moduleKeyword_3).prepend[newLine]
    if (model.mn !== null) {
        model.regionFor.feature(MyDslPackage.Literals.MODEL__MN).append[newLine]
        interior(
            model.regionFor.feature(MyDslPackage.Literals.MODEL__MN),
            model.regionFor.keyword(modelAccess.end_moduleKeyword_3)
        ) [indent]
    } else {
        model.regionFor.keyword(modelAccess.moduleKeyword_0).append[newLine]
        interior(
            model.regionFor.keyword(modelAccess.moduleKeyword_0),
            model.regionFor.keyword(modelAccess.end_moduleKeyword_3)
        ) [indent]
    }
    for (Function func : model.getFunc()) {
        func.format;
    }
}

def dispatch void format(Function function, extension IFormattableDocument document) {
    function.regionFor.keyword(functionAccess.functionKeyword_0).append[newLine].prepend[newLine]
    function.regionFor.keyword(functionAccess.end_functionKeyword_5).prepend[newLine]
    interior(
        function.regionFor.keyword(functionAccess.functionKeyword_0),
        function.regionFor.keyword(functionAccess.end_functionKeyword_5)
    ) [indent]
}
As proposed in Max's Answer, it is possible to build whitespace-aware languages starting with Xtext 2.8. Check it out! Note that the synthetic BEGIN/END terminals below are never matched from the input text: they have to be emitted by a custom token source that translates indentation changes into tokens, as demonstrated by the home automation example shipped with Xtext.
In your case, I guess you should define your grammar as follows:
Model:
'module' (mn=ID)?
BEGIN
(func+=Function)+
END
'end_module'
;
Function:
'function' name=ID '('')'
BEGIN
(vars+=ID)*
END
'end_function'
;
terminal BEGIN: 'synthetic:BEGIN';
terminal END: 'synthetic:END';
In case you would also want to allow 'empty-bodied' functions, I guess you should change the rule above as follows:
Function:
'function' name=ID '('')'
(BEGIN
(vars+=ID)*
END)?
'end_function'
;
Hope it helps!
How to print an object in NQP? (For debugging purposes)
It is easy in Raku:
say, which calls .gist on its argument under the hood
dd, the tiny Data Dumper, as shown in this post
class Toto { has $.member = 42; }
class Titi { has $.member = 41; has $.toto = Toto.new }
my $ti = Titi.new;
say $ti;
# Titi.new(member => 41, toto => Toto.new(member => 42))
dd $ti;
# Titi $ti = Titi.new(member => 41, toto => Toto.new(member => 42))
It seems more complicated in NQP
class Toto { has $!member; sub create() {$!member := 42}};
class Titi { has $!member; has $!toto; sub create() {$!member := 41; $!toto := Toto.new; $!toto.create; }}
my $ti := Titi.new;
say($ti);
Cannot stringify this object of type P6opaque (Titi)
Of course, there is no .gist method; the code ends up calling nqp::encode, which expects a string.
Reducing the problem to an MRE:
class foo {}
say(foo.new); # Cannot stringify ...
Simplifying the solution:
class foo { method Str () { 'foo' } }
say(foo.new); # foo
In summary, add a Str method.
This sounds simple but there's a whole lot of behind-the-scenes stuff to consider/explain.
nqp vs raku
The above solution is the same technique raku uses; when a value is expected by a routine/operation to be a string, but isn't, the language behavior is to attempt to coerce to a string. Specifically, see if there's a Str method that can be called on the value, and if so, call it.
In this case NQP's NQPMu, which is way more barebones than raku's Mu, doesn't provide any default Str method. So a solution is to manually add one.
More generally, NQP is a pretty hostile language unless you know raku fairly well and have gone through A course on Rakudo and NQP internals.
And once you're up to speed on the material in that course, I recommend you consider the IRC channels #raku-dev and/or #moarvm as your first port of call rather than SO (unless your goal is specifically to increase SO coverage of nqp/moarvm).
Debugging the compiler code
As you will have seen, the NQP code you linked calls .say on a filehandle.
That then calls this method.
That method's body is $str ~ "\n". That code will attempt to coerce $str to a string (just as it would in raku). That's what'll be generating the "Cannot stringify" error.
A search for "Cannot stringify" in the NQP repo only matched some Java code. And I bet you're not running Rakudo on the JVM. That means the error message must be coming from MoarVM.
The same search in the MoarVM repo yields this line in coerce.c in MoarVM.
Looking backwards in the routine containing that line we see this bit:
/* Check if there is a Str method. */
MVMROOT(tc, obj, {
strmeth = MVM_6model_find_method_cache_only(tc, obj,
tc->instance->str_consts.Str);
});
This shows the backend, written in C, looking for and invoking a "method" called Str. (It's relying on an internal API (6model) that all three layers of the compiler (raku, nqp, and backends) adhere to.)
Customizing the Str method
You'll need to customize the Str method as appropriate. For example, to print the class's name if it's a type object, and the value of its $!bar attribute otherwise:
class foo {
has $!bar;
method Str () { self ?? nqp::coerce_is($!bar) !! self.HOW.name(self) }
}
say(foo.new(bar=>42)); # 42
Despite the method name, the nqp say routine is not expecting a raku Str but rather an nqp native string (which ends up being a MoarVM native string on the MoarVM backend). Hence the need for nqp::coerce_is (which I found by browsing the nqp ops doc).
self.HOW.name(self) is another example of the way nqp just doesn't have the niceties that raku has. You could write the same code in raku but the idiomatic way to write it in raku is self.^name.
Currently, what I have is a list and hash discriminator. It does not work on objects.
sub print_something ($value, :$indent = 0, :$no-indent=0) {
if nqp::ishash($value) {
print_hash($value, :$indent);
} elsif nqp::islist($value) {
print_array($value, :$indent);
} else {
if $no-indent {
say($value);
} else {
say_indent($indent, $value);
}
}
}
Where
sub print_indent ($int, $string) {
my $res := '';
my $i := 0;
while $i < $int {
$res := $res ~ ' ';
$i := $i + 1;
}
$res := $res ~ $string;
print($res);
}
sub print_array (@array, :$indent = 0) {
my $iter := nqp::iterator(@array);
say_indent($indent, '[');
while $iter {
print_value(nqp::shift($iter), :indent($indent+1));
}
say_indent($indent, ']');
}
sub print_hash (%hash, :$indent = 0) {
my $iter := nqp::iterator(%hash);
say_indent($indent, '{');
while $iter {
my $pair := nqp::shift($iter);
my $key := nqp::iterkey_s($pair);
my $value := nqp::iterval($pair);
print_indent($indent + 1, $key ~ ' => ');
print_value($value, :indent($indent+1), :no-indent(1));
}
say_indent($indent, '}');
}
I am testing the idea of making my DSL JVM-compatible, and I wanted to test the possibility of extending Xbase and using the interpreter. I have tried to make a minimal test project to use with the interpreter, but I am getting a runtime error. I think I understand the general concepts of adapting Xbase, but I am unsure about the setup/entry points for the interpreter, and I could not find any information about the error I am getting or how to resolve it. Here are the relevant files for my situation:
Text.xtext:
import "http://www.eclipse.org/xtext/xbase/Xbase" as xbase
import "http://www.eclipse.org/xtext/common/JavaVMTypes" as types
Program returns Program:
{Program}
'program' name=ID '{'
variables=Var_Section?
run=XExpression?
'}'
;
Var_Section returns VarSection:
{VarSection}
'variables' '{'
decls+=XVariableDeclaration+
'}'
;
@Override // Change syntax
XVariableDeclaration returns xbase::XVariableDeclaration:
type=JvmTypeReference name=ID '=' right=XLiteral ';'
;
@Override // Do not allow declarations outside of variable region
XExpressionOrVarDeclaration returns xbase::XExpression:
XExpression;
TestJvmModelInferrer:
def dispatch void infer(Program element, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
acceptor.accept(element.toClass(element.fullyQualifiedName)) [
documentation = element.documentation
if (element.variables !== null) {
for (decl : element.variables.decls) {
members += decl.toField(decl.name, decl.type) [
static = true
initializer = decl.right
visibility = JvmVisibility.PUBLIC
]
}
}
if (element.run !== null) {
members += element.run.toMethod('main', typeRef(Void::TYPE)) [
parameters += element.run.toParameter("args", typeRef(String).addArrayTypeDimension)
visibility = JvmVisibility.PUBLIC
static = true
body = element.run
]
}
]
}
Test case:
@Inject ParseHelper<Program> parseHelper
@Inject extension ValidationTestHelper
@Inject XbaseInterpreter interpreter
@Test
def void basicInterpret() {
val result = parseHelper.parse('''
program program1 {
variables {
int var1 = 0;
double var2 = 3.4;
}
var1 = 13
}
''')
result.assertNoErrors
var interpretResult = interpreter.evaluate(result.run)
println(interpretResult.result)
}
Partial stack trace:
java.lang.IllegalStateException: Could not access field: program1.var1 on instance: null
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._assignValueTo(XbaseInterpreter.java:1262)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.assignValueTo(XbaseInterpreter.java:1221)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:1213)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:216)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:190)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:180)
The interpreter only supports expressions; it does not work with types that are created by a JvmModelInferrer. Your code tries to access fields of such an inferred type.
Rather than using the interpreter, I'd recommend using an in-memory compiler in your test. The domainmodel example may serve as an inspiration: https://github.com/eclipse/xtext-eclipse/blob/c2b15c3ec118c4c200e2b28ea72d8c9116fb6800/org.eclipse.xtext.xtext.ui.examples/projects/domainmodel/org.eclipse.xtext.example.domainmodel.tests/xtend-gen/org/eclipse/xtext/example/domainmodel/tests/XbaseIntegrationTest.java
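For example, here is a rough sketch of such a test in Java. This is only an illustration: it assumes Xtext's CompilationTestHelper (package names differ between Xtext releases) with its Result.getCompiledClass() method, and MyDslInjectorProvider is a placeholder for your language's generated injector provider.
import org.eclipse.xtext.testing.InjectWith;
import org.eclipse.xtext.testing.XtextRunner;
import org.eclipse.xtext.xbase.testing.CompilationTestHelper;
import org.junit.Test;
import org.junit.runner.RunWith;
import com.google.inject.Inject;
@RunWith(XtextRunner.class)
@InjectWith(MyDslInjectorProvider.class) // placeholder for your DSL's injector provider
public class CompileAndRunTest {
    @Inject
    private CompilationTestHelper compilationTestHelper;
    @Test
    public void compileAndRun() throws Exception {
        String source =
            "program program1 {\n"
            + "  variables { int var1 = 0; }\n"
            + "  var1 = 13\n"
            + "}";
        // Compile the DSL source in memory, then run the inferred static main method.
        compilationTestHelper.compile(source, result -> {
            try {
                Class<?> clazz = result.getCompiledClass();
                clazz.getMethod("main", String[].class).invoke(null, (Object) new String[0]);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
    }
}
This avoids the interpreter entirely: the inferred type is compiled to real Java bytecode, so its static fields and main method behave like ordinary Java.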
You may find this project interesting, which (among other things) implements an interpreter for Xtend based on the Xbase interpreter. It might be a bit outdated, though, and it will not fully support all Xtend concepts. But it could be a starting point, and your contributions are welcome :-)
https://github.com/kbirken/xtendency
I have some classes (and will need quite a few more) that look like this:
use Unit;
class Unit::Units::Ampere is Unit
{
method TWEAK { with self {
.si = True;
# m· kg· s· A ·K· mol· cd
.si-signature = [ 0, 0, 0, 1, 0, 0, 0 ];
.singular-name = "ampere";
.plural-name = "ampere";
.symbol = "A";
}}
sub postfix:<A> ($value) returns Unit::Units::Ampere is looser(&prefix:<->) is export(:short) {
return Unit::Units::Ampere.new( :$value );
};
sub postfix:<ampere> ($value) returns Unit::Units::Ampere is looser(&prefix:<->) is export(:long) {
$value\A;
};
}
I would like to be able to construct and export the custom operators dynamically at runtime. I know how to work with EXPORT, but how do I create a postfix operator on the fly?
I ended up basically doing this:
sub EXPORT
{
return %(
"postfix:<A>" => sub is looser(&prefix:<->) {
#do something
}
);
}
which is disturbingly simple.
For the first question, you can create dynamic subs by returning a sub from another sub. To accept only an Ampere parameter (where "Ampere" is chosen programmatically), use a type capture in the function signature:
sub make-combiner(Any:U ::Type $, &combine-logic) {
return sub (Type $a, Type $b) {
return combine-logic($a, $b);
}
}
my &int-adder = make-combiner Int, {$^a + $^b};
say int-adder(1, 2);
my &list-adder = make-combiner List, {(|$^a, |$^b)};
say list-adder(<a b>, <c d>);
say list-adder(1, <c d>); # Constraint type check fails
Note that when I defined the inner sub, I had to put a space after the sub keyword, lest the compiler think I'm calling a function named "sub". (See the end of my answer for another way to do this.)
Now, on to the hard part: how to export one of these generated functions? The documentation for what is export really does is here: https://docs.perl6.org/language/modules.html#is_export
Halfway down the page, they have an example of adding a function to the symbol table without being able to write is export at compile time. To get the above working, it needs to be in a separate file. To see an example of a programmatically determined name and programmatically determined logic, create the following MyModule.pm6:
unit module MyModule;
sub make-combiner(Any:U ::Type $, &combine-logic) {
anon sub combiner(Type $a, Type $b) {
return combine-logic($a, $b);
}
}
my Str $name = 'int';
my $type = Int;
my package EXPORT::DEFAULT {
OUR::{"&{$name}-eater"} := make-combiner $type, {$^a + $^b};
}
Invoke Perl 6:
perl6 -I. -MMyModule -e "say int-eater(4, 3);"
As hoped, the output is 7. Note that in this version, I used anon sub, which lets you name the "anonymous" generated function. I understand this is mainly useful for generating better stack traces.
All that said, I'm having trouble dynamically setting a postfix operator's precedence. I think you need to modify the Precedence role of the operator, or create it yourself instead of letting the compiler create it for you. This isn't documented.
In short: how do I implement dynamic variables in ANTLR?
I come to you again with a basic ANTLR question.
I have this grammar:
grammar Amethyst;
options {
language = Java;
}
@header {
package org.omer.amethyst.generated;
import java.util.HashMap;
}
@lexer::header {
package org.omer.amethyst.generated;
}
@members {
HashMap memory = new HashMap();
}
begin: expr;
expr: (defun | println)*
;
println:
'println' atom {System.out.println($atom.value);}
;
defun:
'defun' VAR INT {memory.put($VAR.text, Integer.parseInt($INT.text));}
| 'defun' VAR STRING_LITERAL {memory.put($VAR.text, $STRING_LITERAL.text);}
;
atom returns [Object value]:
INT {$value = Integer.parseInt($INT.text);}
| ID
{
Object v = memory.get($ID.text);
if (v != null) $value = v;
else System.err.println("undefined variable " + $ID.text);
}
| STRING_LITERAL
{
String v = (String) memory.get($STRING_LITERAL.text);
if (v != null) $value = String.valueOf(v);
else System.err.println("undefined variable " + $STRING_LITERAL.text);
}
;
INT: '0'..'9'+ ;
STRING_LITERAL: '"' .* '"';
VAR: ('a'..'z'|'A'..'Z')('a'..'z'|'A'..'Z'|'0'..'9')* ;
ID: ('a'..'z'|'A'..'Z'|'0'..'9')+ ;
LETTER: ('a..z'|'A'..'Z')+ ;
WS: (' '|'\t'|'\n'|'\r')+ {skip();} ;
What it does (or should do), so far, is have a built-in "println" function to do exactly what you think it does, and a "defun" rule to define variables.
When "defun" is called on either a string or integer, the value is put into the "memory" HashMap with the first parameter being the variable's name and the second being its value.
When println is called on an atom, it should display the atom's value. The atom can be either a string or integer. It gets its value from memory and returns it. So for example:
defun greeting "Hello world!"
println greeting
But when I run this code, I get this error:
line 3:8 no viable alternative at input 'greeting'
null
NOTE: This output comes when I do:
println "greeting"
Output:
undefined variable "greeting"null
Does anyone know why this is so? Sorry if I'm not being clear, I don't understand most of this.
defun greeting "Hello world!"
println greeting
But when I run this code, I get this error:
line 3:8 no viable alternative at input 'greeting'
Because the input "greeting" is being tokenized as a VAR, and a VAR is not an atom. So the input defun greeting "Hello world!" is properly matched by the 2nd alternative of the defun rule:
defun
: 'defun' VAR INT // 1st alternative
| 'defun' VAR STRING_LITERAL // 2nd alternative
;
but the input println greeting cannot be matched by the println rule:
println
: 'println' atom
;
You must realize that the lexer does not produce tokens based on what the parser is trying to match at a particular time. The input greeting will always be tokenized as a VAR, never as an ID.
What you need to do is remove the ID rule from the lexer and replace ID with VAR inside your parser rules (updating the embedded actions to use $VAR.text instead of $ID.text); see the driver sketch below.
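Once that change is in place, a small Java driver along the following lines should run the example. This is just a sketch: it assumes the class names ANTLR v3 generates for this grammar (AmethystLexer, AmethystParser) and uses begin as the entry rule.
import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;
public class AmethystDemo {
    public static void main(String[] args) throws RecognitionException {
        String input = "defun greeting \"Hello world!\"\n"
                     + "println greeting\n";
        // Lex and parse the sample program, starting at the 'begin' rule.
        AmethystLexer lexer = new AmethystLexer(new ANTLRStringStream(input));
        AmethystParser parser = new AmethystParser(new CommonTokenStream(lexer));
        parser.begin();
    }
}
Note that the defun action stores $STRING_LITERAL.text verbatim, so println prints the value including its surrounding quotes; strip them in the action if you want the bare text.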
I am writing an ANTLR grammar for translating one language to another, but the documentation on using the HIDDEN channel is very scarce. I cannot find an example anywhere. The only thing I have found is the FAQ on www.antlr.org, which tells you how to access the hidden channel but not how best to use this functionality. The target language is Java.
In my grammar file, I pass whitespace and comments through like so:
// Send runs of space and tab characters to the hidden channel.
WHITESPACE
: (SPACE | TAB)+ { $channel = HIDDEN; }
;
// Single-line comments begin with --
SINGLE_COMMENT
: ('--' COMMENT_CHARS NEWLINE) {
$channel=HIDDEN;
}
;
fragment COMMENT_CHARS
: ~('\r' | '\n')*
;
// Treat runs of newline characters as a single NEWLINE token.
NEWLINE
: ('\r'? '\n')+ { $channel = HIDDEN; }
;
In my members section I have defined a method for writing hidden channel tokens to my output StringStream...
@members {
private int savedIndex = 0;
void ProcessHiddenChannel(TokenStream input) {
List<Token> tokens = ((CommonTokenStream)input).getTokens(savedIndex, input.index());
for(Token token: tokens) {
if(token.getChannel() == token.HIDDEN_CHANNEL) {
output.append(token.getText());
}
}
savedIndex = input.index();
}
}
Now to use this, I have to call the method after every single token in my grammar.
myParserRule
: MYTOKEN1 { ProcessHiddenChannel(input); }
MYTOKEN2 { ProcessHiddenChannel(input); }
;
Surely there must be a better way?
EDIT: This is an example of the input language:
-- -----------------------------------------------------------------
--
--
-- Name Description
-- ==================================
-- IFM1/183 Freq Spectrum Inversion
--
-- -----------------------------------------------------------------
PROCEDURE IFM1/183
TITLE "Freq Spectrum Inversion";
HELP
Freq Spectrum Inversion
ENDHELP;
PRIVILEGE CTRL;
WINDOW MANDATORY;
INPUT
$Input : #NO_YES
DEFAULT select %YES when /IFMS1/183.VALUE = %NO;
%NO otherwise
endselect
PROMPT "Spec Inv";
$Forced_Cmd : BOOLEAN
Default FALSE
Prompt "Forced Commanding";
DEFINE
&RetCode : #PSTATUS := %OK;
&msg : STRING;
&Input : BOOLEAN;
REQUIRE AVAILABLE(/IFMS1)
MSG "IFMS1 not available";
REQUIRE /IFMS1/001.VALUE = %MON_AND_CTRL
MSG "IFMS1 not in control mode";
BEGIN -- Procedure Body --
&msg := "IFMS1/183 -> " + toString($Input) + " : ";
-- pre-check
IF /IFMS1/183.VALUE = $Input
AND $Forced_Cmd = FALSE THEN
EXIT (%OK, MSG &msg + "already set");
ENDIF;
-- command
IF $Input = %YES THEN &Input:= TRUE;
ELSE &Input:= FALSE;
ENDIF;
SET &RetCode := SEND IFMS1.FREQPLAN
( $FreqSpecInv := &Input);
IF &RetCode <> %OK THEN
EXIT (&RetCode, MSG &msg + "command failed");
ENDIF;
-- verify
SET &RetCode := VERIFY /IFMS1/183.VALUE = $Input TIMEOUT '10';
IF &RetCode <> %OK THEN
EXIT (&RetCode, MSG &msg + "verification failed");
ELSE
EXIT (&RetCode, MSG &msg + "verified");
ENDIF;
END
Look into subclassing CommonTokenStream and feeding an instance of your subclass into ANTLR. From the code example you give, I suspect you might also be interested in taking a look at the filter and rewrite options available in version 3.
Also, take a look at this other related Stack Overflow question.
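As a rough illustration of the CommonTokenStream idea, here is a sketch in Java. The class name, the StringBuilder field and the flush point are all placeholders; depending on where your rule actions write their translated output, you may need to flush the hidden text somewhere else.
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.Token;
import org.antlr.runtime.TokenSource;
// Every time the parser consumes a default-channel token, first copy any
// hidden-channel tokens (whitespace, comments) seen since the previous
// consume into the output buffer.
public class HiddenChannelEchoStream extends CommonTokenStream {
    private final StringBuilder output;
    private int lastEmitted = 0; // index of the next token to inspect
    public HiddenChannelEchoStream(TokenSource source, StringBuilder output) {
        super(source);
        this.output = output;
    }
    @Override
    public void consume() {
        int stop = index(); // token the parser is about to consume
        for (int i = lastEmitted; i <= stop; i++) {
            Token t = get(i);
            if (t.getChannel() == Token.HIDDEN_CHANNEL) {
                output.append(t.getText());
            }
        }
        lastEmitted = stop + 1;
        super.consume();
    }
}
Feed an instance of this class to your parser instead of a plain CommonTokenStream and drop the per-token ProcessHiddenChannel calls. Hidden tokens trailing the last default-channel token are not covered by this sketch and would need a final flush after parsing.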
I have just been going through some of my old questions and thought it was worth responding with the final solution that worked the best. In the end, the best way to translate a language was to use StringTemplate. This takes care of re-indenting the output for you. There is a very good example called 'cminus' in the ANTLR example pack that shows how to use it.