Xtext grammar : mismatched input '0' expecting RULE_INT - grammar

I'm new to Xtext and I'm trying to create a simple DSL for railway systems. Here's my grammar:
grammar org.xtext.railway.RailWay with org.eclipse.xtext.common.Terminals
generate railWay "http://www.xtext.org/railway/RailWay"
Model:
(trains+=Train)*
| (paths+=Path)*
| (sections+=Section)*
;
Train:
'Train' name=ID ':'
'Path' path=[Path]
'Speed' speed=INT
'end'
;
Path:
'Path' name=ID ':'
'Sections' ('{' sections+=[Section] (',' sections+=[Section] )+ '}' ) | sections+=[Section]
'end'
;
Section:
'Section' name=ID ':'
'Start' start=INT
'End' end=INT
('SpeedMax' speedMax=INT)?
'end'
;
But when I enter this code in the runtime Eclipse instance:
Section brestStBrieux :
Start 0
End 5
end
Section StBrieuxLeMan :
Start 5
End 10
end
Section leManParis :
Start 1
End 12
end
Path brestParis :
Sections { brestStBrieux, StBrieuxLeMan, leManParis}
end
Train tgv :
Path brestParis
Speed 23
end
I got this error three times:
mismatched input '0' expecting RULE_INT
mismatched input '1' expecting RULE_INT
mismatched input '5' expecting RULE_INT
I can't see where those errors come from. What can I do to fix them? Any ideas?

Christian is right: since the FLOAT terminal is no longer defined, the original problem is resolved. A remaining issue, however, is the rule
Path:
'Path' name=ID ':'
'Sections' ('{' sections+=[Section] (',' sections+=[Section] )+ '}' ) | sections+=[Section]
'end'
;
which, because alternation binds more loosely than sequence, is currently parsed with this precedence:
Path:
(
'Path' name=ID ':' 'Sections'
('{' sections+=[Section] (',' sections+=[Section] )+ '}' )
)
|
(sections+=[Section] 'end')
;
You may want to rewrite it to
Path:
'Path' name=ID ':'
'Sections'
(
('{' sections+=[Section] (',' sections+=[Section] )+ '}' )
| sections+=[Section]
) 'end'
;
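
The precedence issue can be illustrated with a regular-expression analogy, since alternation binds more loosely than sequence there as well. This is only a sketch in Python, not Xtext: the single letters are stand-ins chosen for illustration ('a' for the "'Path' name=ID ':' 'Sections'" prefix, 'b' for the braced list, 'c' for the single section reference, 'e' for the closing 'end').

```python
import re

# Analogy only: the letters stand in for parts of the Path rule (see lead-in).
ungrouped = re.compile(r"ab|ce")    # read as (ab)|(ce), like the original rule
grouped = re.compile(r"a(b|c)e")    # prefix and 'end' shared by both alternatives

def accepts(pattern, text):
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None

for sample in ["abe", "ace", "ab", "ce"]:
    print(sample, accepts(ungrouped, sample), accepts(grouped, sample))
```

Only the grouped pattern accepts inputs that have the prefix, one of the two alternatives, and the suffix.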

Lexing and parsing are separate steps, so it does not matter that no parser rule uses the FLOAT terminal: the lexer still applies it. Your grammar also becomes ambiguous (have a look at the warnings when generating the language). You should turn FLOAT into a datatype rule (simply omit the terminal keyword).
=> Change your grammar to
grammar org.xtext.example.mydsl2.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl2/MyDsl"
Model:
(trains+=Train)*
| (paths+=Path)*
| (sections+=Section)*
;
Train:
'Train' name=ID ':'
'Path' path=[Path]
'Speed' speed=INT
'end'
;
Path:
'Path' name=ID ':'
'Sections' ('{' sections+=[Section] (',' sections+=[Section] )+ '}' ) | sections+=[Section]
'end'
;
Section:
'Section' name=ID ':'
'Start' start=INT
'End' end=INT
('SpeedMax' speedMax=INT)?
'end'
;
FLOAT : '-'? INT ('.' INT)?;
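
Why a FLOAT terminal breaks INT can be sketched with a toy lexer in Python. This is not Xtext's generated lexer, only a model of the usual tie-break (longest match wins, then the earlier rule), and it assumes that the grammar's own FLOAT terminal is ordered ahead of the inherited INT. The function name tokenize and the rule tables are made up for the illustration, and keywords such as 'Start' are simply lexed as ID here.

```python
import re

def tokenize(text, rules):
    """Toy lexer: longest match wins; on a tie, the earlier rule wins."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # whitespace is dropped
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# FLOAT as a terminal, listed ahead of INT (assumed ordering).
with_float_terminal = [
    ("FLOAT", r"-?[0-9]+(\.[0-9]+)?"),
    ("INT", r"[0-9]+"),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("WS", r"\s+"),
]
# With FLOAT as a datatype rule, the lexer no longer knows about it.
without_float_terminal = [r for r in with_float_terminal if r[0] != "FLOAT"]

print(tokenize("Start 0", with_float_terminal))
print(tokenize("Start 0", without_float_terminal))
```

In the first table the digit is claimed by FLOAT before the parser ever sees it, which is consistent with the "expecting RULE_INT" message; in the second it comes out as INT.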

Related

How to fix extraneous input ' ' expecting, in antlr4

Hello, when running ANTLR4 with the following input I get the following error:
[image showing the problem]
I have been trying to fix it by making changes here and there, but it only seems to work if I write every component of whileLoop on a new line.
Could you please tell me what I am missing here and why the problem persists?
grammar AM;
COMMENTS :
'{'~[\n|\r]*'}' -> skip
;
body : ('BODY' ' '*) anything | 'BODY' 'BEGIN' anything* 'END' ;
anything : whileLoop | write ;
write : 'WRITE' '(' '"' sentance '"' ')' ;
read : 'READ' '(' '"' sentance '"' ')' ;
whileLoop : 'WHILE' expression 'DO' ;
block : 'BODY' anything 'END';
expression : 'TRUE'|'FALSE' ;
test : ID? {System.out.println("Done");};
logicalOperators : '<' | '>' | '<>' | '<=' | '>=' | '=' ;
numberExpressionS : (NUMBER numberExpression)* ;
numberExpression : ('-' | '/' | '*' | '+' | '%') NUMBER ;
sentance : (ID)* {System.out.println("Sentance");};
WS : [ \t\r\n]+ -> skip ;
NUMBER : [0-9]+ ;
ID : [a-zA-Z0-9]* ;
Your lexer rules produce conflicts:
body : ('BODY' ' '*) anything | 'BODY' 'BEGIN' anything* 'END' ;
vs
WS : [ \t\r\n]+ -> skip ;
The critical part is the ' '*. The literal ' ' defines an implicit lexer token that matches a single space, and implicit tokens are defined ahead of WS. So a single space is not handled as WS but as that implicit token, which the parser does not expect between the components of whileLoop.
If I am right, putting tabs between the components of whileLoop will work, and putting more than one space between them should work too, because the longer match goes to WS. You should simply remove ' '*, since whitespace is skipped anyway.
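
The prediction about tabs and multiple spaces follows from how the lexer picks a rule: the longest match wins, and on a tie the rule defined first wins. Below is a small Python model of that choice, not ANTLR itself. T_SPACE is a hypothetical name for the implicit token created by ' ', and winning_rule is a helper invented for this sketch.

```python
import re

# Toy model: implicit literal tokens are listed before explicit lexer rules.
RULES = [
    ("T_SPACE", r" "),          # hypothetical name for the implicit ' ' token
    ("WS", r"[ \t\r\n]+"),      # the explicit WS rule from the grammar
]

def winning_rule(text):
    """Return the name of the rule that would claim the start of text."""
    best = None
    for name, pattern in RULES:
        m = re.match(pattern, text)
        # Strict '>' means an earlier rule keeps an equal-length tie.
        if m and (best is None or m.end() > best[1]):
            best = (name, m.end())
    return best[0] if best else None

for sample in [" ", "  ", "\t"]:
    print(repr(sample), "->", winning_rule(sample))
```

A single space ties between the two rules and goes to the implicit token; two spaces or a tab are matched by WS alone or by a longer WS match.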

ANTLR decision can match input such as "ID ID" using multiple alternatives

I am having a problem with the disambiguation of this parser. I should mention
that I am using ANTLRWorks 1.4.3 (I have to use it for a homework assignment). I also must not use backtrack=true.
It should match inputs like
main Int a, Char b, MyClass c -> Int :
expr ';'
.
.
.
expr ';'
end';'
I also commented out the part of the parser after ':' because this problem did not let me generate the code.
program
: classDef+ -> ^(PROGRAM classDef+)
;
classDef
: CLASS name=ID (INHERITS parent=ID)? classBlock* END ';' ->
^(CLASS $name ^(INHERITS $parent)? classBlock*)
;
classBlock
: VAR assigmentBlock* END ';'-> ^(VAR assigmentBlock*)
| methodDecl -> ^(METHOD methodDecl)
;
methodDecl
//: name=ID methodVar* ('->' type=ID)? ':' methodBlock* END ';'
// -> ^($name methodVar* ^(RETURN $type) methodBlock*)
: name=ID methodVar* -> ^($name methodVar*)
;
methodVar
: type=ID name=ID ','? -> ^(PARAMS $type $name)
;
This is what antlrworks shows
If anyone could help me, I would be much obliged.
Don't do:
methodDecl
: name=ID methodVar* ('->' type=ID)? ':' methodBlock* END ';'
;
methodVar
: type=ID name=ID ','?
;
rather do:
methodDecl
: name=ID (methodVar (',' methodVar)*)? ('->' type=ID)? ':' methodBlock* END ';'
;
methodVar
: type=ID name=ID
;
I.e. the comma should be mandatory between parameters, not optional as you defined it.
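
The shape methodVar (',' methodVar)* can be read as a loop in which a comma always announces another parameter, so one token of lookahead is enough to decide whether to continue. Here is a minimal hand-written sketch in Python, assuming the input is already split into tokens. parse_method_vars and the token list are illustrative names, not something ANTLR generates.

```python
def parse_method_vars(tokens):
    """Parse (methodVar (',' methodVar)*)? where methodVar is: type=ID name=ID.

    Returns the list of (type, name) pairs and the index of the first unused token.
    """
    params, pos = [], 0
    # The list is optional: stop immediately if the header continues with '->' or ':'.
    if pos < len(tokens) and tokens[pos] not in ("->", ":"):
        while True:
            if pos + 1 >= len(tokens):
                raise SyntaxError("expected 'type name'")
            type_tok, name_tok = tokens[pos], tokens[pos + 1]
            if not (type_tok.isidentifier() and name_tok.isidentifier()):
                raise SyntaxError(f"expected 'type name', got {type_tok!r} {name_tok!r}")
            params.append((type_tok, name_tok))
            pos += 2
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1   # a comma means another methodVar must follow
            else:
                break      # anything else ends the parameter list
    return params, pos

tokens = "Int a , Char b , MyClass c -> Int :".split()
print(parse_method_vars(tokens))
```

With the comma as a separator, the parser never has to guess whether the next two identifiers start a new parameter or something else.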

ANTLR4 Token is not recognized when substituted

I am trying to modify the SQLite grammar (I'm only interested in a variant of the WHERE clause) and I keep getting a weird error when I move AND into its own token.
grammar wtfql;
/*
SQLite understands the following binary operators, in order from highest to
lowest precedence:
||
* / %
+ -
<< >> & |
< <= > >=
= != <> IS IS NOT IN LIKE GLOB MATCH REGEXP
AND
OR
*/
start : expr EOF?;
expr
: literal_value
//BIND_PARAMETER
| ( table_name '.' )? column_name
| unary_operator expr
| expr '||' expr
| expr ( '*' | '/' | '%' ) expr
| expr ( '+' | '-' ) expr
| expr ( '<' | '<=' | '>' | '>=' ) expr
| expr ( '=' | '<>' | K_IN ) expr
| expr K_AND expr
| expr K_OR expr
| function_name '(' ( expr ( ',' expr )* )? ')'
| '(' expr ')'
| expr K_NOT expr
| expr ( K_NOT K_NULL )
| expr K_NOT? K_IN ( '(' ( expr ( ',' expr )* ) ')' )
;
unary_operator
: '-'
| '+'
| K_NOT
;
literal_value
: NUMERIC_LITERAL
| STRING_LITERAL
| K_NULL
;
function_name
: IDENTIFIER
;
table_name
: any_name
;
column_name
: any_name
;
any_name
: IDENTIFIER
| keyword
// | '(' any_name ')'
;
keyword
: K_AND
| K_NOT
| K_NULL
| K_IN
| K_OR
;
IDENTIFIER
: [a-zA-Z_] [a-zA-Z_0-9]* // TODO check: needs more chars in set
;
NUMERIC_LITERAL
: DIGIT+ ( '.' DIGIT* )? ( E [-+]? DIGIT+ )?
| '.' DIGIT+ ( E [-+]? DIGIT+ )?
;
STRING_LITERAL
: '\"' ( ~'\"' | '\"\"' )* '\"'
;
SPACES
: [ \u000B\t\r\n] -> channel(HIDDEN)
;
DOT : '.';
OPEN_PAR : '(';
CLOSE_PAR : ')';
COMMA : ',';
STAR : '*';
PLUS : '+';
MINUS : '-';
TILDE : '~';
DIV : '/';
MOD : '%';
AMP : '&';
PIPE : '|';
LT : '<';
LT_EQ : '<=';
GT : '>';
GT_EQ : '>=';
EQ : '=';
NOT_EQ2 : '<>';
K_AND : A N D;
K_NOT : N O T;
K_NULL : N U L L;
K_OR : O R;
K_IN : I N;
fragment DIGIT : [0-9];
fragment A : [aA];
fragment B : [bB];
fragment C : [cC];
fragment D : [dD];
fragment E : [eE];
fragment F : [fF];
fragment G : [gG];
fragment H : [hH];
fragment I : [iI];
fragment J : [jJ];
fragment K : [kK];
fragment L : [lL];
fragment M : [mM];
fragment N : [nN];
fragment O : [oO];
fragment P : [pP];
fragment Q : [qQ];
fragment R : [rR];
fragment S : [sS];
fragment T : [tT];
fragment U : [uU];
fragment V : [vV];
fragment W : [wW];
fragment X : [xX];
fragment Y : [yY];
fragment Z : [zZ];
writing
| expr K_AND expr
with the input
field1=1 and field2 = 2
results in
line 1:8 mismatched input 'and' expecting {<EOF>, '||', '*', '+', '-', '/', '%', '<', '<=', '>', '>=', '=', '<>', K_AND, K_NOT, K_OR, K_IN}
while
| expr 'and' expr
works like a charm:
$ antlr4 wtfql.g4 && javac -classpath /usr/local/Cellar/antlr/4.4/antlr-4.4-complete.jar wtfql*.java && cat test.txt | grun wtfql start -tree -gui
(start (expr (expr (expr (column_name (any_name feld1))) = (expr (literal_value 1))) and (expr (expr (column_name (any_name feld2))) = (expr (literal_value 2)))) <EOF>)
What am I missing?
I presume "and" is lexed as an IDENTIFIER, since the rule for IDENTIFIER comes before the rule for K_AND and thus wins.
If you write 'and' in the parser rule, this implicitly creates a token (not K_AND!) which comes before IDENTIFIER and thus wins.
Rule of thumb: put more specific lexer rules first, and don't create new lexer tokens implicitly in parser rules.
If you check the token type, you'll get a clue about what's going on.
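
The effect of rule order on token types can be sketched with a toy lexer in Python. It models only the tie-break (longest match first, then the earliest rule) and is not ANTLR's lexer. The patterns are simplified from the question's grammar, and token_types is a helper invented for this illustration.

```python
import re

def token_types(text, rules):
    """Toy lexer: longest match wins; equal-length ties go to the earlier rule."""
    types, pos = [], 0
    while pos < len(text):
        winner, end = None, pos
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and m.end() > end:  # strict '>' so the earlier rule keeps a tie
                winner, end = name, m.end()
        if winner is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        if winner != "SPACES":  # whitespace is not reported
            types.append(winner)
        pos = end
    return types

IDENTIFIER = ("IDENTIFIER", r"[a-zA-Z_][a-zA-Z_0-9]*")
K_AND = ("K_AND", r"[aA][nN][dD]")
OTHERS = [("NUMERIC_LITERAL", r"[0-9]+"), ("EQ", r"="), ("SPACES", r"[ \t\r\n]")]

question_order = [IDENTIFIER, K_AND] + OTHERS   # IDENTIFIER first, as in the question
keywords_first = [K_AND, IDENTIFIER] + OTHERS   # more specific rule first

print(token_types("a and b", question_order))
print(token_types("a and b", keywords_first))
print(token_types("android", keywords_first))
```

With the keyword rule first, "and" becomes K_AND, while a longer word such as "android" is still an IDENTIFIER because the longest match takes priority over rule order.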

Trying to resolve left-recursion trying to build Parser with ANTLR

I’m currently trying to build a parser for the language Oberon using ANTLR and Eclipse.
This is what I have got so far:
grammar oberon;
options
{
language = Java;
//backtrack = true;
output = AST;
}
@parser::header {package dhbw.Oberon;}
@lexer::header {package dhbw.Oberon; }
T_ARRAY : 'ARRAY' ;
T_BEGIN : 'BEGIN';
T_CASE : 'CASE' ;
T_CONST : 'CONST' ;
T_DO : 'DO' ;
T_ELSE : 'ELSE' ;
T_ELSIF : 'ELSIF' ;
T_END : 'END' ;
T_EXIT : 'EXIT' ;
T_IF : 'IF' ;
T_IMPORT : 'IMPORT' ;
T_LOOP : 'LOOP' ;
T_MODULE : 'MODULE' ;
T_NIL : 'NIL' ;
T_OF : 'OF' ;
T_POINTER : 'POINTER' ;
T_PROCEDURE : 'PROCEDURE' ;
T_RECORD : 'RECORD' ;
T_REPEAT : 'REPEAT' ;
T_RETURN : 'RETURN';
T_THEN : 'THEN' ;
T_TO : 'TO' ;
T_TYPE : 'TYPE' ;
T_UNTIL : 'UNTIL' ;
T_VAR : 'VAR' ;
T_WHILE : 'WHILE' ;
T_WITH : 'WITH' ;
module : T_MODULE ID SEMI importlist? declarationsequence?
(T_BEGIN statementsequence)? T_END ID PERIOD ;
importlist : T_IMPORT importitem (COMMA importitem)* SEMI ;
importitem : ID (ASSIGN ID)? ;
declarationsequence :
( T_CONST (constantdeclaration SEMI)*
| T_TYPE (typedeclaration SEMI)*
| T_VAR (variabledeclaration SEMI)*)
(proceduredeclaration SEMI | forwarddeclaration SEMI)*
;
constantdeclaration: identifierdef EQUAL expression ;
identifierdef: ID MULT? ;
expression: simpleexpression (relation simpleexpression)? ;
simpleexpression : (PLUS|MINUS)? term (addoperator term)* ;
term: factor (muloperator factor)* ;
factor: number
| stringliteral
| T_NIL
| set
| designator '(' explist? ')'
;
number: INT | HEX ; // TODO add real
stringliteral : '"' ( ~('\\'|'"') )* '"' ;
set: '{' elementlist? '}' ;
elementlist: element (COMMA element)* ;
element: expression (RANGESEP expression)? ;
designator: qualidentifier
('.' ID
| '[' explist ']'
| '(' qualidentifier ')'
| UPCHAR )+
;
explist: expression (COMMA expression)* ;
actualparameters: '(' explist? ')' ;
muloperator: MULT | DIV | MOD | ET ;
addoperator: PLUS | MINUS | OR ;
relation: EQUAL ; // TODO
typedeclaration: ID EQUAL type ;
type: qualidentifier
| arraytype
| recordtype
| pointertype
| proceduretype
;
qualidentifier: (ID '.')* ID ;
arraytype: T_ARRAY expression (',' expression) T_OF type;
recordtype: T_RECORD ('(' qualidentifier ')')? fieldlistsequence T_END ;
fieldlistsequence: fieldlist (SEMI fieldlist) ;
fieldlist: (identifierlist COLON type)? ;
identifierlist: identifierdef (COMMA identifierdef)* ;
pointertype: T_POINTER T_TO type ;
proceduretype: T_PROCEDURE formalparameters? ;
variabledeclaration: identifierlist COLON type ;
proceduredeclaration: procedureheading SEMI procedurebody ID ;
procedureheading: T_PROCEDURE MULT? identifierdef formalparameters? ;
formalparameters: '(' params? ')' (COLON qualidentifier)? ;
params: fpsection (SEMI fpsection)* ;
fpsection: T_VAR? idlist COLON formaltype ;
idlist: ID (COMMA ID)* ;
formaltype: (T_ARRAY T_OF)* (qualidentifier | proceduretype);
procedurebody: declarationsequence (T_BEGIN statementsequence)? T_END ;
forwarddeclaration: T_PROCEDURE UPCHAR? ID MULT? formalparameters? ;
statementsequence: statement (SEMI statement)* ;
statement : assignment
| procedurecall
| ifstatement
| casestatement
| whilestatement
| repeatstatement
| loopstatement
| withstatement
| T_EXIT
| T_RETURN expression?
;
assignment: designator ASSIGN expression ;
procedurecall: designator actualparameters? ;
ifstatement: T_IF expression T_THEN statementsequence
(T_ELSIF expression T_THEN statementsequence)*
(T_ELSE statementsequence)? T_END ;
casestatement: T_CASE expression T_OF caseitem ('|' caseitem)*
(T_ELSE statementsequence)? T_END ;
caseitem: caselabellist COLON statementsequence ;
caselabellist: caselabels (COMMA caselabels)* ;
caselabels: expression (RANGESEP expression)? ;
whilestatement: T_WHILE expression T_DO statementsequence T_END ;
repeatstatement: T_REPEAT statementsequence T_UNTIL expression ;
loopstatement: T_LOOP statementsequence T_END ;
withstatement: T_WITH qualidentifier COLON qualidentifier T_DO statementsequence T_END ;
ID : ('a'..'z'|'A'..'Z')('a'..'z'|'A'..'Z'|'_'|'0'..'9')* ;
fragment DIGIT : '0'..'9' ;
INT : ('-')?DIGIT+ ;
fragment HEXDIGIT : '0'..'9'|'A'..'F' ;
HEX : HEXDIGIT+ 'H' ;
ASSIGN : ':=' ;
COLON : ':' ;
COMMA : ',' ;
DIV : '/' ;
EQUAL : '=' ;
ET : '&' ;
MINUS : '-' ;
MOD : '%' ;
MULT : '*' ;
OR : '|' ;
PERIOD : '.' ;
PLUS : '+' ;
RANGESEP : '..' ;
SEMI : ';' ;
UPCHAR : '^' ;
WS : ( ' ' | '\t' | '\r' | '\n'){skip();};
My problem is when I check the grammar I get the following error and just can’t find an appropriate way to fix this:
rule statement has non-LL(*) decision
due to recursive rule invocations reachable from alts 1,2.
Resolve by left-factoring or using syntactic predicates
or using backtrack=true option.
|---> statement : assignment
I also have the same problem with declarationsequence and simpleexpression.
When I use options { … backtrack = true; … } it at least compiles, but it obviously doesn’t work correctly anymore when I run a test file. I can’t find a way to resolve the left recursion on my own (or maybe I’m just too blind at the moment because I’ve looked at this for far too long). Any ideas how I could change the lines where the errors occur to make it work?
EDIT
I could fix one of the three mistakes. statement works now. The problem was that assignment and procedurecall both started with designator.
statement : procedureassignmentcall
| ifstatement
| casestatement
| whilestatement
| repeatstatement
| loopstatement
| withstatement
| T_EXIT
| T_RETURN expression?
;
procedureassignmentcall : (designator ASSIGN)=> assignment | procedurecall;
assignment: designator ASSIGN expression ;
procedurecall: designator actualparameters? ;

ANTLR error : java.lang.NoSuchFieldError: offendingToken

I have written the following grammar file and am getting the error shown below. I have done many Google searches, and some answers say that there is something wrong in the grammar, but the error message does not indicate where the problem is. Can you please advise why I am getting this error?
grammar ArchSpec;
options {
language = Java;
}
@lexer::header {
package iotsuite.parser;
}
@parser::header {
package iotsuite.parser;
import iotsuite.compiler.*;
import iotsuite.semanticmodel.*;
}
@members {
private SymbolTable context;
}
archSpec :
('structs' ':' struct_def)*
'softwarecomponents' ':' (component_def)+
;
component_def :
'computationalService' ':' (cs_def)+
;
struct_def:
CAPITALIZED_ID
(structField_def ';')+
;
structField_def:
lc_id ':' dataType
;
cs_def:
CAPITALIZED_ID
(csGeneratedInfo_def ';')+
(csConsumeInfo_def ';')*
(csRequest_def ';')*
(cntrlCommand_def ';')*
(partition_def ';')+
;
csGeneratedInfo_def:
'generate' lc_id ':' CAPITALIZED_ID
;
csConsumeInfo_def:
'consume' lc_id ('from' 'region-hops' ':' INT ':' CAPITALIZED_ID )?
;
csRequest_def :
'request' lc_id
;
cntrlCommand_def :
'command' name = CAPITALIZED_ID '(' (cntrlParameter_def)? ')' 'to' 'region-hops' ':' INT ':' CAPITALIZED_ID
;
cntrlParameter_def :
lc_id (',' parameter_def )?
;
partition_def:
csDeploymentConstraint='partition-per' ':' CAPITALIZED_ID
;
lc_id: ID
;
dataType:
primitiveType
;
primitiveType:
(id='Integer' | id='Boolean' | id='String' | id = 'double' | id = 'long' | id='boolean' )
;
ID : 'a'..'z' ('a'..'z' | 'A'..'Z' )*
;
INT : '0'..'9'('0'..'9')* ;
CAPITALIZED_ID: 'A'..'Z' ('a'..'z' | 'A'..'Z' )*;
WS: ('\t' | ' ' | '\r' | '\n' | '\u000C')+ {$channel = HIDDEN;};
The error message I am getting on the console is the following:
java.lang.NoSuchFieldError: offendingToken
at org.deved.antlride.runtime.AntlrErrorListener.extractToken(AntlrErrorListener.java:111)
at org.deved.antlride.runtime.AntlrErrorListener.report(AntlrErrorListener.java:79)
at org.deved.antlride.runtime.AntlrErrorListener.message(AntlrErrorListener.java:63)
at org.deved.antlride.runtime.AntlrErrorListener.error(AntlrErrorListener.java:53)
at org.antlr.tool.ErrorManager.grammarError(ErrorManager.java:742)
at org.antlr.tool.ErrorManager.grammarError(ErrorManager.java:750)
at org.antlr.tool.NameSpaceChecker.lookForReferencesToUndefinedSymbols(NameSpaceChecker.java:133)
at org.antlr.tool.NameSpaceChecker.checkConflicts(NameSpaceChecker.java:72)
at org.antlr.tool.Grammar.checkNameSpaceAndActions(Grammar.java:804)
at org.antlr.tool.CompositeGrammar.defineGrammarSymbols(CompositeGrammar.java:374)
at org.antlr.Tool.process(Tool.java:484)
at org.deved.antlride.runtime.Tool2.main(Tool2.java:24)
An instance of input I parse is the following:
softwarecomponents:
computationalService:
RoomAvgTemp
generate roomAvgTempMeasurement:TempStruct;
consume tempMeasurement from region-hops:0:Room;
partition-per : Room;
RoomController
consume roomAvgTempMeasurement from region-hops:0:Room;
command SetTemp(setTemp) to region-hops:0:Room;
partition-per : Room;