Grammar for a boolean calculator language

I am writing a grammar for a boolean calculator language. Programs written in this language will consist of at most one statement whose result is boolean.
An example statement in this language is given below:
( A + B >= C ) AND ( D == 4 )
The grammar that I have come up with so far is given below:
E => T COP T | EXPR BOP EXPR
T => VAR | EXPR
VAR => A | B | C | D
EXPR => AEXPR | CEXPR
AEXPR => "(" VAR AOP VAR ")"
CEXPR => VAR COP AEXPR | AEXPR COP AEXPR
AOP => + | - | * | / | %
BOP => AND | OR
COP => == | != | < | > | <= | >=
The main thing to consider here is that no two VARs can be joined in a binary operation BOP and that the statement must return a boolean.
I want to know whether the above grammar satisfies these criteria, or whether I am missing something.
Any help will be appreciated.
Thanks

The main thing to consider here is that no two VARs can be joined in a binary operation BOP and that the statement must return a boolean.
The above requirement fails in the following case:
E = EXPR BOP EXPR
EXPR = AEXPR
=> E = AEXPR BOP AEXPR
I am assuming here that (VAR AOP VAR) yields a VAR.
A more standardized form of your required grammar is given below:
E = T COP T | EXPR BOP EXPR;
T = VAR | AEXPR;
VAR = "A" | "B" | "C" | "D";
AEXPR = "(" VAR AOP VAR ")" | "(" VAR AOP AEXPR ")";
EXPR = VAR COP AEXPR | AEXPR COP AEXPR;
AOP = "+" | "-" | "*" | "/" | "%";
BOP = "AND" | "OR";
COP = "==" | "!=" | "<" | ">" | "<=" | ">=";
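To check which inputs the revised grammar accepts, it can be turned into a small hand-written recognizer. The following is only a sketch in Python, not part of the original answer: the names `tokenize`, `Recognizer` and `accepts` are made up for illustration, and the code folds `E = T COP T | EXPR BOP EXPR` into "parse one comparison, then optionally a BOP and a second comparison", remembering whether each comparison also has the EXPR shape (right-hand side is an AEXPR).

```python
import re

VAR = {"A", "B", "C", "D"}
AOP = {"+", "-", "*", "/", "%"}
BOP = {"AND", "OR"}
COP = {"==", "!=", "<", ">", "<=", ">="}

# Longer operators are listed before their one-character prefixes.
TOKEN = re.compile(r"\s*(AND|OR|==|!=|<=|>=|[A-D()+\-*/%<>])")


def tokenize(text):
    """Split the input into grammar terminals; reject anything else."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Recognizer:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, allowed):
        token = self.peek()
        if token not in allowed:
            raise ValueError(f"unexpected token {token!r}")
        self.pos += 1

    def aexpr(self):
        # AEXPR = "(" VAR AOP VAR ")" | "(" VAR AOP AEXPR ")"
        self.take({"("})
        self.take(VAR)
        self.take(AOP)
        if self.peek() == "(":
            self.aexpr()
        else:
            self.take(VAR)
        self.take({")"})

    def term(self):
        # T = VAR | AEXPR; returns True when the term was an AEXPR.
        if self.peek() == "(":
            self.aexpr()
            return True
        self.take(VAR)
        return False

    def comparison(self):
        # T COP T; the result is True when it also matches EXPR,
        # i.e. when the right-hand side is an AEXPR.
        self.term()
        self.take(COP)
        return self.term()

    def statement(self):
        # E = T COP T | EXPR BOP EXPR
        left_is_expr = self.comparison()
        if self.peek() in BOP:
            if not left_is_expr:
                raise ValueError("left operand of BOP is not an EXPR")
            self.take(BOP)
            if not self.comparison():
                raise ValueError("right operand of BOP is not an EXPR")
        if self.peek() is not None:
            raise ValueError("trailing input")


def accepts(text):
    try:
        Recognizer(tokenize(text)).statement()
        return True
    except ValueError:
        return False
```

With this sketch, `accepts("(A + B) >= (C - D) AND A == (B * (C / D))")` is true, while `accepts("(A + B) AND (C + D)")` is false, which is the case the original grammar let through. Note that the example statement from the question, `( A + B >= C ) AND ( D == 4 )`, is also rejected: neither grammar has a rule for numeric literals or for parentheses around a comparison.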

Related

ANTLR grammar not picking the right option

I'm trying to assign the result of a method call to a variable in a test program, using a Decaf grammar.
The grammar:
// Define decaf grammar
grammar Decaf;
// LEXER rules
// Base definitions for letters and digits
fragment LETTER: ('a'..'z'|'A'..'Z'|'_');
fragment DIGIT: '0'..'9';
// The other Decaf lexer rules
ID: LETTER (LETTER|DIGIT)*;
NUM: DIGIT(DIGIT)*;
CHAR: '\'' ( ~['\r\n\\] | '\\' ['\\] ) '\'';
WS : [ \t\r\n\f]+ -> channel(HIDDEN);
COMMENT
: '/*' .*? '*/' -> channel(2)
;
LINE_COMMENT
: '//' ~[\r\n]* -> channel(2)
;
// -----------------------------------------------------------------------------------------------------------------------------------------
// PARSER rules
program:'class' 'Program' '{' (declaration)* '}';
declaration
: structDeclaration
| varDeclaration
| methodDeclaration
;
varDeclaration
: varType ID ';'
| varType ID '[' NUM ']' ';'
;
structDeclaration:'struct' ID '{' (varDeclaration)* '}' (';')?;
varType
: 'int'
| 'char'
| 'boolean'
| 'struct' ID
| structDeclaration
| 'void'
;
methodDeclaration: methodType ID '(' (parameter (',' parameter)*)* ')' block;
methodType
: 'int'
| 'char'
| 'boolean'
| 'void'
;
parameter
: parameterType ID
| parameterType ID '[' ']'
| 'void'
;
parameterType
: 'int'
| 'char'
| 'boolean'
;
block: '{' (varDeclaration)* (statement)* '}';
statement
: 'if' '(' expression ')' block ( 'else' block )? #stat_if
| 'while' '('expression')' block #stat_else
| 'return' expressionOom ';' #stat_return
| methodCall ';' #stat_mcall
| block #stat_block
| location '=' expression #stat_assignment
| (expression)? ';' #stat_line
;
expressionOom: expression |;
location: (ID|ID '[' expression ']') ('.' location)?;
expression
: location #expr_loc
| methodCall #expr_mcall
| literal #expr_literal
| '-' expression #expr_minus // Unary Minus Operation
| '!' expression #expr_not // Unary NOT Operation
| '('expression')' #expr_parenthesis
| expression arith_op_fifth expression #expr_arith5 // * / % << >>
| expression arith_op_fourth expression #expr_arith4 // + -
| expression arith_op_third expression #expr_arith3 // == != < <= > >=
| expression arith_op_second expression #expr_arith2 // &&
| expression arith_op_first expression #expr_arith1 // ||
;
methodCall: ID '(' arg1 ')';
// Either something matching arg2 or nothing, for a method call without parameters
arg1: arg2 |;
// An expression, then * is used to allow 0 or more additional parameters
arg2: (arg)(',' arg)*;
arg: expression;
// Operations
// Split by precedence level
// Precedence specification: https://anoopsarkar.github.io/compilers-class/decafspec.html
rel_op : '<' | '>' | '<=' | '>=' ;
eq_op : '==' | '!=' ;
arith_op_fifth: '*' | '/' | '%' | '<<' | '>>';
arith_op_fourth: '+' | '-';
arith_op_third: rel_op | eq_op;
arith_op_second: '&&';
arith_op_first: '||';
literal : int_literal | char_literal | bool_literal ;
int_literal : NUM ;
char_literal : '\'' CHAR '\'' ;
bool_literal : 'true' | 'false' ;
And the test program is as follows:
class Program
{
int factorial(int b)
{
int n;
n = 1;
return n+2;
}
void main(void)
{
int a;
int b;
b=0;
a=factorial(b);
factorial(b);
return;
}
}
The parse tree for this program looks as follows, at least for the part I'm interested in, which is a=factorial(b):
This tree is wrong, since it should look like location = expression -> methodCall.
The following tree is what a friend's implementation produces, and mine should look roughly like this if the grammar were implemented correctly:
This is the result I need: I want the tree to look like location = expression -> methodCall and not location = expression -> location. If I remove the argument from a=factorial(b) and leave it as a=factorial(), it is read correctly as a methodCall, so I'm not sure what I'm missing.
So my question is where I'm going wrong in the grammar. I guess it's in either location or expression, but I'm not sure how to adjust it to behave the way I want. I took the rules more or less literally from the specification we were given.
In an ANTLR rule, alternatives are matched from top to bottom. So in your expression rule:
expression
: location #expr_loc
| methodCall #expr_mcall
...
;
the generated parser will try to match a location before it tries to match a methodCall. Try swapping those two around:
expression
: methodCall #expr_mcall
| location #expr_loc
...
;
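The effect of alternative order can be illustrated with a toy first-match-wins matcher. This is a deliberately simplified Python model and not how ANTLR's prediction actually works internally; the helper names `match_location`, `match_method_call` and `first_match` are invented for the sketch, and the two rules are cut down to a bare identifier and a call with at most one plain argument.

```python
def match_location(tokens, pos):
    """location: ID  (array and field access are omitted in this sketch)."""
    if pos < len(tokens) and tokens[pos].isidentifier():
        return pos + 1
    return None


def match_method_call(tokens, pos):
    """methodCall: ID '(' ID? ')'  (at most one plain argument here)."""
    end = match_location(tokens, pos)
    if end is None or tokens[end:end + 1] != ["("]:
        return None
    end += 1
    arg_end = match_location(tokens, end)
    if arg_end is not None:
        end = arg_end
    if tokens[end:end + 1] != [")"]:
        return None
    return end + 1


def first_match(alternatives, tokens):
    """Try the alternatives in order; the first one that matches wins."""
    for name, matcher in alternatives:
        end = matcher(tokens, 0)
        if end is not None:
            return name, end
    return None


TOKENS = ["factorial", "(", "b", ")"]

location_first = first_match(
    [("location", match_location), ("methodCall", match_method_call)], TOKENS
)
method_call_first = first_match(
    [("methodCall", match_method_call), ("location", match_location)], TOKENS
)
```

In this model, `location_first` stops after the single token `factorial`, leaving `( b )` for some other rule to pick up, while `method_call_first` consumes all four tokens as one call.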

mincaml grammar in antlr4

I am trying to write a MinCaml parser in ANTLR 4, based on the original parser on GitHub (https://github.com/esumii/min-caml/blob/master/parser.mly).
Japanese site: http://esumii.github.io/min-caml/
Here is the ANTLR 4 code:
grammar MinCaml;
simple_exp: #simpleExp
| LPAREN exp RPAREN #parenExp
| LPAREN RPAREN #emptyParen
| BOOL #boolExpr
| INT #intExpr
| FLOAT #floatExpr
| IDENT #identExpr
| simple_exp DOT LPAREN exp RPAREN #arrayGetExpr
;
exp : #programExp
| simple_exp #simpleExpInExp
| NOT exp #notExp
| MINUS exp #minusExp
| MINUS_DOT exp #minusFloatExp
| left = exp op = (AST_DOT | SLASH_DOT) right = exp #astSlashExp
| left = exp op = (PLUS | MINUS | MINUS_DOT | PLUS_DOT) right = exp #addSubExp
| left = exp op = (EQUAL | LESS_GREATER | LESS | GREATER | LESS_EQUAL | GREATER_EQUAL) right = exp #logicExp
| IF condition = exp THEN thenExp = exp ELSE elseExp = exp #ifExp
| LET IDENT EQUAL exp IN exp #letExp
| LET REC fundef IN exp #letRecExp
| exp actual_args #appExp
| exp COMMA exp elems #tupleExp
| LET LPAREN pat RPAREN EQUAL exp IN exp #tupleReadExp
| simple_exp DOT LPAREN exp RPAREN LESS_MINUS exp #putExp
| exp SEMICOLON exp #expSeqExp
| ARRAY_CREATE simple_exp simple_exp #arrayCreateExp
;
fundef:
| IDENT formal_args EQUAL exp
;
formal_args:
| IDENT formal_args
| IDENT
;
actual_args:
| actual_args simple_exp
| simple_exp
;
elems:
| COMMA exp elems
|
;
pat:
| pat COMMA IDENT
| IDENT COMMA IDENT
;
LET : 'let';
REC : 'rec';
IF : 'if';
THEN : 'then';
ELSE : 'else';
IN : 'in';
IDENT : '_' | [a-z][a-zA-Z0-9_]+;
ARRAY_CREATE : 'Array.create';
LPAREN : '(';
RPAREN : ')';
BOOL : 'true' 'false';
NOT : 'not';
INT : ['1'-'9'] (['0'-'9'])*;
FLOAT : (['0'-'9'])+ ('.' (['0'-'9'])*)? (['e', 'E'] (['+', '-'])? (['0'-'9'])+)?;
MINUS : '-';
PLUS : '+';
MINUS_DOT : '-.';
PLUS_DOT : '+.';
AST_DOT : '*.';
SLASH_DOT : '/.';
EQUAL : '=';
LESS_GREATER : '<>';
LESS_EQUAL : '<=';
GREATER_EQUAL : '>=';
LESS : '<';
GREATER : '>';
DOT : '.';
LESS_MINUS : '<-';
WS : [ \t\r\n]+ -> skip ; // toss out whitespace
COMMENT : '(*' .*? '*)' -> skip;
But I get the following errors on the rules exp and actual_args:
error(148): MinCaml.g4:13:0: left recursive rule exp contains a left recursive alternative which can be followed by the empty string
error(148): MinCaml.g4:41:0: left recursive rule actual_args contains a left recursive alternative which can be followed by the empty string
But I don't see any possibility of an empty string in either rule. Or am I wrong?
What is wrong with this code?
The first line of the exp rule (actually of every rule) is the likely problem:
exp : #programExp
The standard rule form is
r: alt1 | alt2 | .... | altN ;
The alt1s in the grammar are all empty. An empty alt "matches an empty string".
Given the elems rule appears to have an intentional empty alt, consider that, in general terms, rules with empty alts can be problematic. Rather than using an empty alt, make the corresponding element in the parent rule optional (either ? or *).
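A quick way to spot such empty alternatives is to split a rule body on `|` and look for pieces that contain nothing except an optional `#label`. The following Python sketch does that; `empty_alternatives` is a hypothetical helper, and it assumes the rule text contains no `:`, `|` or `;` inside quoted literals, which holds for the parser rules shown in the question.

```python
import re


def empty_alternatives(rule_text):
    """Return (rule name, 1-based indexes of alternatives that match nothing)."""
    name, _, body = rule_text.partition(":")
    body = body.rsplit(";", 1)[0]  # drop the terminating semicolon
    empties = []
    for index, alternative in enumerate(body.split("|"), start=1):
        # An alternative label such as "#programExp" is not grammar content.
        if re.sub(r"#\w+", "", alternative).strip() == "":
            empties.append(index)
    return name.strip(), empties


exp_rule = "exp : #programExp | simple_exp #simpleExpInExp | NOT exp #notExp ;"
args_rule = "actual_args: | actual_args simple_exp | simple_exp ;"
elems_rule = "elems: | COMMA exp elems | ;"
```

For the rules above, the first alternative of `exp` and of `actual_args` is reported as empty, and `elems` has two empty alternatives (the first and the last).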

ANTLR4 - Access token group in a sequence using context

I have a grammar that includes this rule:
expr:
unaryExpr '(' (stat | expr | constant) ')' #labelUnaryExpr
| binaryExpr '(' (stat | expr | constant) ',' (stat | expr | constant) ')' #labelBinaryExpr
| multipleExpr '(' (stat | expr | constant) (',' (stat | expr | constant))+ ')' #labelMultipleExpr
;
For expr, I can access the value of unaryExpr by calling ctx.unaryExpr(). How can I access (stat | expr | constant) similarly? Is there a solution that doesn't require modifying my grammar by adding another rule for the group?
Since you've labelled your alternatives, you can access the (stat | expr | constant) in the respective listener/visitor method:
@Override
public void enterLabelUnaryExpr(@NotNull ExprParser.LabelUnaryExprContext ctx) {
// one of these will return something other than null
System.out.println(ctx.stat());
System.out.println(ctx.expr());
System.out.println(ctx.constant());
}

Perl, SQL query and save data to csv file with correct Datetime format

The table has too many columns to select one by one, so I am trying to pull all of the data into a file.
Several columns contain datetimes, in UTC and in local time.
When I use the following script, the time-of-day information is dropped and only the date is saved. How can I fix the code so that the full datetime is saved?
In summary, every datetime in the CSV file was saved as "25-FEB-15" instead of "25-FEB-15 HH:MM:SS AM -08:00".
open(OUTFILE, "> ./outputfile.csv");
my($dbh,$sth);
$dbh = DBI->connect("xxx");
my $sqlGetEid = "
select *
from Table_1
where MARKET = 'Chicago' and DATETIMELOCAL >= '22-FEB-2015' and DATETIMELOCAL < '01-MAR-2015'
";
my $curSqlEid = $dbh->prepare($sqlGetEid);
$curSqlEid->execute();
my $counter = 0;
my $delimiter = ',';
my $fields = join(',', @{ $curSqlEid->{NAME_lc} });
print "$fields\n";
printf OUTFILE "$fields\n";
while (my @row = $curSqlEid->fetchrow_array) {
my $csv = join(',', @row)."\n";
printf OUTFILE "$csv";
$counter ++;
if($counter % 10000 == 0){
print $csv, "\n";
}
}
I have revised the code according to the comments below, but the problem is not resolved yet. The DB I am using is Oracle, so the MySQL-based code does not seem to be compatible. The problem has been narrowed down to the datetime formats that Oracle uses, so handling the format properly should resolve it. But I am not sure which Perl package to use to handle Oracle datetime formats cleanly.
use DBI;
use DBD::Oracle qw(:ora_types);
use Compress::Zlib;
use FileHandle;
use strict;
use warnings;
use DateTime;
use Data::Dumper;
use Text::CSV_XS;
use DateTime::Format::DateParse;
use DateTime::Format::DBI;
open(OUTFILE, "> ./output.csv");
my $dbh = DBI->connect("xxx");
my $sth = $dbh->prepare("SELECT * FROM Table_1");
$sth->execute;
my $fields = join(',', @{ $sth->{NAME_lc} });
#EXTRACT COLUMN NAMES
my @col_names = @{ $sth->{NAME_lc} } ;
my $column_names = join ", ", @col_names;
my $select = "
select $column_names
from Table_1
where MARKET = 'Chicago' and DATETIMELOCAL >= '22-FEB-2015' and DATETIMELOCAL < '01-MAR-2015'
";
my $query = $dbh->prepare($select);
$query->execute;
#WRITE DATA TO CSV FILE
my $csv_attributes = {
binary => 1, #recommended
eol => $/, #recommended(I wonder why it's not $\ ?)
};
my $csv = Text::CSV_XS->new($csv_attributes);
my $fname = 'data.csv';
open my $OUTFILE, ">", $fname
or die "Couldn't open $fname: $!";
$csv->print($OUTFILE, \@col_names);
$csv->column_names(@col_names); #Needed for print_hr() below
while (my $row = $query->fetchrow_arrayref) {
print $row->[0];# $row->{datetimelocal} does not work
my $datetime = DateTime::Format::DBI->parse_datetime( $row->[0]) ; # This might be wrong
$row->[0] = strftime("%d-%b-%y %I:%M:%S %p", $datetime);
$csv->print($OUTFILE, $row);
}
close $OUTFILE;
$query->finish;
$dbh->disconnect;
Select each column explicitly. For the date column, select TO_CHAR of the column with the format you want.
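The column list does not have to be typed by hand: it can be generated from the names the driver reports (the question already reads them from NAME_lc), wrapping only the datetime columns in Oracle's TO_CHAR, the function that renders a date as text. The sketch below shows the string-building step in Python for brevity; the same approach carries over to the Perl script. `build_select` is a hypothetical helper, and the format model passed in is an assumption that should be checked against the actual column types (for example, the TZH:TZM elements only apply to TIMESTAMP WITH TIME ZONE columns).

```python
def build_select(table, columns, datetime_columns, fmt):
    """Build a SELECT that formats the given datetime columns with TO_CHAR.

    Names are interpolated directly into the SQL text, so this is only
    suitable for column names that come from the database itself.
    """
    datetime_columns = {name.lower() for name in datetime_columns}
    parts = []
    for name in columns:
        if name.lower() in datetime_columns:
            parts.append(f"TO_CHAR({name}, '{fmt}') AS {name}")
        else:
            parts.append(name)
    return f"SELECT {', '.join(parts)} FROM {table}"


query = build_select(
    "Table_1",
    ["id", "market", "datetimelocal"],
    ["DATETIMELOCAL"],
    "DD-MON-YY HH:MI:SS AM",  # assumed format model, adjust as needed
)
```

The WHERE clause from the question can be appended to the returned string unchanged.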
The data table has too many columns to select one by one
Not an issue. We're computer programmers after all.
In summary, all datetime data in csv file was saved as "25-FEB-15"
instead of "25-FEB-15 HH:MM:SS AM -08:00"
I'm not seeing that:
use strict;
use warnings;
use 5.012;
use Data::Dumper;
use DBI;
use DBD::mysql;
use Text::CSV_XS;
#CONFIG VARIABLES
my $db_type = "mysql";
my $database = "my_db";
my $host = "localhost";
my $port = "3306";
my $user = "root";
my $pword = "";
#DATA SOURCE NAME
my $dsn = "dbi:$db_type:$database:$host:$port";
#PERL DBI CONNECT
my $dbh = DBI->connect($dsn, $user, $pword);
my $tablename = "Table_1";
#CONDITIONALLY DROP THE TABLE
my $drop_table = "drop table if exists $tablename;";
my $query = $dbh->prepare($drop_table);
$query->execute();
#CONDITIONALLY CREATE THE TABLE
my $create_table =<<"END_OF_CREATE";
create table $tablename (
id INT(12) not null auto_increment primary key,
open DECIMAL(6,4),
high DECIMAL(6,4),
low DECIMAL(6,4),
close DECIMAL(6,4),
market VARCHAR(40),
datetimelocal DATETIME
)
END_OF_CREATE
$query = $dbh->prepare($create_table);
$query->execute();
#INSERT DATA INTO TABLE
my $insert =<<"END_OF_INSERT";
insert into $tablename(open, high, low, close, market, datetimelocal)
values (?, ?, ?, ?, ?, ?)
END_OF_INSERT
my @data = (
[10.00, 12.00, 9.00, 11.50, 'Chicago', '2015-2-23 16:00:01'],
[10.00, 12.01, 9.01, 11.51, 'New York', '2015-2-23 16:00:01'],
);
for my $aref (@data) {
$query = $dbh->prepare($insert);
$query->execute(@$aref);
}
#PREPARE COLUMN NAME QUERY
my $select =<<"END_OF_SELECT";
SELECT column_name
FROM information_schema.columns
WHERE table_name='Table_1';
END_OF_SELECT
$query = $dbh->prepare($select);
$query->execute;
#EXTRACT COLUMN NAMES
my @col_names = @{$dbh->selectcol_arrayref($query)};
my $column_names = join ", ", @col_names;
#PREPARE SELECT QUERY
$select =<<"END_OF_SELECT";
select $column_names from $tablename
where MARKET = 'Chicago' and DATETIMELOCAL >= '2015-2-22' and DATETIMELOCAL < '2015-3-1'
END_OF_SELECT
$query = $dbh->prepare($select);
$query->execute;
#WRITE DATA TO CSV FILE
my $csv_attributes = {
binary => 1, #recommended
eol => $/, #recommended(I wonder why it's not $\ ?)
};
my $csv = Text::CSV_XS->new($csv_attributes);
my $fname = 'data.csv';
open my $OUTFILE, ">", $fname
or die "Couldn't open $fname: $!";
$csv->print($OUTFILE, \@col_names);
while (my $row = $query->fetchrow_arrayref) {
$csv->print($OUTFILE, $row);
}
close $OUTFILE;
$query->finish;
$dbh->disconnect;
db table:
mysql> describe Table_1;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| id | int(12) | NO | PRI | NULL | auto_increment |
| open | decimal(6,4) | YES | | NULL | |
| high | decimal(6,4) | YES | | NULL | |
| low | decimal(6,4) | YES | | NULL | |
| close | decimal(6,4) | YES | | NULL | |
| market | varchar(40) | YES | | NULL | |
| datetimelocal | datetime | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
7 rows in set (0.19 sec)
mysql> select * from Table_1;
+----+---------+---------+--------+---------+----------+---------------------+
| id | open | high | low | close | market | datetimelocal |
+----+---------+---------+--------+---------+----------+---------------------+
| 1 | 10.0000 | 12.0000 | 9.0000 | 11.5000 | Chicago | 2015-02-23 16:00:01 |
| 2 | 10.0000 | 12.0100 | 9.0100 | 11.5100 | New York | 2015-02-23 16:00:01 |
+----+---------+---------+--------+---------+----------+---------------------+
2 rows in set (0.00 sec)
output:
$ cat data.csv
id,open,high,low,close,market,datetimelocal
1,10.0000,12.0000,9.0000,11.5000,Chicago,"2015-02-23 16:00:01"
all datetime data in csv file was saved as "25-FEB-15" instead of
"25-FEB-15 HH:MM:SS AM -08:00"
Note that a mysql DATETIME column type does not save timezone offset information. If you want to format your datetime and add a timezone offset, you can do something like this:
...
...
use DateTime::Format::MySQL;
use DateTime::Format::Strptime qw{ strftime };
...
...
#WRITE DATA TO CSV FILE
my $csv_attributes = {
binary => 1, #recommended
eol => $/, #recommended(I wonder why it's not $\ ?)
};
my $csv = Text::CSV_XS->new($csv_attributes);
my $fname = 'data.csv';
open my $OUTFILE, ">", $fname
or die "Couldn't open $fname: $!";
$csv->print($OUTFILE, \@col_names);
$csv->column_names(@col_names); #Needed for print_hr() below
my $tz_offset = "-08:00";
while (my $row = $query->fetchrow_hashref) {
my $datetime = DateTime::Format::MySQL->parse_datetime(
$row->{datetimelocal}
);
#strftime() comes from DateTime::Format::Strptime:
$row->{datetimelocal} = strftime(
"%d-%b-%y %I:%M:%S %p $tz_offset",
$datetime
);
$csv->print_hr($OUTFILE, $row); #=>print hash ref. To get the column order right, you first have to set the column order with $csv->column_names().
}
close $OUTFILE;
$query->finish;
$dbh->disconnect;
output:
$ cat data.csv
id,open,high,low,close,market,datetimelocal
1,10.0000,12.0000,9.0000,11.5000,Chicago,"23-Feb-15 04:00:01 PM -08:00"
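The reformatting step itself is independent of Perl. As a cross-check of the strftime pattern used above, here is the same conversion sketched in Python; `format_local` is an illustrative name, the fixed "-08:00" offset is simply appended as text just as in the Perl version, and the month and AM/PM spellings assume the default C locale.

```python
from datetime import datetime


def format_local(value, tz_offset="-08:00"):
    """Turn 'YYYY-MM-DD HH:MM:SS' into 'DD-Mon-YY HH:MM:SS AM/PM <offset>'."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    # The offset is a constant string here; nothing is converted between zones.
    return parsed.strftime("%d-%b-%y %I:%M:%S %p") + " " + tz_offset


formatted = format_local("2015-02-23 16:00:01")
```

For the sample row, `formatted` is "23-Feb-15 04:00:01 PM -08:00", matching the CSV output shown above.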

Tree Rewrite - whole subtree not just the top node should become root

I want the tree rewrite of *addition_operator* to contain the whole subtree, not only the top node, so that the *hint_keywords* are still in the tree.
The addition rule is this complex because I want to add T_LEFT and T_RIGHT to the tree.
ANTLR 3.3
grammar:
grammar Test;
options {
output = AST;
}
tokens {
T_LEFT;
T_RIGHT;
T_MARKER;
}
@lexer::header {
package com.spielwiese;
}
@header {
package com.spielwiese;
}
NUM : '0' .. '9' ( '0' .. '9' )*;
ASTERISK : '*';
PLUS : '+';
MINUS : '-';
WS : (' '|'\r'|'\t'|'\n') {skip();};
addition
:
(a=atom -> $a)
(
addition_operator b=atom
->
^(addition_operator
^(T_LEFT $addition)
^(T_RIGHT $b)
)
)+
;
atom
: NUM
| '(' addition ')' -> addition
;
addition_operator
: PLUS hints? -> ^(PLUS hints?)
| MINUS hints? -> ^(MINUS hints?)
;
hints
: '[' hint_keywords += hint_keyword (',' hint_keywords += hint_keyword)* ']'
->
$hint_keywords
;
hint_keyword
: 'FAST'
| 'SLOW'
| 'BIG'
| 'THIN'
;
As far as I can see, the reason is the implementation of RewriteRuleSubtreeStream#nextNode(), which uses adaptor.dupNode(tree), whereas I want adaptor.dupTree(tree).
Given the input
2 + [BIG] 3 - [FAST, THIN] 4
the result is:
+---------+
| - |
+---------+
| \
| \
T_LEFT T_RIGHT
| |
+---------+
| + | 4
+---------+
| \
T_LEFT T_RIGHT
| |
2 3
and should be
+---------+
| - |
+---------+
/ / | \
/ / | \
FAST THIN T_LEFT T_RIGHT
| |
+---------+
| + | 4
+---------+
/ | \
/ T_LEFT T_RIGHT
BIG | |
2 3
Try this:
grammar Test;
options {
output=AST;
}
tokens {
T_MARKER;
T_LEFT;
T_RIGHT;
}
calc
: addition EOF -> addition
;
addition
: (a=atom -> $a) ( Add markers b=atom -> ^(Add markers ^(T_LEFT $addition) ^(T_RIGHT $b))
| Sub markers b=atom -> ^(Sub markers ^(T_LEFT $addition) ^(T_RIGHT $b))
)*
;
markers
: ('[' marker (',' marker)* ']')? -> ^(T_MARKER marker*)
;
marker
: Fast
| Thin
| Big
;
atom
: Num
| '(' addition ')' -> addition
;
Fast : 'FAST';
Thin : 'THIN';
Big : 'BIG';
Num : '0'..'9' ('0'..'9')*;
Add : '+';
Sub : '-';
Space : (' ' | '\t' | '\r' | '\n') {skip();};
which parses the input 2 + [BIG] 3 - [FAST, THIN] 4 + 5 into the following AST:
The trick was to use $addition in the rewrite rule to reference the entire rule itself.
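What `$addition` does inside the loop can be pictured as a left fold: on each iteration, the tree built so far becomes the T_LEFT child of the new operator node. The Python sketch below models that with nested tuples; it is only an illustration of the resulting shape, not ANTLR code, and `build_addition_tree` and the tuple layout are assumptions made for the example.

```python
def build_addition_tree(first_atom, steps):
    """Fold (operator, markers, atom) steps into a left-nested tree.

    Each node is a tuple: (operator, T_MARKER subtree, T_LEFT subtree,
    T_RIGHT subtree), mirroring the rewrite rule in the grammar above.
    """
    tree = first_atom
    for operator, markers, atom in steps:
        tree = (
            operator,
            ("T_MARKER", *markers),
            ("T_LEFT", tree),  # plays the role of $addition: everything so far
            ("T_RIGHT", atom),
        )
    return tree


tree = build_addition_tree(
    "2",
    [("+", ["BIG"], "3"), ("-", ["FAST", "THIN"], "4")],
)
```

For the input 2 + [BIG] 3 - [FAST, THIN] 4, the root is the `-` node carrying the FAST and THIN markers, and its T_LEFT child is the complete `+` subtree including BIG.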