how to check values while using recursion in yacc? - yacc

I am using recursion in yacc and I want to check all the values that are parsed by the recursion rule.My yacc rule is
%{
#include<stdio.h>
.
.
.
%}
%%
abc:ABC expr
;
expr:VALUE','expr
|VALUE
;
%%
if i have a statement like
ABC 1,2,3,4
it gets parsed.
i want to check that all the numbers parsed via expr
have sum equal to some value say 10
how can i check this?

Edit:
You can count the values parsed and keep their running total with code that goes something like this:
%{
#include<stdio.h>
int count;
.
.
.
%}
%%
abc: { count = 0; } ABC expr { printf("count: %d; sum: %d\n", count, $2); }
;
expr: VALUE ',' expr { $$ = $1 + $3; }
| VALUE { $$ = $1; count++; }
;
%%

Related

How to fix implicit declaration of function 'yyerror' problem

Hi I am making a program that does simple arithmetic operations using Lex and yacc, but I am having a problem with a specific error.
ex1.y
%{
#include <stdio.h>
int sym[26];
%}
%token INTEGER VARIABLE
%left '+' '-'
%left '*' '/' '%'
%%
program:
program statement '\n'
|
;
statement:
expr {printf("%d\n", $1);}
| VARIABLE '=' expr {sym[$1] = $3;}
;
expr:
INTEGER
| VARIABLE { $$ = sym[$1];}
| expr '+' expr { $$ = $1 + $3;}
| expr '-' expr { $$ = $1 - $3;}
| expr '*' expr { $$ = $1 * $3;}
| expr '/' expr { $$ = $1 / $3;}
| '(' expr ')' { $$ = $2;}
;
%%
main() { return yyparse();}
int yyerror(char *s){
fprintf(stderr,"%s\n",s);
return 0;
}
ex1.l
%{
#include <stdlib.h>
#include "y.tab.h"
%}
%%
/* variables */
[a-z] {
yylval = *yytext -'a';
return VARIABLE;
}
/* integers */
[0-9]+ {
yylval = atoi(yytext);
return INTEGER;
}
/* operators */
[-+()=/*\n] { return *yytext;}
/* skip whitespace */
[ \t] ;
/* anything else is an error */
. yyerror("invalid character");
%%
int yywrap (void){
return 1;
}
when I execute bellow instruction
$bison –d -y ex1.y
$lex ex1.l
$gcc lex.yy.c y.tab.c –o ex1
The following error occurs:
ex1.l: In function ‘yylex’:
ex1.l:28:1: warning: implicit declaration of function ‘yyerror’; did you mean ‘perror’? [-Wimplicit-function-declaration]
28 |
| ^
| perror
y.tab.c: In function ‘yyparse’:
y.tab.c:1227:16: warning: implicit declaration of function ‘yylex’ [-Wimplicit-function-declaration]
1227 | yychar = yylex ();
| ^~~~~
y.tab.c:1402:7: warning: implicit declaration of function ‘yyerror’; did you mean ‘yyerrok’? [-Wimplicit-function-declaration]
1402 | yyerror (YY_("syntax error"));
| ^~~~~~~
| yyerrok
I don't know what is wrong with my code. I would appreciate it if you could tell me how to fix the above error.
The version of bison you are using requires you to declare prototypes for yylex() and yyerror. These should go right after the #include <stdio.h> at the top of the file:
int yylex(void);
int yyerror(char* s);
I would use int yyerror(const char* s) as the prototype for yyerror, because it is more accurate, but if you do that you'll have to make the same change in the definition.
You use yyerror in your lex file, so you will have to add its declaration in that file as well.
main() hasn't been a valid prototype any time this century. Return types are required in function declarations, including main(). So I guess you are basing your code on a very old template. There are better starting points in the examples in the bison manual.
(And don't expect it to be easy to work with parser generators if you have no experience with C.)

Calculator in lex and yacc

I am trying to create a calculator by using lex and yacc. However I can not understand how can I give operator precedence to this program? I could not find any information about it. Which code do I need to add to my project to calculate correctly?
Yacc file is:
%{
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
int yylex();
void yyerror(const char *s);
%}
%token INTEGER
%left '*' '/'
%left '+' '-'
%%
program:
program line | line
line:
expr ';' { printf("%d\n",$1); } ; | '\n'
expr:
expr '+' term { $$ = $1 + $3; }
| expr '-' term { $$ = $1 - $3; }
| expr '*' term { $$ = $1 * $3; }
| expr '/' term { $$ = $1 / $3; }
| expr '%' term { $$ = $1 % $3; }
| expr '^' term { $$ = $1 ; }
| term { $$ = $1; }
term:
INTEGER { $$ = $1; }
%%
void yyerror(const char *s) { fprintf(stderr,"%s\n",s); return ; }
int main(void) { /*yydebug=1;*/ yyparse(); return 0; }
Lex file is:
%{
#include <stdlib.h>
#include <stdio.h>
void yyerror(char*);
extern int yylval;
#include "calc.tab.h"
#include<time.h>
%}
%%
[ \t]+ ; //skip whitespace
[0-9]+ {yylval = atoi(yytext); return INTEGER;}
[-+*/%^] {return *yytext;}
\n {return *yytext;}
; {return *yytext;}
. {char msg[25]; sprintf(msg,"%s <%s>","invalid character",yytext); yyerror(msg);}
%left '*' '/'
%left '+' '-'
Precedence declarations are specified in the order from lowest precedence to highest. So in the above code you give * and / the lowest precedence level and + and - the highest. That's the opposite order of what you want, so you'll need to switch the order of these two lines. You'll also want to add the operators % and ^, which are currently part of your grammar, but not your precedence annotations.
With those changes, you'll now have specified the precedence you want, but it won't take effect yet. Why not? Because precedence annotations are used to resolve ambiguities, but your grammar isn't actually ambiguous.
The way you've written the grammar, with only the left operand of all operators being expr and the right operand being term, there's only one way to derive an expression like 2+4*2, namely by deriving 2+4 from expr and 2 from term (because deriving 4*2 from term would be impossible since term can only match a single number). So your grammar treats all operators as left-associative and having the same precedence and your precedence annotations aren't considered at all.
In order for the precedence annotations to be considered, you'll have to change your grammar, so that both operands of the operators are expr (e.g. expr '+' expr instead of expr '+' term). Written like that an expression like 2+4*2 could either be derived by deriving 2+4 from expr as the left operand and 2 from expr as the right operand or 2 as the left and 4*2 as the right and this ambiguity will be resolved using your precedence annotations.

how to fix integer out of range error :$3 error in YACC

I am developing a calculator using YACC and I receive this error :
Integer out of rang $3;
I have just now started learning yacc and can't rectify the error I can see the question already but no one has answered
%token NUMBER
%%
expr :expr '+'{$$ = $1 + $3;}
%%
#include<stdio.h>
#include "lex.yy.c"
yylex()
{
int c;
c=getchar();
if(isdigit(c))
{
yylval=c-'0';
return NUMBER;
}
return c;
}
int main()
{
yyparse();
return 1;
}
int yyerror(){
return 1;}
$3 refers to the 3rd term on the right side of the rule. In
expr :expr '+'{$$ = $1 + $3;}
there are only 2 terms on the right side of the production...

Bison Grammar %type and %token

Why is it that I have to use $<nVal>4 explicitly in the below grammar snippet?
I thought the %type <nVal> expr line would remove the need so that I can simply put $4?
Is it not possible to use a different definition for expr so that I can?
%union
{
int nVal;
char *pszVal;
}
%token <nVal> tkNUMBER
%token <pszVal> tkIDENT
%type <nVal> expr
%%
for_statement : tkFOR
tkIDENT { printf( "I:%s\n", $2 ); }
tkEQUALS
expr { printf( "A:%d\n", $<nVal>4 ); } // Why not just $4?
tkTO
expr { printf( "B:%d\n", $<nVal>6 ); } // Why not just $6?
step-statement
list
next-statement;
expr : tkNUMBER { $$ = $1; }
;
Update following rici's answer. This now works a treat:
for_statement : tkFOR
tkIDENT { printf( "I:%s\n", $2 ); }
tkEQUALS
expr { printf( "A:%d\n", $5 /* $<nVal>5 */ ); }
tkTO
expr { printf( "A:%d\n", $8 /* $<nVal>8 */ ); }
step-statement
list
next-statement;
Why is it that I have to use $<nVal>4 explicitly in the below grammar snippet?
Actually, you should use $5 if you want to refer to the expr. $4 is the tkEQUALS, which has no declared type, so any use must be explicitly typed. $3 is the previous midrule action, which has no value since $$ is not assigned in that action.
By the same logic, the second expr is $8; $6 is the second midrule action, which also has no value (and no type).
See the Bison manual:
The mid-rule action itself counts as one of the components of the rule. This makes a difference when there is another action later in the same rule (and usually there is another at the end): you have to count the actions along with the symbols when working out which number n to use in $n.

Bison:syntax error at the end of parsing

Hello this is my bison grammar file for a mini-programming language:
%{
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "projectbison.tab.h"
void yyerror(char const *);
extern FILE *yyin;
extern FILE *yyout;
extern int yylval;
extern int yyparse(void);
extern int n;
int errNum = 0;
int forNum = 0;
%}
%left PLUS MINUS
%left MULT DIV MOD
%nonassoc EQUAL NEQUAL LESS GREATER LEQUAL GEQUAL
%token INTEGER BOOLEAN STRING VOID
%token ID
%token AND
%token BEGINP
%token ENDP
%token EXTERN
%token COMMA
%token EQ
%token RETURN1
%token IF1 ELSE1 WHILE1 FOR1 DO1
%token LOR LAND LNOT
%token TRUE FALSE
%token EQUAL NEQUAL LESS GREATER LEQUAL GEQUAL
%token LB1 RB1
%token LCB1 RCB1
%token SEMIC
%token NEWLINE
%token PLUS MINUS
%token MULT DIV MOD
%token DIGIT STRING1
%start program
%%
/*50*/
program : external-decl program-header defin-field command-field
;
external-decl : external-decl external-prototype
|
;
external-prototype : EXTERN prototype-func NEWLINE
;
program-header : VOID ID LB1 RB1 NEWLINE
;
defin-field : defin-field definition
|
;
definition : variable-defin
| func-defin
| prototype-func
;
variable-defin : data-type var-list SEMIC newline
;
data-type : INTEGER
| BOOLEAN
| STRING
;
var-list : ID extra-ids
;
extra-ids : COMMA var-list
|
;
func-defin : func-header defin-field command-field
;
prototype-func : func-header SEMIC
;
func-header : data-type ID LB1 lists RB1 newline
;
lists: list-typ-param
|
;
list-typ-param : typical-param typical-params
;
typical-params : COMMA list-typ-param
|
;
typical-param : data-type AND ID
;
command-field : BEGINP commands newline ENDP newline
;
commands : commands newline command
|
;
command : simple-command SEMIC
| struct-command
| complex-command
;
complex-command : LCB1 newline command newline RCB1
;
struct-command : if-command
| while-command
| for-command
;
simple-command : assign
| func-call
| return-command
| null-command
;
if-command : IF1 LB1 gen-expr RB1 newline command else-clause
;
else-clause: ELSE1 newline command
;
while-command : WHILE1 LB1 gen-expr RB1 DO1 newline RCB1 command LCB1
;
for-command : FOR1 LB1 conditions RB1 newline RCB1 command LCB1
;
conditions : condition SEMIC condition SEMIC condition SEMIC
;
condition : gen-expr
|
;
assign : ID EQ gen-expr
;
func-call : ID LB1 real-params-list RB1
| ID LB1 RB1
;
real-params-list : real-param real-params
;
real-params : COMMA real-param real-params
|
;
real-param : gen-expr
;
return-command : RETURN1 gen-expr
;
null-command :
;
gen-expr : gen-terms gen-term
;
gen-terms : gen-expr LOR
|
;
gen-term : gen-factors gen-factor
;
gen-factors : gen-term LAND
|
;
gen-factor : LNOT first-gen-factor
| first-gen-factor
;
first-gen-factor : simple-expr comparison
| simple-expr
;
comparison : compare-operator simple-expr
;
compare-operator : EQUAL
| NEQUAL
| LESS
| GREATER
| LEQUAL
| GEQUAL
;
simple-expr : expresion simple-term
;
expresion : simple-expr PLUS
|simple-expr MINUS
|
;
simple-term : mul-expr simple-parag
;
mul-expr: simple-term MULT
| simple-term DIV
| simple-term MOD
|
;
simple-parag : simple-prot-oros
| MINUS simple-prot-oros
;
simple-prot-oros : ID
| constant
| func-call
| LB1 gen-expr RB1
;
constant : DIGIT
| STRING1
| TRUE
| FALSE
;
newline:NEWLINE
|
;
%%
void yyerror(char const *msg)
{
errNum++;
fprintf(stderr, "%s\n", msg);
}
int main(int argc, char **argv)
{
++argv;
--argc;
if ( argc > 0 )
{yyin= fopen( argv[0], "r" ); }
else
{yyin = stdin;
yyout = fopen ( "output", "w" );}
int a = yyparse();
if(a==0)
{printf("Done parsing\n");}
else
{printf("Yparxei lathos sti grammi: %d\n", n);}
printf("Estimated number of errors: %d\n", errNum);
return 0;
}
for a simple input like this :
void main()
integer k;
boolean l;
begin
aek=32;
end
i get the following :
$ ./MyParser.exe file2.txt
void , id ,left bracket , right bracket
integer , id ,semicolon
boolean , id ,semicolon
BEGIN PROGRAM
id ,equals , digit ,semicolon
END PROGRAM
syntax error
Yparxei lathos sti grammi: 8
Estimated number of errors: 1
And whatever change i make to the input file i get a syntax error at the end....Why do i get this and what can i do??thanks a lot in advance!here is the flex file just in case someone needs it :
%{
#include "projectbison.tab.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int n=1;
%}
%option noyywrap
digit [0-9]+
id [a-zA-Z][a-zA-Z0-9]*
%%
"(" {printf("left bracket , "); return LB1;}
")" {printf("right bracket\n"); return RB1;}
"{" {printf("left curly bracket , "); return LCB1;}
"}" {printf("right curly bracket\n"); return RCB1;}
"==" {printf("isotita ,"); return EQUAL;}
"!=" {printf("diafora ,"); return NEQUAL;}
"<" {printf("less_than ,"); return LESS;}
">" {printf("greater_than ,"); return GREATER;}
"<=" {printf("less_eq ,"); return LEQUAL;}
">=" {printf("greater_eq ,"); return GEQUAL;}
"||" {printf("lor\n"); return LOR;}
"&&" {printf("land\n"); return LAND;}
"&" {printf("and ,"); return AND;}
"!" {printf("lnot ,"); return LNOT;}
"+" {printf("plus ,"); return PLUS; }
"-" {printf("minus ,"); return MINUS;}
"*" {printf("multiply ,"); return MULT;}
"/" {printf("division ,"); return DIV;}
"%" {printf("mod ,"); return MOD;}
";" {printf("semicolon \n"); return SEMIC;}
"=" {printf("equals , "); return EQ;}
"," {printf("comma ,"); return COMMA;}
"\n" {n++; return NEWLINE;}
void {printf("void ,"); return VOID;}
return {printf("return ,"); return RETURN1;}
extern {printf("extern\n"); return EXTERN;}
integer {printf("integer ,"); return INTEGER;}
boolean {printf("boolean ,"); return BOOLEAN;}
string {printf("string ,"); return STRING;}
begin {printf("BEGIN PROGRAM\n"); return BEGINP;}
end {printf("END PROGRAM\n"); return ENDP;}
for {printf("for\n"); return FOR1;}
true {printf("true ,"); return TRUE;}
false {printf("false ,"); return FALSE;}
if {printf("if\n"); return IF1; }
else {printf("else\n"); return ELSE1; }
while {printf("while\n"); return WHILE1;}
{id} {printf("id ,"); return ID;}
{digit} {printf("digit ,"); return DIGIT;}
[a-zA-Z0-9]+ {return STRING1;}
` {/*catchcall*/ printf("Mystery character %s\n", yytext); }
<<EOF>> { static int once = 0; return once++ ? 0 : '\n'; }
%%
Your scanner pretty well guarantees that two newline characters will be sent at the end of the input: one from the newline present in the input, and another one as a result of your trapping <<EOF>>. However, your grammar doesn't appear to accept unexpected newlines, so the second newline will trigger a syntax error.
The simplest solution would be to remove the <<EOF>> rule, since text files without a terminating newline are very rare, and it is entirely legitimate to consider them syntax errors. A more general solution would be to allow any number of newline characters to appear where a newline is expected, by defining something like:
newlines: '\n' | newlines '\n';
(Using actual characters for single-character tokens makes your grammar much more readable, and simplifies your scanner. But that's a side issue.)
You might also ask yourself whether you really need to enforce newline terminators, since your grammar seems to use ; as a statement terminator, making the newline redundant (aside from stylistic considerations). Removing newlines from the grammar (and ignoring them, as with other whitespace, in the scanner) will also simplify your code.