Simple Lex/Yacc Calculator not printing output

I'm trying to understand how compilers and programming languages are made. To do so, I decided to write a simple calculator that does just addition and subtraction. Below are the Lex and Yacc files I wrote.
calc.yacc file:
%{
#include <stdio.h>
#include <stdlib.h>
extern int yylex();
void yyerror(char *);
%}
%union { int number; }
%start line
%token <number> NUM
%type <number> expression
%%
line: expression { printf("%d\n", $1); };
expression: expression '+' NUM { $$ = $1 + $3; };
expression: expression '-' NUM { $$ = $1 - $3; };
expression: NUM { $$ = $1; };
%%
void yyerror(char *s) {
fprintf(stderr, "%s", s);
exit(1);
}
int main() {
yyparse();
return 0;
}
calc.lex file:
%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+ {
yylval.number = atoi(yytext);
return NUM;
}
[-+] { return yytext[0]; }
[ \t\f\v\n] { ; }
%%
int yywrap() {
return 1;
}
It compiles nicely but when I run it and type something like 2 + 4 then it gets stuck and doesn't print the answer. Can somebody explain why? My guess is that my grammar is not correct (but I don't know how).

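I came to the same idea as rici and changed your samples accordingly. The scanner discards newlines, so the parser cannot tell that the expression is complete until it sees end of input; the changes below turn the newline into a token that ends each line: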
file calc.l:
%{
#include <stdio.h>
#include <stdlib.h>
#include "calc.y.h"
%}
%%
[0-9]+ {
yylval.number = atoi(yytext);
return NUM;
}
[-+] { return yytext[0]; }
"\n" { return EOL; }
[ \t\f\v\n] { ; }
%%
int yywrap() {
return 1;
}
file calc.y:
%{
#include <stdio.h>
#include <stdlib.h>
extern int yylex();
void yyerror(char *);
%}
%union { int number; }
%start input
%token EOL
%token <number> NUM
%type <number> expression
%%
input: line input | line
line: expression EOL { printf("%d\n", $1); };
expression: expression '+' NUM { $$ = $1 + $3; };
expression: expression '-' NUM { $$ = $1 - $3; };
expression: NUM { $$ = $1; };
%%
void yyerror(char *s) {
fprintf(stderr, "%s", s);
exit(1);
}
int main() {
yyparse();
return 0;
}
Compiled & tested in cygwin on Windows 10 (64 bit):
$ flex -o calc.l.c calc.l
$ bison -o calc.y.c -d calc.y
$ gcc -o calc calc.l.c calc.y.c
$ ./calc
2 + 4
6
2 - 4
-2
234 + 432
666
Notes:
Minor issue: Because of the output file names used in the build commands, I had to change the #include of the generated token header from y.tab.h to calc.y.h. (A matter of taste.)
I introduced the EOL token in the lex source as well as in the line rule of the parser.
While testing, I noticed that the second input line always ended in a syntax error. It took me a while to realize that the grammar was now limited to accepting exactly one line. Thus, I added the recursive input rule to the parser source.
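For readers who want to see the effect of the EOL token outside of yacc, here is a minimal sketch in plain C. It is not the generated parser, and the name eval_line is hypothetical. It evaluates one line left to right, like the left-recursive expression rules, and only reports a result once it reaches '\n', which plays the role of EOL here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Skips blanks and tabs, the characters the scanner silently ignores. */
static const char *skip_blanks(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/* Hypothetical helper (not part of lex or yacc): evaluates one line such as
 * "2 + 4\n". Returns true and stores the value only when the terminating
 * '\n' is reached; without it the expression is never considered complete. */
bool eval_line(const char *s, int *result)
{
    char *end;
    long total;

    assert(s != NULL && result != NULL);

    s = skip_blanks(s);
    if (!isdigit((unsigned char)*s))
        return false;
    total = strtol(s, &end, 10);            /* expression: NUM */
    s = end;

    for (;;) {
        s = skip_blanks(s);
        if (*s == '\n') {                   /* line: expression EOL */
            *result = (int)total;
            return true;
        }
        if (*s != '+' && *s != '-')         /* also covers '\0': no EOL seen */
            return false;
        char op = *s++;
        s = skip_blanks(s);
        if (!isdigit((unsigned char)*s))
            return false;
        long n = strtol(s, &end, 10);       /* expression: expression op NUM */
        s = end;
        total = (op == '+') ? total + n : total - n;
    }
}
```

Calling eval_line("2 + 4\n", &r) stores 6 in r, while the same text without the trailing newline is rejected, which mirrors the original program waiting for more input.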

Related

Accept both integers and floats in a bison grammar

This question follows up on this post: https://stackoverflow.com/questions/42848197/bison-flex-cannot-print-out-result?noredirect=1#comment72805876_42848197
This time I am trying to make my calculator program accept both integers and floating-point numbers.
Thank you.
Here is my code.
Flex:
%{
#include <stdio.h>
#include "f1.tab.h"
%}
integer [1-9][0-9]*|0
float [0-9]+\.[0-9]+
%%
{integer} { yylval.ival = atoi(yytext); return INT; }
{float} { yylval.fval = atof(yytext); return FLOAT; }
. { return yytext[0]; }
%%
Bison :
%{
#include <stdio.h>
%}
%union {
int ival;
float fval;
}
%token <ival> INT
%token <fval> FLOAT
%type <fval> exp
%type <fval> fac
%type <fval> f
%%
input: line
| input line
;
line: exp ';' { printf("%d\n", $1); };
exp: fac { $$ = $1; }
| exp '+' fac { $$ = $1 + $3; }
| exp '-' fac { $$ = $1 - $3; }
;
fac: f
| fac '*' f { $$ = $1 * $3; }
| fac '/' f { $$ = $1 / $3; }
;
f: INT | FLOAT;
%%
main(int argc, char **argv) {
yyparse();
}
yyerror(char *s) {
fprintf(stderr, "error: %s\n", s);
}
Bison tells you exactly what the problem is:
parser.y:32.4-6: warning: type clash on default action: <fval> != <ival> [-Wother]
f: INT | FLOAT;
^^^
The default action for the rule f: INT copies an ival to an fval without any sort of conversion (basically, copying via the union). To fix it, you need to insert an explicit conversion:
f: INT { $$ = (float)$1; }
| FLOAT
;
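To illustrate what "copying via union" means, here is a small C sketch, assuming int and float have the same size and float uses the usual IEEE format. The union mirrors the %union from the question; the two function names are hypothetical.

```c
#include <assert.h>

/* Mirrors the %union in the question. */
union semantic_value {
    int ival;
    float fval;
};

/* What the default action effectively does for f: INT. The int is stored
 * in ival and then read back through fval, so the bytes are reinterpreted
 * rather than converted. */
float int_token_without_conversion(int n)
{
    union semantic_value v;

    assert(sizeof(int) == sizeof(float));   /* assumption of this sketch */
    v.ival = n;
    return v.fval;
}

/* What an explicit action with a cast does: a real int-to-float conversion. */
float int_token_with_conversion(int n)
{
    union semantic_value in, out;

    in.ival = n;
    out.fval = (float)in.ival;
    return out.fval;
}
```

With the conversion, 3 becomes 3.0; without it, the result is whatever float happens to share the bit pattern of the integer 3. Separately, the line rule in the question prints $1 with %d although exp is declared as <fval>; %f would match the declared type.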

lex and yacc to parse trigonometric expression

I have the following code for lex and yacc. I am getting extra values in the printed output. Can anyone tell me what is wrong with the code?
Lex code:
%{
#include <stdio.h>
#include "y.tab.h"
%}
%%
[ \t] ;
[+-] { yylval=yytext; return Sym;}
(s|c|t)..x { yylval=yytext; return Str;}
[a-zA-Z]+ { printf("Invalid");}
%%
int yywrap()
{
return 1;
}
yacc code:
%{
#include<stdio.h>
%}
%start exps
%token Sym Str
%%
exps: exps exp
| exp
;
exp : Str Sym Str {printf("%s",$1); printf("%s",$2); printf("%s",$3);}
;
%%
int main (void)
{
while(1){
return yyparse();
}
}
yyerror(char *err) {
fprintf(stderr, "%s\n",err);
}
Input:
sinx+cosx
output:
sinx+cosx+cosxcosx
look at the output of the code!!!
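yytext is a pointer into flex's internal scanning buffer, so its contents will be modified when the next token is read. If you want to return it to the parser, you need to make a copy (strdup is declared in <string.h>, and this assumes YYSTYPE is char *, which the posted grammar does not declare):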
[+-] { yylval=strdup(yytext); return Sym;}
(s|c|t)..x { yylval=strdup(yytext); return Str;}
Where symbols are a single character, it might make more sense to return that character directly in the scanner:
[-+] { return *yytext; }
in which case, your yacc rules should use the character directly in single quotes:
exp : Str '+' Str {printf("%s + %s",$1, $3); free($1); free($3); }
| Str '-' Str {printf("%s - %s",$1, $3); free($1); free($3); }
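The aliasing problem can be reproduced without flex. The following is a minimal sketch in which scan_buffer stands in for flex's internal buffer; all names are hypothetical and the copy helper is written by hand instead of calling strdup, so that it only relies on standard C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stands in for the scanner's internal buffer that yytext points into. */
static char scan_buffer[64];

/* Like yylval = yytext: hands out a pointer into the shared buffer, so
 * every token returned earlier changes when the next one is scanned. */
const char *next_token_alias(const char *text)
{
    assert(strlen(text) < sizeof scan_buffer);
    strcpy(scan_buffer, text);
    return scan_buffer;
}

/* Like yylval = strdup(yytext): hands out a private copy that the caller
 * must free. Returns NULL if the allocation fails. */
char *next_token_copy(const char *text)
{
    size_t len;
    char *copy;

    assert(strlen(text) < sizeof scan_buffer);
    strcpy(scan_buffer, text);
    len = strlen(scan_buffer) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, scan_buffer, len);
    return copy;
}
```

After scanning "sinx" and then "cosx" through next_token_alias, both returned pointers read "cosx", which is the kind of repetition seen in the question's output. With next_token_copy, each token keeps its own text until it is freed.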

Print tokens properly using Lex and Yacc

I'm having difficulties printing a sequence of tokens that behaves recursively. To better explain, I will show the sections of the corresponding codes: First, the code on Lex:
%{
#include <stdio.h>
#include "y.tab.h"
installID(){
}
%}
abreparentese "("
fechaparentese ")"
pontoevirgula ";"
virgula ","
id {letra}(({letra}|{digito})|({letra}|{digito}|{underline}))*
digito [0-9]
letra [a-z|A-Z]
porreal "%real"
portexto "%texto"
porinteiro "%inteiro"
leia "leia"
%%
{abreparentese} { return ABREPARENTESE; }
{fechaparentese} { return FECHAPARENTESE; }
{pontoevirgula} { return PONTOEVIRGULA; }
{virgula} { return VIRGULA; }
{id} { installID();
return ID; }
{porinteiro} { return PORINTEIRO; }
{porreal} { return PORREAL; }
{portexto} { return PORTEXTO; }
{leia} { return LEIA;}
%%
int yywrap() {
return 1;
}
Now, the code on Yacc:
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#define YYSTYPE char*
int yylex(void);
void yyerror(char *);
extern FILE *yyin, *yyout;
extern char* yytext;
%}
%token ABREPARENTESE FECHAPARENTESE PONTOEVIRGULA VIRGULA ID PORREAL PORTEXTO PORINTEIRO LEIA
%%
programs : programs program
| program
| ABREPARENTESE {fprintf(yyout,"%s",yytext);}
| FECHAPARENTESE {fprintf(yyout,"%s",yytext);}
;
program:
leia
;
leia:
LEIA ABREPARENTESE entradas ids FECHAPARENTESE PONTOEVIRGULA
{
fprintf(yyout,"scanf(\"%s\",%s);",$3,$4);
}
;
entradas:
tipo_entrada VIRGULA entradas {fprintf(yyout,"%s,",$1);}
| tipo_entrada VIRGULA {fprintf(yyout,"%s", $1); }
;
tipo_entrada:
| PORREAL {$$ = "%f";}
| PORTEXTO {$$ = "%c";}
| PORINTEIRO {$$ = "%d";}
;
ids:
id VIRGULA ids {fprintf(yyout,"&%s,",$1);}
| id {fprintf(yyout,"&%s",$1);}
;
id:
ID {$$ = strdup(yytext);}
;
%%
void yyerror(char *s) {
fprintf(stderr, "%s\n", s);
}
int main(int argc, char *argv[]){
yyout = fopen(argv[2],"w");
yyin = fopen(argv[1], "r");
yyparse();
return 0;
}
I believe I have copied all the parts of the code relevant to my problem (I may have forgotten to copy and paste some things). My problem is with this part of the code:
leia: LEIA ABREPARENTESE entradas ids FECHAPARENTESE PONTOEVIRGULA
{
fprintf(yyout,"scanf(\"%s\",%s);",$3,$4);
}
;
In the input file, I have the following line:
leia (%real, %inteiro, id1, id2);
The expectation was this on the output file:
scanf("%f,%d",&id1,&id2);
But actually this is the result in the output file:
%d%f,&id2&id1,scanf("%f",id1);
Can you help me solve this problem? How do I print the tokens in the right place?
Normally, with bottom-up parsing, we use left-recursive productions, which means the productions are reduced from left to right.
With right recursion, the productions are stacked up until the end of the list and then popped off the stack, so the reductions are executed right to left.
So, for example, it would be more usual to write (using the VIRGULA token from the question in place of a literal comma):
ids: id
| ids VIRGULA id
;
and then the semantic actions will execute in the expected order.
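The difference in reduction order can be mimicked with two recursive C functions. This is only a sketch of the order in which the actions fire, not of how the generated parser works internally; the function names are hypothetical, and the left-recursive variant assumes the action was adapted to print the comma before each later id.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends s to out without overflowing a buffer of cap bytes. */
static void emit(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out);

    assert(used < cap);
    strncat(out, s, cap - used - 1);
}

/* Right recursion, as in the question: the action of the outer rule runs
 * only after the inner ids has been reduced, so the last id prints first. */
void reduce_right_recursive(const char *ids[], size_t n, char *out, size_t cap)
{
    if (n == 0)
        return;
    if (n > 1)
        reduce_right_recursive(ids + 1, n - 1, out, cap);
    emit(out, cap, "&");
    emit(out, cap, ids[0]);
    if (n > 1)
        emit(out, cap, ",");
}

/* Left recursion: the inner ids covers the earlier items, so the actions
 * run in input order. */
void reduce_left_recursive(const char *ids[], size_t n, char *out, size_t cap)
{
    if (n == 0)
        return;
    if (n > 1) {
        reduce_left_recursive(ids, n - 1, out, cap);
        emit(out, cap, ",");
    }
    emit(out, cap, "&");
    emit(out, cap, ids[n - 1]);
}
```

For the ids id1 and id2, the right-recursive version produces &id2&id1, (as in the question's output file) and the left-recursive version produces &id1,&id2. Note that the action attached to the leia rule still runs only after entradas and ids have been reduced, so the scanf(" prefix would still be printed last; a common remedy is to build the strings in $$ and print everything in one place.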

Syntax error in Bison after one token is processed

I am trying to come up to speed on Flex and Bison. I can parse one token with a very simple "language" but it fails on the second, even though the token is legitimate.
test.l:
%{
#include <stdio.h>
#include "test.hpp"
%}
%%
[0-9]+ {printf("Number entered\n"); return INTEGER_NUMBER;}
[a-zA-Z]+ {printf("plain text entered: '%s'\n",yytext); return PLAIN_TEXT;}
[ \t] ;
. ;
%%
test.y
%{
#include <stdio.h>
extern "C" {
int yyparse(void);
int yylex(void);
int yywrap() { return 1; }
extern int yylineno;
extern char* yytext;
extern int yylval;
}
/* #define YYSTYPE char * */
void yyerror(const char *message)
{
fprintf(stderr, "%d: error: '%s' at '%s', yylval=%u\n", yylineno, message, yytext, yylval);
}
main()
{
yyparse();
}
%}
%token PLAIN_TEXT INTEGER_NUMBER
%%
test : text | number;
text : PLAIN_TEXT
{
/*printf("plain text\n");*/
};
number : INTEGER_NUMBER
{
/*printf("number\n");*/
};
%%
Results:
$ ./test
cat
plain text entered: 'cat'
dog
plain text entered: 'dog'
1: error: 'syntax error' at 'dog', yylval=0
$ ./test
34
Number entered
34
Number entered
1: error: 'syntax error' at '34', yylval=0
Why am I getting this syntax error?
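Your test.y has no rule that allows several tests to follow one another, so the parser expects end of input after the first test and reports a syntax error when a second token arrives.
So, how about adding a rule like the following? Because it comes first, tests also becomes the start symbol.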
%%
tests : test | tests test; /* added */
test : text | number;
...

correcting some simple logic errors in lex and yacc

Please, I need help solving two simple logic errors that I am facing in my example.
Here are the details:
The Input File: (input.txt)
FirstName:James
LastName:Smith
normal text
The output File: (output.txt) - [with two logic errors]
The Name is: James
The Name is: LastName:Smith
The Name is: normal text
What I am expecting as output (instead of the above lines) - [without logical errors]
The Name is: James
The Name is: Smith
normal text
In other words, I don't want the "LastName:" label to be sent to the output, and I want normal text to be matched as well when it appears after the "FirstName:" or "LastName:" entries.
Here is my lex File (example.l):
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "y.tab.h"
/* prototypes */
void yyerror(const char*);
/* Variables: */
char *tempString;
%}
%START sBody
%%
"FirstName:" { BEGIN sBody; }
"LastName:" { BEGIN sBody; }
.? { return sNormalText; }
\n /* Ignore end of line */;
[ \t]+ /* Ignore whitespace */;
<sBody>.+ {
tempString = (char *)calloc(strlen(yytext)+1, sizeof(char));
strcpy(tempString, yytext);
yylval.sValue = tempString;
return sText;
}
%%
int main(int argc, char *argv[])
{
if ( argc < 3 )
{
printf("Please you need two args: inputFileName and outputFileName");
}
else
{
yyin = fopen(argv[1], "r");
yyout = fopen(argv[2], "w");
yyparse();
fclose(yyin);
fclose(yyout);
}
return 0;
}
Here is my yacc file: (example.y):
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "y.tab.h"
void yyerror(const char*);
int yywrap();
extern FILE *yyout;
%}
%union
{
int iValue;
char* sValue;
};
%token <sValue> sText
%token <sValue> sNormalText
%%
StartName: /* for empty */
| sName StartName
;
sName:
sText
{
fprintf(yyout, "The Name is: %s\n", $1);
}
|
sNormalText
{
fprintf(yyout, "%s\n", $1);
}
;
%%
void yyerror(const char *str)
{
fprintf(stderr,"error: %s\n",str);
}
int yywrap()
{
return 1;
}
If you can help me correct these simple logic errors, I will be grateful.
Thanks in advance for your help and for reading my post.
Part of the trouble is that you move into state 'sBody' but you never move back to the initial state 0.
Another problem - not yet a major one - is that you use a right-recursive grammar rule instead of the (natural for Yacc) left-recursive rule:
StartName: /* empty */
| sName StartName
;
vs
StartName: /* empty */
| StartName sName
;
Adding BEGIN 0; to the <sBody> Lex rule improves things a lot; the remaining trouble is that you get one more line 'Smith' in the output file for each single letter in the normal text. You need to review how the value is returned to your grammar.
By adding yylval.sValue = yytext; before the return in the rule that returns sNormalText, I got the 'expected' output.
example.l
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
/* prototypes */
void yyerror(const char*);
/* Variables: */
char *tempString;
%}
%START sBody
%%
"FirstName:" { puts("FN"); BEGIN sBody; }
"LastName:" { puts("LN"); BEGIN sBody; }
.? { printf("NT: %s\n", yytext); yylval.sValue = yytext; return sNormalText; }
\n /* Ignore end of line */;
[ \t]+ /* Ignore whitespace */;
<sBody>.+ {
tempString = (char *)calloc(strlen(yytext)+1, sizeof(char));
strcpy(tempString, yytext);
yylval.sValue = tempString;
puts("SB");
BEGIN 0;
return sText;
}
%%
int main(int argc, char *argv[])
{
if ( argc < 3 )
{
printf("Please you need two args: inputFileName and outputFileName");
}
else
{
yyin = fopen(argv[1], "r");
if (yyin == 0)
{
fprintf(stderr, "failed to open %s for reading\n", argv[1]);
exit(1);
}
yyout = fopen(argv[2], "w");
if (yyout == 0)
{
fprintf(stderr, "failed to open %s for writing\n", argv[2]);
exit(1);
}
yyparse();
fclose(yyin);
fclose(yyout);
}
return 0;
}
example.y
%{
#include <stdio.h>
#include "y.tab.h"
void yyerror(const char*);
int yywrap();
extern FILE *yyout;
%}
%union
{
char* sValue;
};
%token <sValue> sText
%token <sValue> sNormalText
%%
StartName: /* for empty */
| StartName sName
;
sName:
sText
{
fprintf(yyout, "The Name is: %s\n", $1);
}
|
sNormalText
{
fprintf(yyout, "The Text is: %s\n", $1);
}
;
%%
void yyerror(const char *str)
{
fprintf(stderr,"error: %s\n",str);
}
int yywrap()
{
return 1;
}
output.txt
The Name is: James
The Name is: Smith
The Text is: n
The Text is: o
The Text is: r
The Text is: m
The Text is: a
The Text is: l
The Text is:
The Text is: t
The Text is: e
The Text is: x
The Text is: t
It might make more sense to put yywrap() in with the lexical analyzer rather than with the grammar. I've left the terse debugging prints in the code - they helped me see what was going wrong.
FN
SB
LN
SB
NT: n
NT: o
NT: r
NT: m
NT: a
NT: l
NT:
NT: t
NT: e
NT: x
NT: t
You'll need to play with the '.?' rule to get normal text returned in its entirety. You may also have to move it around the file - start states are slightly peculiar critters. When I changed the rule to '.+', Flex gave me the warning:
example.l:25: warning, rule cannot be matched
example.l:27: warning, rule cannot be matched
These lines referred to the blank/tab and sBody rules. Moving the unqualified '.+' after the sBody rule removed the warnings, but didn't seem to do what was wanted. Have fun...
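As a point of comparison, here is a hand-written C sketch of the behaviour the question asks for. It is not a flex solution; it only shows the state handling: enter a body state after a known key, take the rest of the line, and return to the initial state (the equivalent of BEGIN 0). The names translate, key_prefix_length and the enum are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum start_state { INITIAL_STATE, BODY_STATE };

/* Returns the length of a recognised "Key:" prefix at s, or 0 if none. */
static size_t key_prefix_length(const char *s)
{
    static const char *const keys[] = { "FirstName:", "LastName:" };

    for (size_t i = 0; i < sizeof keys / sizeof keys[0]; i++) {
        size_t len = strlen(keys[i]);
        if (strncmp(s, keys[i], len) == 0)
            return len;
    }
    return 0;
}

/* Writes the translated text into out (cap bytes); stops early if out is full. */
void translate(const char *input, char *out, size_t cap)
{
    enum start_state state = INITIAL_STATE;
    size_t used = 0;

    assert(input != NULL && out != NULL && cap > 0);
    out[0] = '\0';

    while (*input != '\0') {
        if (*input == '\n') {               /* ignore end of line */
            input++;
            continue;
        }
        if (state == INITIAL_STATE) {
            size_t key = key_prefix_length(input);
            if (key > 0) {                  /* like BEGIN sBody */
                input += key;
                state = BODY_STATE;
                continue;
            }
        }
        size_t len = strcspn(input, "\n");  /* rest of the line, like .+ */
        int n = snprintf(out + used, cap - used, "%s%.*s\n",
                         state == BODY_STATE ? "The Name is: " : "",
                         (int)len, input);
        if (n < 0 || (size_t)n >= cap - used)
            return;                         /* output buffer exhausted */
        used += (size_t)n;
        input += len;
        state = INITIAL_STATE;              /* like BEGIN 0 */
    }
}
```

Given the three lines of input.txt from the question, this writes the two "The Name is:" lines followed by normal text on one line, because normal text is consumed a whole line at a time instead of one character per token.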