I am writing a YACC program that defines a CFG for the vowels in a given string. My attempt is as follows:
%{
#include <stdio.h>
%}
%union{
char c;
}
%token <c> VOW
%%
cha : 'a' { printf("a\n"); }
| 'e' {printf("e\n");}
| 'i' {printf("i\n");}
| 'o' {printf("o\n");}
| 'u' {printf("u\n");}
;
%%
int main(void) {return yyparse();}
int yylex(void) {return getchar();}
void yyerror(char *s) {fprintf(stderr, "%s\n",s);}
Is this a correct definition of a CFG for vowels?
You don't need a context-free grammar for your problem, only a regular expression. You're using the wrong tool for the job. It is three lines in flex(1):
%%
[aeiou] printf("%s\n", yytext);
.|\n ;
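For comparison, the same filtering can be sketched in plain C without any generator. This is only an illustration of what the two flex rules do; the helper name extract_vowels and its buffer-based interface are assumptions, not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy every lowercase vowel from `in` to `out`,
   mirroring the [aeiou] rule. All other characters are dropped, like the
   ".|\n ;" rule. `cap` is the size of `out`, including the terminator. */
size_t extract_vowels(const char *in, char *out, size_t cap) {
    size_t n = 0;
    assert(in != NULL && out != NULL && cap > 0);
    for (; *in != '\0' && n + 1 < cap; in++) {
        if (strchr("aeiou", *in) != NULL) {
            out[n++] = *in;   /* keep the vowel */
        }
    }
    out[n] = '\0';
    return n;                 /* number of vowels copied */
}
```

Like the flex version, this treats only lowercase a, e, i, o, u as vowels.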
I am trying to create a calculator using lex and yacc. However, I cannot understand how to give operator precedence to this program, and I could not find any information about it. What code do I need to add to my project so that it calculates correctly?
Yacc file is:
%{
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
int yylex();
void yyerror(const char *s);
%}
%token INTEGER
%left '*' '/'
%left '+' '-'
%%
program:
program line | line
line:
expr ';' { printf("%d\n",$1); } ; | '\n'
expr:
expr '+' term { $$ = $1 + $3; }
| expr '-' term { $$ = $1 - $3; }
| expr '*' term { $$ = $1 * $3; }
| expr '/' term { $$ = $1 / $3; }
| expr '%' term { $$ = $1 % $3; }
| expr '^' term { $$ = $1 ; }
| term { $$ = $1; }
term:
INTEGER { $$ = $1; }
%%
void yyerror(const char *s) { fprintf(stderr,"%s\n",s); return ; }
int main(void) { /*yydebug=1;*/ yyparse(); return 0; }
Lex file is:
%{
#include <stdlib.h>
#include <stdio.h>
void yyerror(char*);
extern int yylval;
#include "calc.tab.h"
#include<time.h>
%}
%%
[ \t]+ ; //skip whitespace
[0-9]+ {yylval = atoi(yytext); return INTEGER;}
[-+*/%^] {return *yytext;}
\n {return *yytext;}
; {return *yytext;}
. {char msg[25]; sprintf(msg,"%s <%s>","invalid character",yytext); yyerror(msg);}
%left '*' '/'
%left '+' '-'
Precedence declarations are specified in the order from lowest precedence to highest. So in the above code you give * and / the lowest precedence level and + and - the highest. That's the opposite order of what you want, so you'll need to switch the order of these two lines. You'll also want to add the operators % and ^, which are currently part of your grammar, but not your precedence annotations.
With those changes, you'll now have specified the precedence you want, but it won't take effect yet. Why not? Because precedence annotations are used to resolve ambiguities, but your grammar isn't actually ambiguous.
The way you've written the grammar, with only the left operand of all operators being expr and the right operand being term, there's only one way to derive an expression like 2+4*2, namely by deriving 2+4 from expr and 2 from term (because deriving 4*2 from term would be impossible since term can only match a single number). So your grammar treats all operators as left-associative and having the same precedence and your precedence annotations aren't considered at all.
In order for the precedence annotations to be considered, you'll have to change your grammar so that both operands of the operators are expr (e.g. expr '+' expr instead of expr '+' term). Written like that, an expression like 2+4*2 could be derived in two ways: either 2+4 is the left operand and 2 the right operand, or 2 is the left operand and 4*2 the right operand. This ambiguity is what your precedence annotations will resolve.
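The two readings of 2+4*2 can be compared with a small hand-written evaluator in C. This is a sketch for illustration only, not generated parser code: eval_flat, eval_prec and read_int are hypothetical names, only the four basic operators are handled, input is assumed to contain no whitespace, and division by zero is silently skipped.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: read a non-negative integer and advance the cursor. */
static int read_int(const char **p) {
    int v = 0;
    while (isdigit((unsigned char)**p)) {
        v = v * 10 + (**p - '0');
        (*p)++;
    }
    return v;
}

/* Mirrors "expr: expr OP term": every operator has the same precedence
   and groups to the left, so the input is folded strictly left to right. */
int eval_flat(const char *s) {
    int acc;
    assert(s != NULL);
    acc = read_int(&s);
    while (*s == '+' || *s == '-' || *s == '*' || *s == '/') {
        char op = *s++;
        int rhs = read_int(&s);
        if (op == '+') acc += rhs;
        else if (op == '-') acc -= rhs;
        else if (op == '*') acc *= rhs;
        else if (rhs != 0) acc /= rhs;   /* division by zero is skipped */
    }
    return acc;
}

/* Evaluates a run of '*' and '/' first, so they bind tighter. */
static int eval_product(const char **p) {
    int acc = read_int(p);
    while (**p == '*' || **p == '/') {
        char op = *(*p)++;
        int rhs = read_int(p);
        if (op == '*') acc *= rhs;
        else if (rhs != 0) acc /= rhs;   /* division by zero is skipped */
    }
    return acc;
}

/* Mirrors the result of "expr: expr OP expr" once '*' and '/' are given
   higher precedence than '+' and '-'. */
int eval_prec(const char *s) {
    int acc;
    assert(s != NULL);
    acc = eval_product(&s);
    while (*s == '+' || *s == '-') {
        char op = *s++;
        int rhs = eval_product(&s);
        acc = (op == '+') ? acc + rhs : acc - rhs;
    }
    return acc;
}
```

For the input 2+4*2, eval_flat computes (2+4)*2 = 12, which is what the original grammar does, while eval_prec computes 2+(4*2) = 10.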
I cannot figure out why I am getting these results.
++
+add
+syntax error 2
++
+add
+syntax error 4
The ++ is my input; lex echoes each character and yacc prints add whenever it gets a +. It gives me this error on every other +. It doesn't matter how I give the input: I get the same results if I hit enter after every +.
lex
%{
#include "y.tab.h"
int chars = 0;
%}
%%
"+" {ECHO; chars++; return ADD;}
. {ECHO; chars++;}
\n {ECHO;}
%%
yacc
%{
#include <stdio.h>
extern int chars;
void yyerror (const char *str) {
printf ("%s %d\n", str, chars);
}
%}
%token ADD
%%
symbol : ADD {printf ("add\n");}
;
%%
int main () {
while (1) {
yyparse ();
}
}
Your grammar only accepts a 'sentence' that consists of a single token, +. When you type a second +, you induce a syntax error; your grammar doesn't allow ADD followed by ADD. Your next token after the + must be EOF for the grammar to accept your input. (Because of the . and \n rules, you can type all sorts of other stuff at the program, but there can only be one + in the input.)
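The acceptance condition can be modelled in a few lines of C. This sketch only models whether a complete input would be accepted, not the retry loop in main; the function names are hypothetical, and the list rule in the second comment is an assumed fix rather than something from the original post.

```c
#include <assert.h>
#include <stddef.h>

/* Count the ADD tokens the scanner would return: only '+' yields a token,
   because the "." and "\n" rules consume everything else without returning. */
static size_t count_add_tokens(const char *input) {
    size_t n = 0;
    assert(input != NULL);
    for (; *input != '\0'; input++) {
        if (*input == '+') n++;
    }
    return n;
}

/* "symbol : ADD ;" accepts exactly one ADD followed by end of input. */
int accepts_single(const char *input) {
    return count_add_tokens(input) == 1;
}

/* A hypothetical list rule such as "symbols : symbol | symbols symbol ;"
   would accept one or more ADD tokens instead. */
int accepts_list(const char *input) {
    return count_add_tokens(input) >= 1;
}
```

With these definitions, the input ++ is rejected by accepts_single but accepted by accepts_list.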
I wrote the following code as a part of my yacc file.
%{
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
FILE *fp;
%}
%token LINE CIRCLE POLYGON
%token CENTRE RADIUS WITHIN
%token END
%union
{
char *string;
int number;
}
%token <number> NUM
%token <string> CORDINATE
%start Input
%%
Input:
| Input Statement
;
Statement :
END
| LINE CORDINATE CORDINATE END {fprintf(fp,"\n\\newline\n\\psline%s%s\n",$2,$3,$2,$3);}
| SCirc END
| POLYGON Mcords {fprintf(fp,"\n\\newline\n\\pspolygon%s",$2);}
;
SCirc :
CIRCLE RADIUS NUM CENTRE CORDINATE {fprintf(fp,"\n\\newline\n\\pscircle%s{%d}\n",3*$3,3*$3,$5,$3);}
| CIRCLE CENTRE CORDINATE RADIUS NUM {fprintf(fp,"\n\\newline\n\\pscircle%s{%d}\n",-2*$5,-2*$5,2*$5,2*$5,$3,$5);}
;
Mcord :
CORDINATE CORDINATE CORDINATE {$$ = strcat(strcat($1,$2),$3);}
| Mcord CODINATE {$$ = strcat($1,$2); }
;
%%
int yyerror(char *s) {
printf("%s\n",s);
}
int main(void) {
/* some stuff */
yyparse();
fprintf(fp,"\\end{pspicture}\n\\end{document}");
fclose(fp);
}
and I end up getting this error:
parser.y:41.42-43: $$ of `Mcord' has no declared type
I mean, the following example works correctly where $$ ends up as a number
Expression :
Number {$$ = $1;}
| Expression '+' Expression {$$ = $1+$2;}
I want Mcord to be a concatenation of many CORDINATE values.
How do I do that?
Is there any way of defining type for rules too?
Yes, nonterminal symbols have to be declared as having a type using %type <...> rather than %token <...>. Do you not have a good reference manual for Yacc? The GNU Bison manual is quite good, even if you're using some other Yacc.
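Once Mcord has a string type, its actions still need a buffer large enough for the combined text. If the scanner allocates each CORDINATE with exactly the token's length (an assumption, since the scanner is not shown), calling strcat on $1 would write past the end of that buffer. A minimal sketch of a helper that allocates a fresh string instead; the name concat_coords is hypothetical:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string holding `a`
   followed by `b`. The caller owns the result and must free() it.
   Returns NULL if malloc fails. */
char *concat_coords(const char *a, const char *b) {
    size_t la, lb;
    char *out;
    assert(a != NULL && b != NULL);
    la = strlen(a);
    lb = strlen(b);
    out = malloc(la + lb + 1);
    if (out == NULL) return NULL;
    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);   /* also copies b's terminator */
    return out;
}
```

An action could then assign the helper's result to $$ instead of calling strcat directly, freeing the operands afterwards if nothing else refers to them.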
curs.l :
%{
#include <stdlib.h>
#include "tree.c"
#include "yycurs.h"
%}
L [a-zA-Z_]
D [0-9]
D4 [0-3]
IDENTIFIER ({L})({L}|{D})*
INT4 {D4}+'q'
INT {D}+
%%
{IDENTIFIER} {return VARIABLE;}
%%
int yywrap(void){
return 0;
}
curs.y:
%{
#include stdio.h
void yyerror(char*);
int yylex(void);
%}
%token VARIABLE INTEGER
%%
var: VARIABLE {printf($1);};
%%
void yyerror(char *s){
fprintf(stderr, "11\n");
fprintf(stderr, "%s\n", s);
}
int main(void){
yyparse();
return 0;
}
When I run my compiled program, I get this result:
./curs
ff //my input
//result
ff //my input
11 //result
syntax error //result
evgeniy#evgeniy-desktop:~/documents/compilers$
Can anybody explain why 'syntax error' appears?
Thanks in advance.
Your grammar defines that a valid file consists of exactly one VARIABLE. To have more than one, you need to introduce a recursive rule.
%start vars
%%
var: VARIABLE {printf($1);};
vars: var
| vars var;
%%
I'm new to bison and I'm getting a "conflicts: 1 shift/reduce" error. Can anyone shed some light on this?
Here's the y file.
test.y:
%{
#include <stdio.h>
#include <string.h>
#define YYERROR_VERBOSE
#define YYDEBUG 1
void yyerror(const char *str);
int yywrap();
%}
%union
{
int integer;
char *string;
}
%token <string> VAR_LOCAL
%token <integer> LIT_NUMBER
%token <string> LIT_STRING
%token WS_LINEBRK
//%token SYMB_EQL
%token SYMB_PLUS
%token SYMB_MINUS
%token SYMB_MUL
%token SYMB_DIV
%%
/*
// Sample input
num = 10
str = "this is a string"
*/
inputs: /* empty token */
| literal
| variable
| inputs stmt WS_LINEBRK
;
stmt: variable "=" exps
;
exps: variable op literal
| variable op variable
| literal op literal
| literal op variable
;
op: SYMB_PLUS | SYMB_MINUS | SYMB_MUL | SYMB_DIV ;
variable: VAR_LOCAL
{
printf("variable: %s\n", $1);
}
;
literal:
number | string
;
string: LIT_STRING
{
printf("word: %s\n", $1);
}
;
number: LIT_NUMBER
{
printf("number: %d\n", $1);
}
;
%%
void yyerror(const char *str)
{
fprintf(stderr,"error: %s\n",str);
}
int yywrap()
{
return 1;
}
main()
{
yyparse();
}
Here's the lex file
test.l:
%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
int line_no = 0;
%}
%%
[a-z][a-zA-Z0-9]* {
// local variable
yylval.string=strdup(yytext);
return VAR_LOCAL;
}
[0-9]+ {
//number literal
yylval.integer=atoi(yytext);
return LIT_NUMBER;
}
= return SYMB_EQL;
\+ return SYMB_PLUS;
\- return SYMB_MINUS;
\* return SYMB_MUL;
\/ return SYMB_DIV;
\"[-+\!\.a-zA-Z0-9' ]+\" {
// word literal
yylval.string=strdup(yytext);
return LIT_STRING;
}
\n {
// line break
printf("\n");
return WS_LINEBRK;
}
[ \t]+ /* ignore whitespace */;
%%
bison -v test.y will write a file test.output with a detailed description of the generated state machine, which lets you see what's going on, such as the state where the shift/reduce conflict occurs. (bison --report=all test.y adds more detail to that file.)
In your case, the problem is in the start state (corresponding to your start nonterminal, inputs). Say the first token is VAR_LOCAL. There are two things your parser could do:
It could match the variable case.
It could also match the inputs stmt WS_LINEBRK case: inputs matches the empty string (first line), and stmt matches variable "=" exps.
With the one token of lookahead that bison parsers use, there's no way to tell. You need to change your grammar to get rid of this case.
To fix the grammar, as Fabian has suggested, move the variable and literal alternatives from inputs to the end of exps:
inputs:
| variable
| literal
exps:
...
| variable
| literal
That allows syntax such as x = y and x = "a literal".
To allow for empty input lines, change the /* empty token */ rule to WS_LINEBRK:
inputs: WS_LINEBRK
| stmt WS_LINEBRK
| inputs stmt WS_LINEBRK
;
On another note, since the scanner still returns SYMB_EQL but the parser no longer declares it (it's commented out), something needs to be done in order to compile. One option is to uncomment the %token declaration and use SYMB_EQL instead of the literal "=" in the parser .y file.