Yacc shift/reduce conflict that I can't resolve

I have tried many times to resolve this conflict,
but I don't understand why it occurs here.
Two conflicts are reported at compilation time.
The yacc (bison) output is:
State 314 conflicts: 1 shift/reduce
State 315 conflicts: 1 shift/reduce
state 314
7 c_complex_object_id: type_identifier .
8 | type_identifier . V_LOCAL_TERM_CODE_REF
V_LOCAL_TERM_CODE_REF shift, and go to state 77
V_LOCAL_TERM_CODE_REF [reduce using rule 7 (c_complex_object_id)]
$default reduce using rule 7 (c_complex_object_id)
state 315
127 c_integer_spec: integer_value .
184 ordinal: integer_value . SYM_INTERVAL_DELIM V_QUALIFIED_TERM_CODE_REF
201 integer_list_value: integer_value . ',' integer_value
203 | integer_value . ',' SYM_LIST_CONTINUE
SYM_INTERVAL_DELIM shift, and go to state 380
',' shift, and go to state 200
SYM_INTERVAL_DELIM [reduce using rule 127 (c_integer_spec)]
$default reduce using rule 127 (c_integer_spec)
state 77
8 c_complex_object_id: type_identifier V_LOCAL_TERM_CODE_REF .
$default reduce using rule 8 (c_complex_object_id)
state 380
184 ordinal: integer_value SYM_INTERVAL_DELIM . V_QUALIFIED_TERM_CODE_REF
V_QUALIFIED_TERM_CODE_REF shift, and go to state 422
state 200
201 integer_list_value: integer_value ',' . integer_value
203 | integer_value ',' . SYM_LIST_CONTINUE
V_INTEGER shift, and go to state 2
SYM_LIST_CONTINUE shift, and go to state 276
'+' shift, and go to state 170
'-' shift, and go to state 171
integer_value go to state 277
...
The yacc source is:
c_complex_object_id
: type_identifier
| type_identifier V_LOCAL_TERM_CODE_REF
;
type_identifier
: '(' V_TYPE_IDENTIFIER ')'
| '(' V_GENERIC_TYPE_IDENTIFIER ')'
| V_TYPE_IDENTIFIER
| V_GENERIC_TYPE_IDENTIFIER
;
c_integer_spec
: integer_value
| integer_list_value
| integer_interval_value
;
c_integer
: c_integer_spec
| c_integer_spec ';' integer_value
| c_integer_spec ';' error
;
ordinal
: integer_value SYM_INTERVAL_DELIM V_QUALIFIED_TERM_CODE_REF
;
integer_list_value
: integer_value ',' integer_value
| integer_value ',' SYM_LIST_CONTINUE
;
integer_value
: V_INTEGER
| '+' V_INTEGER
| '-' V_INTEGER
;
Those are the two conflicts. What is wrong with the grammar?

Let's consider the messages from the first shift/reduce conflict. You can read the period (".") as a pointer. What the message says, more or less in English, is
"When I'm in state 299, and I have recognized a type_identifier, I must decide whether to reduce by rule 7 (recognize c_complex_object_id : type_identifier) or to shift to state 63 (continue scanning for a V_LOCAL_TERM_CODE_REF)."
Usually a conflict like this comes about when the production not yet recognized (V_LOCAL_TERM_CODE_REF) is optional.
Your definition of the tokens V_LOCAL_TERM_CODE_REF, etc. looks OK as far as I can tell from your comment.
It's hard to diagnose this further without seeing the yacc diagnostic output for state 63. Could you edit your question to show the output for state 63? It might tell us something.
I found some lecture notes by Pete Jinks that might be useful background for you. You might also read some of the other questions listed in the right column of this page, under the "Related" heading.
Update
In one way, you are correct: a shift/reduce conflict can be ignored. bison/yacc will produce a parser that runs, that does something. But it is important to understand why you are ignoring a specific conflict. Then you will understand why the parser, when presented with an input program, parses it the way it does and produces the output that it does. It is not good to say, "oh, this is too complex, I can't figure it out."
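To see which action bison actually picked in a conflicted state, it can help to pull the bracketed entries out of the report: in the output quoted in the question, the action in square brackets is the one bison discarded. The following is only a rough sketch in Python, not part of yacc or bison; the names find_conflicts and STATE_314 are made up for illustration, and it handles only the shift/reduce form shown above.

```python
import re

# Line shapes as they appear in the bison report quoted in the question.
SHIFT = re.compile(r"^\s*(\S+)\s+shift, and go to state (\d+)")
DROPPED = re.compile(r"^\s*(\S+)\s+\[reduce using rule (\d+) \((\w+)\)\]")

def find_conflicts(state_text):
    """Return (token, shift_target, dropped_rule, rule_lhs) for each
    bracketed (discarded) reduce action in one state's report text."""
    shifts = {}
    conflicts = []
    for line in state_text.splitlines():
        m = SHIFT.match(line)
        if m:
            shifts[m.group(1)] = int(m.group(2))
            continue
        m = DROPPED.match(line)
        if m:
            token, rule, lhs = m.group(1), int(m.group(2)), m.group(3)
            conflicts.append((token, shifts.get(token), rule, lhs))
    return conflicts

# State 314, copied from the question (hypothetical variable name).
STATE_314 = """\
7 c_complex_object_id: type_identifier .
8 | type_identifier . V_LOCAL_TERM_CODE_REF
V_LOCAL_TERM_CODE_REF shift, and go to state 77
V_LOCAL_TERM_CODE_REF [reduce using rule 7 (c_complex_object_id)]
$default reduce using rule 7 (c_complex_object_id)
"""

for token, shift_to, rule, lhs in find_conflicts(STATE_314):
    print(f"on {token}: shift to state {shift_to} was kept, "
          f"reduce by rule {rule} ({lhs}) was discarded")
```

Knowing which alternative was discarded is the first step toward deciding whether the default resolution is acceptable for your language.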

Related

Yacc conflict I can't fix

I've been trying to fix a shift/reduce conflict in my yacc specification and I can't seem to find where it is.
%union{
char* valueBase;
char* correspondencia;
}
%token pal palT palC
%type <valueBase> pal
%type <correspondencia> palT palC Smth
%%
Dicionario : Traducao
| Dicionario Traducao
;
Traducao : Palavra Correspondencia
;
Palavra : Base Delim
| Exp
;
Delim :
| ':'
;
Correspondencia :
| palC {printf("PT Tradução: %s\n",$1);}
;
Exp : Smth '-' Smth {aux = yylval.valueBase; printf("PT Tradução: %s %s %s\n", $1, aux, $3);}
;
Smth : palT {$$ = strdup($1);}
| {$$ = "";}
;
Base : pal {printf("EN Palavra base: %s\n",$1);}
;
Any help to find and fix this conflict would be extremely appreciated.
So looking at the y.output file from your grammar, you have a shift/reduce conflict in state 13:
State 13
10 Exp: Smth '-' . Smth
palT shift, and go to state 2
palT [reduce using rule 12 (Smth)]
$default reduce using rule 12 (Smth)
Smth go to state 16
Basically, what this is saying is that when parsing an Exp after having seen a Smth '-' and looking at a lookahead of palT, it doesn't know whether it should reduce an empty Smth to finish the Exp (leaving the palT as part of some later construct) OR shift the palT so it can then be reduced (recognized) as a Smth that completes this Exp.
The language you are recognizing is a sequence of one or more Traducao, each of which consists of a Palavra followed by an optional palC (Correspondencia that may be a palC or empty). That means that you might have a Palavra directly following another Palavra (the Correspondencia for the first one is empty). So the parser needs to find the boundary between one Palavra and the next just by looking at its current state and one token of lookahead, which is a problem.
In particular, when you have an input like palT '-' palT '-' palT, that is two consecutive Palavra, but it is not clear whether the middle palT belongs to the first one or the second. It is ambiguous, because it could be parsed successfully either way.
If you want the parser to just accept as much as possible into the first Palavra, then you can just accept the default resolution (of shift). If that is wrong and you would want the other interpretation, then you are going to need more lookahead to recognize this case, as it depends on whether or not there is a second '-' after the second palT or something else.
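The ambiguity can be checked by brute force. The sketch below is plain Python, not yacc; exp_splits is a made-up helper, and it deliberately models only the Exp alternative of Palavra (an optional palT, a '-', an optional palT), ignoring Base, Delim and Correspondencia. It lists every way a token sequence can be cut into consecutive Exp units.

```python
def exp_splits(tokens):
    """Return every way to split tokens into consecutive Exp units,
    where one Exp is: optional palT, '-', optional palT."""
    if not tokens:
        return [[]]          # one way to split nothing: no units at all
    results = []
    for left in (0, 1):      # Smth before '-' is empty or one palT
        for right in (0, 1): # Smth after '-' is empty or one palT
            unit = ["palT"] * left + ["-"] + ["palT"] * right
            n = len(unit)
            if tokens[:n] == unit:
                for rest in exp_splits(tokens[n:]):
                    results.append([unit] + rest)
    return results

tokens = ["palT", "-", "palT", "-", "palT"]
for split in exp_splits(tokens):
    print(" | ".join(" ".join(unit) for unit in split))
```

For this input the function finds two splits, one where the middle palT ends the first Exp and one where it starts the second, which is the choice the parser faces in state 13.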

A yacc reduce/reduce conflict I can't explain

I'm getting a shift/reduce and a reduce/reduce conflict that I believe shouldn't happen. Obviously I'm doing something wrong, so could someone explain what I'm missing?
My stripped down grammar:
/*
* Test SQL Grammar
*/
%{
#include <stdio.h>
#include <string.h>
%}
/* Yacc's YYSTYPE UNION */
%union {
char* str; /* Pointer to constant string (malloc'd in lex) */
}
%token SELECT FROM AS ROWID ROWNUM NEXTVAL CURRVAL NULL
%token <str> IDENTIFIER STRING NUMBER
%%
query_block
: SELECT
select_list
FROM row_source_list
;
select_list
: '*'
| select_item_list
;
select_item_list
: select_item_list ',' select_item
| select_item
;
select_item
: row_source '.' '*'
| expr
| expr IDENTIFIER
;
row_source_list
: row_source_list ',' row_source
| row_source
;
row_source
: IDENTIFIER
| IDENTIFIER '.' IDENTIFIER
| IDENTIFIER opt_AS IDENTIFIER
| IDENTIFIER '.' IDENTIFIER opt_AS IDENTIFIER
;
opt_AS
: /* Empty */
| AS
;
expr
: IDENTIFIER '.' IDENTIFIER
| IDENTIFIER '.' ROWID
| IDENTIFIER '.' IDENTIFIER '.' IDENTIFIER
| IDENTIFIER '.' IDENTIFIER '.' ROWID
| ROWNUM
| ROWID
| STRING
| NUMBER
| IDENTIFIER '.' CURRVAL
| IDENTIFIER '.' NEXTVAL
| NULL
;
The conflicts seem to arise because yacc doesn't know if it is working on the select_list (expr list) or the row_source_list. State 26 of y.output details the conflict:
state 26
12 row_source: IDENTIFIER '.' IDENTIFIER .
14 | IDENTIFIER '.' IDENTIFIER . opt_AS IDENTIFIER
17 expr: IDENTIFIER '.' IDENTIFIER .
19 | IDENTIFIER '.' IDENTIFIER . '.' IDENTIFIER
20 | IDENTIFIER '.' IDENTIFIER . '.' ROWID
AS shift, and go to state 16
'.' shift, and go to state 33
IDENTIFIER reduce using rule 15 (opt_AS)
IDENTIFIER [reduce using rule 17 (expr)]
'.' [reduce using rule 12 (row_source)]
$default reduce using rule 17 (expr)
opt_AS go to state 34
Now the basic rule for "query_block" states that a row_source_list must be preceded by the "FROM" keyword, so I don't see why yacc is combining the two into one state.
query_block
: SELECT
select_list
FROM row_source_list
;
I've traced the states and it ends up in this state before finding the "FROM" keyword.
I don't understand why it is considering the row_source_list before it recognized "FROM".
(I flagged this as "no longer reproducible fault" as the OP had solved it trivially, but the flag was aged/timed out).
As it has an answer I'll transcribe the answer so at least the question is noted as answered.
As the OP states:
I found it right after posting. It's the first line in the select_item rule. I should have caught that earlier.
To clarify, the select_item rule should be:
select_item
: expr
| expr IDENTIFIER
;
Which removes the ambiguity.

A yacc shift/reduce conflict on an unambiguous grammar

A piece of my grammar is driving me crazy.
I have to write a grammar that allows writing functions with multiple entry points,
e.g.
function
begin
a:
<statements>
b:
<statements>
end
The problem is that the statements are assignments like this:
ID = Expresion.
In the following quote you can see the output produced by yacc.
0 $accept : InstanciasFuncion $end
1 InstanciasFuncion : InstanciasFuncion InstanciaFuncion
2 | InstanciaFuncion
3 InstanciaFuncion : PuntoEntrada Sentencias
4 PuntoEntrada : ID ':'
5 Sentencias : Sentencias Sentencia
6 | Sentencia
7 Sentencia : ID '=' ID
State 0
0 $accept: . InstanciasFuncion $end
ID shift, and go to state 1
InstanciasFuncion go to state 2
InstanciaFuncion go to state 3
PuntoEntrada go to state 4
State 1
4 PuntoEntrada: ID . ':'
':' shift, and go to state 5
State 2
0 $accept: InstanciasFuncion . $end
1 InstanciasFuncion: InstanciasFuncion . InstanciaFuncion
$end shift, and go to state 6
ID shift, and go to state 1
InstanciaFuncion go to state 7
PuntoEntrada go to state 4
State 3
2 InstanciasFuncion: InstanciaFuncion .
$default reduce using rule 2 (InstanciasFuncion)
State 4
3 InstanciaFuncion: PuntoEntrada . Sentencias
ID shift, and go to state 8
Sentencias go to state 9
Sentencia go to state 10
State 5
4 PuntoEntrada: ID ':' .
$default reduce using rule 4 (PuntoEntrada)
State 6
0 $accept: InstanciasFuncion $end .
$default accept
State 7
1 InstanciasFuncion: InstanciasFuncion InstanciaFuncion .
$default reduce using rule 1 (InstanciasFuncion)
State 8
7 Sentencia: ID . '=' ID
'=' shift, and go to state 11
State 9
3 InstanciaFuncion: PuntoEntrada Sentencias .
5 Sentencias: Sentencias . Sentencia
ID shift, and go to state 8
ID [reduce using rule 3 (InstanciaFuncion)]
$default reduce using rule 3 (InstanciaFuncion)
Sentencia go to state 12
State 10
6 Sentencias: Sentencia .
$default reduce using rule 6 (Sentencias)
State 11
7 Sentencia: ID '=' . ID
ID shift, and go to state 13
State 12
5 Sentencias: Sentencias Sentencia .
$default reduce using rule 5 (Sentencias)
State 13
7 Sentencia: ID '=' ID .
$default reduce using rule 7 (Sentencia)
Maybe somebody can help me disambiguate this grammar.
Bison provides you with at least a hint. In State 9, which is really the only relevant part of the output other than the grammar itself, we see:
State 9
3 InstanciaFuncion: PuntoEntrada Sentencias .
5 Sentencias: Sentencias . Sentencia
ID shift, and go to state 8
ID [reduce using rule 3 (InstanciaFuncion)]
$default reduce using rule 3 (InstanciaFuncion)
Sentencia go to state 12
There's a shift/reduce conflict with ID, in the context in which the possibilities are:
Complete the parse of an InstanciaFuncion (reduce)
Continue the parse of a Sentencias (shift)
In both of those contexts, an ID is possible. It's easy to construct an example. Consider these two instancias:
f : a = b c = d ...
f : a = b c : d = ...
We've finished with the b and c is the lookahead, so we can't see the symbol which follows the c. Now, have we finished parsing the funcion f? Or should we try for a longer list of sentencias? No se sabe. (Nobody knows.)
Yes, your grammar is unambiguous, so it doesn't need to be disambiguated. It's not LR(1), though: you cannot tell what to do by looking only at the next symbol. However, it is LR(2), and there is a proof that any LR(2) grammar has a corresponding LR(1) grammar. (For any value of 2 :) ). But, unfortunately, actually doing the transformation is not always very pretty. It can be done mechanically, but the resulting grammar can be hard to read. (See Notes below for references.)
In your case, it's pretty easy to find an equivalent grammar, but the parse tree will need to be adjusted. Here's one example:
InstanciasFuncion : PuntoEntrada
| InstanciasFuncion PuntoEntrada
| InstanciasFuncion Sentencia
PuntoEntrada: ID ':' Sentencia
Sentencia : ID '=' ID
It's a curious fact that this precise shift/reduce conflict is a feature of the grammar of bison itself, since bison accepts grammars as written above (i.e. without semi-colons). POSIX insists that yacc do so, and bison tries to emulate yacc. Bison itself solves this problem in the scanner, not in the grammar: its scanner recognizes "ID :" as a single token (even if separated by arbitrary whitespace). That might also be your best bet.
There is an excellent description of the proof that any LR(k) grammar can be covered by an LR(1) grammar, including the construction technique and a brief description of how to recover the original parse tree, in Sippu & Soisalon-Soininen, Parsing Theory, Vol. II (Springer Verlag, 1990). This two-volume set is a great reference for theoreticians, and has a lot of valuable practical information, but it's heavy reading and it's also a serious investment. If you have a university library handy, there should be a copy of it available. The algorithm presented is due to MD Mickunas, and was published in 1976 in JACM 23:17-30 (paywalled), which you should also be able to find in a good university library. Failing that, I found a very abbreviated description in Richard Marion Schell's thesis.
Personally, I wouldn't bother with all that, though. Either use a GLR parser, or use the same trick bison uses for the same purpose. Or use the simple grammar in the answer above and fiddle with the AST afterwards; it's not really difficult.
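The scanner trick mentioned above can be sketched as follows. This is Python rather than lex, purely for illustration; the token name ENTRY and the function tokenize are invented here, and the identifier pattern is an assumption about what an ID looks like. The point is that an identifier followed by a colon comes out as one token, so a rule such as PuntoEntrada : ENTRY needs only one token of lookahead.

```python
import re

# An identifier optionally followed by whitespace and ':', or a lone '='.
TOKEN = re.compile(r"\s*(?:([A-Za-z_]\w*)(\s*:)?|(=))")

def tokenize(text):
    """Split text into (kind, value) pairs; 'name :' becomes one ENTRY token."""
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at offset {pos}")
        name, colon, eq = m.groups()
        if eq:
            tokens.append(("=", "="))
        elif colon:
            tokens.append(("ENTRY", name))   # hypothetical merged token
        else:
            tokens.append(("ID", name))
        pos = m.end()
    return tokens

print(tokenize("f : a = b c = d"))
print(tokenize("f : a = b c : d = e"))
```

In the first input c is scanned as an ID and in the second as an ENTRY, so the parser sees the difference in the very next token instead of having to look past it.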

Solving YACC shift/reduce. Driving me crazy

Hey guys, this is driving me crazy. I'll list the error and the relevant code below. Thanks in advance for any help.
ERROR:
51: shift/reduce conflict (shift 69, reduce 28) on '{'
state 51
funcao : publico tIDENTIFIER '(' seq_vars ')' eqliteral . corpo (13)
corpo : . (28)
'{' shift 69
$end reduce 28
tVOID reduce 28
tPUBLIC reduce 28
tCONST reduce 28
tIF reduce 28
tDO reduce 28
tFOR reduce 28
tCONTINUE reduce 28
tBREAK reduce 28
tRETURN reduce 28
tINTEGER reduce 28
tNUMBER reduce 28
tSTRING reduce 28
corpo goto 70
bloco goto 71
And this is the relevant code
// Função
funcao: publico tIDENTIFIER '(' seq_vars ')' eqliteral corpo {};
// Corpo do bloco
corpo: bloco |;
// Bloco
bloco: '{' seq_decls seq_inst '}' {/*figure this out later*/};
I'll keep trying to solve it and post the answer if I do.
Since we can't possibly replicate the circumstances, I'm only guessing...
It looks like Yacc doesn't know what to do when it reaches the position after the eqliteral nonterminal. You can see that's where the parser generator is because of the . in the rule in the error message.
When Yacc reaches this position and the next token is '{', should it shift to start a bloco (shift 69), or should it reduce the empty corpo rule (rule 28, which also shows a . in the output)?
One possible solution (that I'm unable to verify) is to change the funcao rule:
funcao: publico tIDENTIFIER '(' seq_vars ')' eqliteral
| publico tIDENTIFIER '(' seq_vars ')' eqliteral '{' seq_decls seq_inst '}'
;
It may work, it may not.

Why does this simple grammar have a shift/reduce conflict?

%token <token> PLUS MINUS INT
%left PLUS MINUS
THIS WORKS:
exp : exp PLUS exp;
exp : exp MINUS exp;
exp : INT;
THIS HAS 2 SHIFT/REDUCE CONFLICTS:
exp : exp binaryop exp;
exp : INT;
binaryop: PLUS | MINUS ;
WHY?
This is because the second is in fact ambiguous. So is the first grammar, but you resolved the ambiguity by adding %left.
This %left does not work in the second grammar, because associativity and precedence are not inherited from rule to rule. I.e. the binaryop nonterminal does not inherit any such thing even though it produces PLUS and MINUS. Associativity and precedence are localized to a rule, and revolve around terminal symbols.
We cannot do %left binaryop, but we can slightly refactor the grammar:
exp : exp binaryop term;
exp : term;
term : INT;
binaryop: PLUS | MINUS ;
That has no conflicts now because it is implicitly left-associative. I.e. the production of a longer and longer expression can only happen on the left side of the binaryop, because the right side is a term which produces only an INT.
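Why the grouping matters can be seen with subtraction. The two functions below are a Python sketch with made-up names, not generated parser code: eval_left folds the way exp : exp binaryop term groups, while eval_right groups the way a parser that always shifts would, as if the rule were exp : term binaryop exp.

```python
def eval_left(tokens):
    """Left-associative evaluation: ((a op b) op c) ..."""
    value = tokens[0]
    for i in range(1, len(tokens), 2):
        op, rhs = tokens[i], tokens[i + 1]
        value = value + rhs if op == "+" else value - rhs
    return value

def eval_right(tokens):
    """Right-associative evaluation: (a op (b op c)) ..."""
    if len(tokens) == 1:
        return tokens[0]
    lhs, op = tokens[0], tokens[1]
    rest = eval_right(tokens[2:])
    return lhs + rest if op == "+" else lhs - rest

expr = [1, "-", 2, "-", 3]
print(eval_left(expr), eval_right(expr))
```

For 1 - 2 - 3 the two groupings give different values, so leaving the conflict to the default resolution is not harmless here.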
You need to specify a precedence for the exp binaryop exp rule if you want the precedence rules to resolve the ambiguity:
exp : exp binaryop exp %prec PLUS;
With that change, all the conflicts are resolved.
Edit
The comments seem to indicate some confusion as to what the precedence rules in yacc/bison do.
The precedence rules are a way of semi-automatically resolving shift/reduce conflicts in the grammar. They're only semi-automatic in that you have to know what you are doing when you specify the precedences.
Basically, whenever there is a shift/reduce conflict between a token to be shifted and a rule to be reduced, yacc compares the precedence of the token to be shifted and the rule to be reduced, and -- as long as both have assigned precedences -- does whichever is higher precedence. If either the token or the rule has no precedence assigned, then the conflict is reported to the user.
%left/%right/%nonassoc come into the picture when the token and rule have the SAME precedence. In that case %left means do the reduce, %right means do the shift, and %nonassoc means do neither, causing a syntax error at runtime if the parser runs into this case.
The precedence levels themselves are assigned to tokens with %left/%right/%nonassoc and to rules with %prec. The only oddness is that rules with no %prec and at least one terminal on the RHS get the precedence of the last terminal on the RHS. This can sometimes end up assigning precedences to rules that you really don't want to have precedence, which can sometimes result in hiding conflicts due to resolving them incorrectly. You can avoid these problems by adding an extra level of indirection in the rule in question -- change the problematic terminal on the RHS to a new non-terminal that expands to just that terminal.
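The resolution procedure described above can be written down as a small model. This is a Python sketch of the rules as stated in this answer, not bison's actual code; the function names, the table layout and the TIMES token are all invented for illustration.

```python
def rule_prec_token(rhs, terminals, prec_override=None):
    """Token that gives a rule its precedence: %prec if present,
    otherwise the last terminal on the right-hand side, else None."""
    if prec_override is not None:
        return prec_override
    for symbol in reversed(rhs):
        if symbol in terminals:
            return symbol
    return None

def resolve(lookahead, rule_token, prec):
    """prec maps token -> (level, assoc). Returns the chosen action."""
    if lookahead not in prec or rule_token not in prec:
        return "conflict"            # reported to the user
    tok_level, assoc = prec[lookahead]
    rule_level, _ = prec[rule_token]
    if tok_level > rule_level:
        return "shift"
    if tok_level < rule_level:
        return "reduce"
    # Same level: associativity decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[assoc]

terminals = {"PLUS", "MINUS", "INT", "TIMES"}
prec = {"PLUS": (1, "left"), "MINUS": (1, "left"), "TIMES": (2, "left")}

direct = rule_prec_token(["exp", "PLUS", "exp"], terminals)
indirect = rule_prec_token(["exp", "binaryop", "exp"], terminals)
forced = rule_prec_token(["exp", "binaryop", "exp"], terminals, "PLUS")

print(resolve("MINUS", direct, prec))    # rule has PLUS's precedence
print(resolve("MINUS", indirect, prec))  # rule has no precedence at all
print(resolve("MINUS", forced, prec))    # %prec PLUS restores it
```

The middle case is the one from the question: exp binaryop exp has no terminal on its right-hand side, so the rule has no precedence and %left never gets a chance to apply.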
I assume that this falls under what the Bison manual calls "Mysterious Conflicts". You can replicate that with:
exp: exp plus exp;
exp: exp minus exp;
exp: INT;
plus: PLUS;
minus: MINUS;
which gives four S/R conflicts for me.
The output file describing the conflicted grammar produced by Bison (version 2.3) on Linux is as follows. The key information at the top is 'State 7 has conflicts'.
State 7 conflicts: 2 shift/reduce
Grammar
0 $accept: exp $end
1 exp: exp binaryop exp
2 | INT
3 binaryop: PLUS
4 | MINUS
Terminals, with rules where they appear
$end (0) 0
error (256)
PLUS (258) 3
MINUS (259) 4
INT (260) 2
Nonterminals, with rules where they appear
$accept (6)
on left: 0
exp (7)
on left: 1 2, on right: 0 1
binaryop (8)
on left: 3 4, on right: 1
state 0
0 $accept: . exp $end
INT shift, and go to state 1
exp go to state 2
state 1
2 exp: INT .
$default reduce using rule 2 (exp)
state 2
0 $accept: exp . $end
1 exp: exp . binaryop exp
$end shift, and go to state 3
PLUS shift, and go to state 4
MINUS shift, and go to state 5
binaryop go to state 6
state 3
0 $accept: exp $end .
$default accept
state 4
3 binaryop: PLUS .
$default reduce using rule 3 (binaryop)
state 5
4 binaryop: MINUS .
$default reduce using rule 4 (binaryop)
state 6
1 exp: exp binaryop . exp
INT shift, and go to state 1
exp go to state 7
And here is the information about 'State 7':
state 7
1 exp: exp . binaryop exp
1 | exp binaryop exp .
PLUS shift, and go to state 4
MINUS shift, and go to state 5
PLUS [reduce using rule 1 (exp)]
MINUS [reduce using rule 1 (exp)]
$default reduce using rule 1 (exp)
binaryop go to state 6
The trouble is described by the . markers in the lines marked 1. For some reason, the %left is not 'taking effect' as you'd expect, so Bison identifies a conflict when it has read exp PLUS exp and finds a PLUS or MINUS after it. In such cases, Bison (and Yacc) do the shift rather than the reduce. In this context, that seems to me to be tantamount to giving the rules right associativity.
Changing the %left to %right and omitting it do not change the result (in terms of the conflict warnings). I also tried Yacc on Solaris and it produced essentially the same conflict.
So, why does the first grammar work? Here's the output:
Grammar
0 $accept: exp $end
1 exp: exp PLUS exp
2 | exp MINUS exp
3 | INT
Terminals, with rules where they appear
$end (0) 0
error (256)
PLUS (258) 1
MINUS (259) 2
INT (260) 3
Nonterminals, with rules where they appear
$accept (6)
on left: 0
exp (7)
on left: 1 2 3, on right: 0 1 2
state 0
0 $accept: . exp $end
INT shift, and go to state 1
exp go to state 2
state 1
3 exp: INT .
$default reduce using rule 3 (exp)
state 2
0 $accept: exp . $end
1 exp: exp . PLUS exp
2 | exp . MINUS exp
$end shift, and go to state 3
PLUS shift, and go to state 4
MINUS shift, and go to state 5
state 3
0 $accept: exp $end .
$default accept
state 4
1 exp: exp PLUS . exp
INT shift, and go to state 1
exp go to state 6
state 5
2 exp: exp MINUS . exp
INT shift, and go to state 1
exp go to state 7
state 6
1 exp: exp . PLUS exp
1 | exp PLUS exp .
2 | exp . MINUS exp
$default reduce using rule 1 (exp)
state 7
1 exp: exp . PLUS exp
2 | exp . MINUS exp
2 | exp MINUS exp .
$default reduce using rule 2 (exp)
The difference seems to be that in states 6 and 7, it is able to distinguish what to do based on what comes next.
One way of fixing the problem is:
%token <token> PLUS MINUS INT
%left PLUS MINUS
%%
exp : exp binaryop term;
exp : term;
term : INT;
binaryop: PLUS | MINUS;