ML-Yacc error concerning 12 shift/reduce conflicts involving EXP -> EXP BINOP EXP - yacc

This is the error:
12 shift/reduce conflicts
error: state 34: shift/reduce conflict (shift OR, reduce by rule 11)
error: state 34: shift/reduce conflict (shift AND, reduce by rule 11)
error: state 34: shift/reduce conflict (shift GE, reduce by rule 11)
error: state 34: shift/reduce conflict (shift GT, reduce by rule 11)
error: state 34: shift/reduce conflict (shift LE, reduce by rule 11)
error: state 34: shift/reduce conflict (shift LT, reduce by rule 11)
error: state 34: shift/reduce conflict (shift NEQ, reduce by rule 11)
error: state 34: shift/reduce conflict (shift EQ, reduce by rule 11)
error: state 34: shift/reduce conflict (shift DIVIDE, reduce by rule 11)
error: state 34: shift/reduce conflict (shift TIMES, reduce by rule 11)
error: state 34: shift/reduce conflict (shift MINUS, reduce by rule 11)
error: state 34: shift/reduce conflict (shift PLUS, reduce by rule 11)
This is the grammar:
program : exp ()
exp:
exp binop exp ()
| ID ()
| lvalue ()
| STRING ()
| INT ()
| NIL ()
| LPAREN expseq RPAREN ()
| lvalue ASSIGN exp ()
| ID LPAREN explist RPAREN ()
| LET declist IN expseq END ()
| IF exp THEN exp ELSE exp ()
| IF exp THEN exp ()
binop:
EQ ()
| NEQ ()
| LT ()
| GT ()
| LE ()
| GE ()
| AND ()
| OR ()
| PLUS ()
| MINUS ()
| TIMES ()
| DIVIDE ()
How do I solve this? Do I need to rethink the grammar and find another way to describe this grammar?
I have tried also declaring the preference order (although I have really minimal experience using these) such as:
%nonassoc OR NEQ EQ LT LE GT GE AND
%right PLUS MINUS
%right TIMES DIVIDE
but nothing.

The conflicts all come from the ambiguity of the exp: exp binop exp rule -- an input like a+b*c with two binops can be parsed as either (a+b)*c or a+(b*c).
To resolve this, the easiest way is to set precedences for the tokens AND the rules involved. You've done that for the tokens, but haven't done it for the rule exp: exp binop exp. Unfortunately, you can only set one precedence per rule, and this rule needs different precedences depending on which token matched binop. The easiest solution is to just replicate the rule and get rid of binop:
exp : exp EQ exp
| exp NEQ exp
| exp LT exp
| exp GT exp
| exp LE exp
| exp GE exp
| exp AND exp
| exp OR exp
| exp PLUS exp
| exp MINUS exp
| exp TIMES exp
| exp DIVIDE exp
Now each token has its own version of the rule, and each rule automatically gets its precedence from the single token in it, so you don't even need to set the precedence of the rules explicitly; yacc does it for you.
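To see how a per-operator precedence settles the a+b*c question, here is a minimal Python sketch of precedence climbing. It is only an analogy: ML-Yacc resolves the conflict inside its LALR table by comparing the rule's precedence with the lookahead token's precedence, not by running code like this. The PREC table and its levels are assumptions made for illustration, and every operator is treated as left-associative.

```python
# Hypothetical precedence table: a higher number binds tighter.
PREC = {"|": 1, "&": 2, "=": 3, "<": 3, "+": 4, "-": 4, "*": 5, "/": 5}

def parse(tokens):
    """Parse a flat token list into nested (op, left, right) tuples."""
    pos = 0

    def parse_exp(min_prec):
        nonlocal pos
        left = tokens[pos]              # an operand such as "a"
        pos += 1
        # Keep absorbing operators that bind at least as tightly as min_prec.
        while pos < len(tokens) and PREC[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            # Asking for prec + 1 on the right makes op left-associative.
            right = parse_exp(PREC[op] + 1)
            left = (op, left, right)
        return left

    return parse_exp(1)

print(parse(["a", "+", "b", "*", "c"]))   # ('+', 'a', ('*', 'b', 'c'))
print(parse(["a", "-", "b", "-", "c"]))   # ('-', ('-', 'a', 'b'), 'c')
```

Because the decision depends on which operator is being looked at, a single precedence attached to one shared exp binop exp rule cannot express it, which is why the rule has to be split per token.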

Related

How to use `deriving` in Idris?

I'm trying to derive Show, Eq, Ord etc. in Idris, but none of the following attempts works:
trial #1:
data Expr =
Lit Int
| Neg Expr
| Add Expr Expr
deriving (Show)
got:
deriving.idr:5:15-18:
|
5 | deriving (Show)
| ~~~~
When checking type of Main.Add:
Type mismatch between
Type -> Type (Type of Show)
and
Type (Expected type)
trial #2:
data Expr =
Lit Int
| Neg Expr
| Add Expr Expr
deriving (Show _)
got:
*deriving> Lit 1
Lit 1 : Expr
*deriving> Add (Lit 1) (Lit 1)
(input):Can't infer argument ty to Add, Can't infer argument deriving to Add
trial #3:
data Expr =
Lit Int
| Neg Expr
| Add Expr Expr
deriving (Show Expr)
got:
*deriving> Lit 1
Lit 1 : Expr
*deriving> Add (Lit 1) (Lit 1)
(input):Can't infer argument deriving to Add
I have searched for the keyword deriving on http://docs.idris-lang.org/ and Google, and even in the idris-dev repo under the test/ directory, but there is no demo of how to use deriving in Idris. Can anyone help?
You can use Stefan Hoeck's idris2-sop library to generate implementations with elaborator reflection.

Is it possible to do pattern binding a la Haskell in Idris?

An example would be:
fib: Stream Integer
fib@(1::tfib) = 1 :: 1 :: [ a+b | (a,b) <- zip fib tfib]
But this generates the error:
50 | fib@(1::tfib) = 1 :: 1 :: [ a+b | (a,b) <- zip fib tfib]
| ^
unexpected "@(1::tfib)"
expecting "<==", "using", "with", ':', argument expression, constraint argument, expression, function right hand side, implementation
block, implicit function argument, or with pattern
This doesn't look promising given that it doesn't recognize @ at the likely position.
Note that the related concept of as-patterns works the same in Haskell and Idris:
growHead : List a -> List a
growHead nnl@(x::_) = x::nnl
growHead ([]) = []

Is my solution correct for this context free grammar?

I'm trying to solve this problem (I think I might have solved it): http://d.pr/i/L5Qm
L = {a^(3n) b^(2n) | n >= 0}
Basically the problem is saying l is not equal to m or m is not equal to n
Rules that I've generated:
S -> aaaSbb | A
A -> a | ^
A few tests:
Test one: S --> aaaSbb -> aaaAbb -> aaabb
Test two: S --> A -> a
Test three: S --> A -> ^
I'm sure there's a lot more that I could have tested but I'm not quite sure how to test for the majority of problems as I'm quite new to these. I'm thankful for any help.
Your test two:
Test two: S --> A -> a
clearly shows that your grammar is wrong!
In your language:
L = {a^(3n) b^(2n) | n >= 0}
you always have three a's for every two b's, so the string a is not in L.
The correct production for your language is:
S ---> aaaSbb | ^
where ^ is the null (empty) string. Notice that ^ is also in the language, because n can be 0.
Edit:
Your grammar:
S -> aaaSbb | A
A -> a | ^
produces the union of two languages, that is:
{a^(3n) b^(2n) | n >= 0} U {a^(3n+1) b^(2n) | n >= 0}
The extra part a^(3n+1) b^(2n) is due to the A --> a production.
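A quick way to check both claims is to generate strings from each grammar and test them against the language definition. The Python sketch below does that; the function names are made up for illustration, and the derivations are written out by hand rather than produced by a general grammar engine.

```python
def derive_fixed(n):
    """Apply S -> aaaSbb n times, then S -> ^ (the empty string)."""
    return "aaa" * n + "bb" * n

def derive_with_extra_a(n):
    """Apply S -> aaaSbb n times, then S -> A -> a (the questioner's grammar)."""
    return "aaa" * n + "a" + "bb" * n

def in_language(s):
    """Test membership in {a^(3n) b^(2n) | n >= 0} straight from the definition."""
    a_count = len(s) - len(s.lstrip("a"))   # length of the leading run of a's
    b_part = s[a_count:]
    if set(b_part) - {"b"}:                 # anything other than b after the a's
        return False
    b_count = len(b_part)
    return (a_count % 3 == 0 and b_count % 2 == 0
            and a_count // 3 == b_count // 2)

# Every string from the corrected grammar is in L ...
print(all(in_language(derive_fixed(n)) for n in range(5)))
# ... while every string produced through A -> a falls outside it.
print(any(in_language(derive_with_extra_a(n)) for n in range(5)))
```

The first line prints True and the second prints False, matching the argument that A -> a adds strings outside the language.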

How to deal with variable references in yacc/bison (with ocaml)

I was wondering how to deal with variable references inside statements while writing grammars with ocamlyacc and ocamllex.
The problem is that statements of the form
var x = y + z
var b = true | f;
should both be correct, but in the first case the variables refer to numbers while in the second case f is a boolean variable.
In the grammar I'm writing I have got this:
numeric_exp_val:
| nint { Syntax.Int $1 }
| FLOAT { Syntax.Float $1 }
| LPAREN; ne = numeric_exp; RPAREN { ne }
| INCR; r = numeric_var_ref { Syntax.VarIncr (r,1) }
| DECR; r = numeric_var_ref { Syntax.VarIncr (r,-1) }
| var_ref { $1 }
;
boolean_exp_val:
| BOOL { Syntax.Bool $1 }
| LPAREN; be = boolean_exp; RPAREN { be }
| var_ref { $1 }
;
which obviously can't work, since both var_ref non terminals reduce to the same (reduce/reduce conflict). But I would like to have type checking that is mostly statically done (with respect to variable references) during the parsing phase itself.
That's why I'm wondering which is the best way to have variable references and keep this structure. Just as an additional info I have functions that compile the syntax tree by translating it into a byte code similar to this one:
let rec compile_numeric_exp exp =
match exp with
Int i -> [Push (Types.I.Int i)]
| Float f -> [Push (Types.I.Float f)]
| Bop (BNSum,e1,e2) -> (compile_numeric_exp e1) @ (compile_numeric_exp e2) @ [Types.I.Plus]
| Bop (BNSub,e1,e2) -> (compile_numeric_exp e1) @ (compile_numeric_exp e2) @ [Types.I.Minus]
| Bop (BNMul,e1,e2) -> (compile_numeric_exp e1) @ (compile_numeric_exp e2) @ [Types.I.Times]
| Bop (BNDiv,e1,e2) -> (compile_numeric_exp e1) @ (compile_numeric_exp e2) @ [Types.I.Div]
| Bop (BNOr,e1,e2) -> (compile_numeric_exp e1) @ (compile_numeric_exp e2) @ [Types.I.Or]
| VarRef n -> [Types.I.MemoryGet (Memory.index_for_name n)]
| VarIncr ((VarRef n) as vr,i) -> (compile_numeric_exp vr) @ [Push (Types.I.Int i);Types.I.Plus;Types.I.Dupe] @ (compile_assignment_to n)
| _ -> []
Parsing is simply not the right place to do type-checking. I don't understand why you insist on doing this in this pass. You would have much clearer code and greater expressive power by doing it in a separate pass.
Is it for efficiency reasons? I'm confident you could devise efficient incremental-typing routines elsewhere, to be called from the grammar production (but I'm not sure you'll win that much). This looks like premature optimization.
There has been work on writing type systems as attribute grammars (which could be seen as a declarative way to express typing derivations), but I don't think it is meant to conflate parsing and typing in a single pass.
If you really want to go further in this direction, I would advise you to use a simple lexical differentiation between num-typed and bool-typed variables. This sounds ugly but is simple.
If you want to treat numeric expressions and boolean expressions as different syntactic categories, then consider how you must parse var x = ( ( y + z ) ). You don't know which type of expression you're parsing until you hit the +. Therefore, you need to eat up several tokens before you know whether you are seeing a numeric_exp_val or a boolean_exp_val: you need some unbounded lookahead. Yacc does not provide such lookahead (Yacc only provides a restricted form of lookahead, roughly described as LALR, which puts bounds on parsing time and memory requirements). There is even an ambiguous case that makes your grammar context-sensitive: with a definition like var x = y, you need to look up the type of y.
You can solve this last ambiguity by feeding back the type information into the lexer, and you can solve the need for lookahead by using a parser generator that supports unbounded lookahead. However, both of these techniques will push your parser towards a point where it can't easily evolve if you want to expand the language later on (for example to distinguish between integer and floating-point numbers, to add strings or lists, etc.).
If you want a simple but constraining fix with a low technological overhead, I'll second gasche's suggestion of adding a syntactic distinguisher for numeric and boolean variable definitions, something like bvar b = … and nvar x = …. There again, this will make it difficult to support other types later on.
You will have an easier time overall if you separate the type checking from the parsing. Once you've built an abstract syntax tree, do a pass of type checking (in which you will infer the types of variables).
type numeric_expression = Nconst of float | Nplus of numeric_expression * numeric_expression | …
and boolean_expression = Bconst of bool | Bor of boolean_expression * boolean_expression | …
type typed_expression = Tnum of numeric_expression | Tbool of boolean_expression
type typed_statement = Tvar of string * typed_expression
let rec type_expression : Syntax.expression -> typed_expression = function
| Syntax.Float x -> Tnum (Nconst x)
| Syntax.Plus (e1, e2) ->
begin match type_expression e1, type_expression e2 with
| Tnum n1, Tnum n2 -> Tnum (Nplus (n1, n2))
| _, (Tbool _ as t2) -> raise (Invalid_argument_type ("+", t2))
| (Tbool _ as t1), _ -> raise (Invalid_argument_type ("+", t1))
end
| …
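The same separate-pass idea can be sketched in Python for the two statements from the question. Everything here is an assumption made for illustration: the node classes, the OPERATORS table, and the rule that a variable takes the type of its initializer are not taken from the question's Syntax module.

```python
from dataclasses import dataclass

@dataclass
class Num:
    value: float

@dataclass
class Bool:
    value: bool

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str
    left: object
    right: object

# Hypothetical operator table: the type each operator takes and returns.
OPERATORS = {"+": "num", "|": "bool"}

def type_of(exp, env):
    """Return "num" or "bool" for an untyped syntax tree, or raise TypeError."""
    if isinstance(exp, Num):
        return "num"
    if isinstance(exp, Bool):
        return "bool"
    if isinstance(exp, Var):
        return env[exp.name]            # type recorded when the variable was defined
    expected = OPERATORS[exp.op]
    for side in (exp.left, exp.right):
        actual = type_of(side, env)
        if actual != expected:
            raise TypeError(f"operand of {exp.op!r} is {actual}, expected {expected}")
    return expected

def check_program(statements):
    """Type-check a list of (name, expression) definitions in order."""
    env = {}
    for name, exp in statements:
        env[name] = type_of(exp, env)   # a variable takes its initializer's type
    return env

program = [
    ("y", Num(1.0)), ("z", Num(2.0)), ("f", Bool(False)),
    ("x", BinOp("+", Var("y"), Var("z"))),      # var x = y + z
    ("b", BinOp("|", Bool(True), Var("f"))),    # var b = true | f
]
print(check_program(program))
```

The parser only has to build one kind of expression node; whether y is numeric or boolean is decided here, where the environment is available.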

What is a 'semantic predicate' in ANTLR?

What is a semantic predicate in ANTLR?
ANTLR 4
For predicates in ANTLR 4, check out these Stack Overflow Q&As:
Syntax of semantic predicates in Antlr4
Semantic predicates in ANTLR4?
ANTLR 3
A semantic predicate is a way to enforce extra (semantic) rules upon grammar
actions using plain code.
There are 3 types of semantic predicates:
validating semantic predicates;
gated semantic predicates;
disambiguating semantic predicates.
Example grammar
Let's say you have a block of text consisting of only numbers separated by
commas, ignoring any white space. You would like to parse this input, making
sure that the numbers are at most 3 digits "long" (at most 999). The following
grammar (Numbers.g) would do such a thing:
grammar Numbers;
// entry point of this parser: it parses an input string consisting of at least
// one number, optionally followed by zero or more comma's and numbers
parse
: number (',' number)* EOF
;
// matches a number that is between 1 and 3 digits long
number
: Digit Digit Digit
| Digit Digit
| Digit
;
// matches a single digit
Digit
: '0'..'9'
;
// ignore spaces
WhiteSpace
: (' ' | '\t' | '\r' | '\n') {skip();}
;
Testing
The grammar can be tested with the following class:
import org.antlr.runtime.*;
public class Main {
public static void main(String[] args) throws Exception {
ANTLRStringStream in = new ANTLRStringStream("123, 456, 7 , 89");
NumbersLexer lexer = new NumbersLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
NumbersParser parser = new NumbersParser(tokens);
parser.parse();
}
}
Test it by generating the lexer and parser, compiling all .java files and
running the Main class:
java -cp antlr-3.2.jar org.antlr.Tool Numbers.g
javac -cp antlr-3.2.jar *.java
java -cp .:antlr-3.2.jar Main
When doing so, nothing is printed to the console, which indicates that nothing
went wrong. Try changing:
ANTLRStringStream in = new ANTLRStringStream("123, 456, 7 , 89");
into:
ANTLRStringStream in = new ANTLRStringStream("123, 456, 7777 , 89");
and do the test again: you will see an error appearing on the console right after the string 777.
Semantic Predicates
This brings us to the semantic predicates. Let's say you want to parse
numbers between 1 and 10 digits long. A rule like:
number
: Digit Digit Digit Digit Digit Digit Digit Digit Digit Digit
| Digit Digit Digit Digit Digit Digit Digit Digit Digit
/* ... */
| Digit Digit Digit
| Digit Digit
| Digit
;
would become cumbersome. Semantic predicates can help simplify this type of rule.
1. Validating Semantic Predicates
A validating semantic predicate is nothing
more than a block of code followed by a question mark:
RULE { /* a boolean expression in here */ }?
To solve the problem above using a validating
semantic predicate, change the number rule in the grammar into:
number
@init { int N = 0; }
: (Digit { N++; } )+ { N <= 10 }?
;
The parts { int N = 0; } and { N++; } are plain Java statements; the first
runs when the parser "enters" the number rule. The actual
predicate is { N <= 10 }?, which causes the parser to throw a
FailedPredicateException
whenever a number is more than 10 digits long.
Test it by using the following ANTLRStringStream:
// all equal or less than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,1234567890");
which produces no exception, while the following does throw an exception:
// '12345678901' is more than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,12345678901");
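The control flow of the validating predicate can be imitated by hand. The Python sketch below is not ANTLR and does not use its API; FailedPredicate is a made-up stand-in for FailedPredicateException, and the input is assumed to contain no white space. It counts digits the way { N++; } does and checks the count only after the whole number has been matched.

```python
class FailedPredicate(Exception):
    """Hypothetical stand-in for ANTLR's FailedPredicateException."""

def match_number(text, pos, max_digits=10):
    """Consume a run of digits starting at pos, then validate its length."""
    n = 0
    while pos + n < len(text) and text[pos + n] in "0123456789":
        n += 1                              # plays the role of { N++; }
    if n == 0:
        raise SyntaxError(f"digit expected at position {pos}")
    if not n <= max_digits:                 # plays the role of { N <= 10 }?
        raise FailedPredicate(f"number at position {pos} has {n} digits")
    return text[pos:pos + n], pos + n

def parse_numbers(text):
    """Match number (',' number)* followed by end of input."""
    numbers = []
    pos = 0
    while True:
        number, pos = match_number(text, pos)
        numbers.append(number)
        if pos == len(text):
            return numbers
        if text[pos] != ",":
            raise SyntaxError(f"',' expected at position {pos}")
        pos += 1

print(parse_numbers("1,23,1234567890"))
try:
    parse_numbers("1,23,12345678901")
except FailedPredicate as err:
    print("failed predicate:", err)
```

Note that the check happens after all eleven digits have already been consumed, which is the defining trait of a validating predicate: it rejects a match after the fact rather than steering the parser.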
2. Gated Semantic Predicates
A gated semantic predicate is similar to a validating semantic predicate,
only the gated version produces a syntax error instead of a FailedPredicateException.
The syntax of a gated semantic predicate is:
{ /* a boolean expression in here */ }?=> RULE
To solve the same problem with a gated predicate, matching numbers up to 10 digits long, you would write:
number
@init { int N = 1; }
: ( { N <= 10 }?=> Digit { N++; } )+
;
Test it again with both:
// all equal or less than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,1234567890");
and:
// '12345678901' is more than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,12345678901");
and you will see the last one will throw an error.
3. Disambiguating Semantic Predicates
The final type of predicate is a disambiguating semantic predicate, which looks a bit like a validating predicate ({boolean-expression}?), but acts more like a gated semantic predicate (no exception is thrown when the boolean expression evaluates to false). You can use it at the start of a rule to check some property of a rule and let the parser match said rule or not.
Let's say the example grammar creates Number tokens (a lexer rule instead of a parser rule) that will match numbers in the range of 0..999. Now in the parser, you'd like to make a distinction between low and high numbers (low: 0..500, high: 501..999). This could be done using a disambiguating semantic predicate where you inspect the token next in the stream (input.LT(1)) to check if it's either low or high.
A demo:
grammar Numbers;
parse
: atom (',' atom)* EOF
;
atom
: low {System.out.println("low = " + $low.text);}
| high {System.out.println("high = " + $high.text);}
;
low
: {Integer.valueOf(input.LT(1).getText()) <= 500}? Number
;
high
: Number
;
Number
: Digit Digit Digit
| Digit Digit
| Digit
;
fragment Digit
: '0'..'9'
;
WhiteSpace
: (' ' | '\t' | '\r' | '\n') {skip();}
;
If you now parse the string "123, 999, 456, 700, 89, 0", you'd see the following output:
low = 123
high = 999
low = 456
high = 700
low = 89
low = 0
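The decision the disambiguating predicate makes can be mimicked in a few lines of Python. This is a hand-written analogy, not ANTLR: the classify function is hypothetical, and splitting on commas stands in for the lexer. The point is that the predicate is evaluated on the lookahead token before an alternative is chosen, and a false result simply selects the other alternative instead of raising an error.

```python
def classify(text):
    """Label each number as low or high by testing the next token first."""
    tokens = [t.strip() for t in text.split(",")]   # stand-in for the lexer
    lines = []
    for lookahead in tokens:                        # plays the role of input.LT(1)
        if int(lookahead) <= 500:                   # predicate true: take `low`
            lines.append("low = " + lookahead)
        else:                                       # predicate false: take `high`
            lines.append("high = " + lookahead)
    return lines

for line in classify("123, 999, 456, 700, 89, 0"):
    print(line)
```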
I've always used the terse reference to ANTLR predicates on wincent.com as my guide.