Shunting yard (Reverse Polish Notation/Postfix) operator precedence - operators

I'm trying to figure out what the precedence is for the different operators when implementing the shunting-yard algorithm.
My abstract syntax tree is in infix and I'm evaluating it using the shunting-yard algorithm. This works just fine for the arithmetic operators. The issue I'm facing is that I don't know what precedence all the other operators have.
From https://en.wikipedia.org/wiki/Shunting-yard_algorithm I can see that the following is true for these operators. The number is the precedence.
^ 4
* 3
/ 3
+ 2
− 2
But I cannot seem to find anything that describes the precedence for the relational and logical operators. I've searched a lot for an answer.
Can somebody give me the complete picture of the precedence for all of these operators:
a. Function call
b. (
c. ,
d. +, -
e. *, /
f. ^
g. =, <>, <, <=, >, >=
h. NOT
i. AND
j. OR
Thanks in advance.
/Brian

Take a look at the Mathematica Operator Input Forms which shows the operator input forms, in order of decreasing precedence. Operators of equal precedence are grouped together.
You can determine the "precedence" like this in Mathematica:
Precedence[Power] gives 590
Precedence[Times] gives 400
Precedence[Plus] gives 310
Precedence[Equal] gives 290
Precedence[Not] gives 230
Precedence[And] gives 215
Precedence[Or] gives 215
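As a rough illustration of how such a ranking plugs into shunting yard, here is a minimal Python sketch. The numeric levels in OPS are illustrative assumptions that follow the ordering of the Mathematica values above, except that AND is placed above OR, as most programming languages do. The names OPS, PREFIX and to_rpn are hypothetical, and function calls and the comma are left out to keep the sketch short.

```python
# (precedence, associativity); a larger number binds tighter.
# The levels are assumptions for illustration, not an official table.
OPS = {
    "OR": (1, "L"), "AND": (2, "L"), "NOT": (3, "R"),
    "=": (4, "L"), "<>": (4, "L"), "<": (4, "L"),
    "<=": (4, "L"), ">": (4, "L"), ">=": (4, "L"),
    "+": (5, "L"), "-": (5, "L"),
    "*": (6, "L"), "/": (6, "L"),
    "^": (7, "R"),
}
PREFIX = {"NOT"}  # prefix operators have no left operand, so they never pop


def to_rpn(tokens):
    """Convert a list of infix tokens to a list of postfix (RPN) tokens."""
    output, stack = [], []
    for tok in tokens:
        if tok in PREFIX:
            stack.append(tok)
        elif tok in OPS:
            prec, assoc = OPS[tok]
            # Pop operators that bind tighter (or equally, if left-associative).
            while stack and stack[-1] in OPS:
                top_prec = OPS[stack[-1]][0]
                if top_prec > prec or (top_prec == prec and assoc == "L"):
                    output.append(stack.pop())
                else:
                    break
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parentheses")
            stack.pop()  # discard the "("
        else:
            output.append(tok)  # operand
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return output


print(" ".join(to_rpn("a + b * c ^ d ^ e".split())))
print(" ".join(to_rpn("NOT a = b OR c AND d".split())))
```

With these levels, NOT a = b OR c AND d groups as (NOT (a = b)) OR (c AND d).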


Kotlin: Why do these two implementations of log base 10 give different results on specific inputs?

println(log(it.toDouble(), 10.0).toInt()+1) // n1
println(log10(it.toDouble()).toInt() + 1) // n2
I had to count the "length" of a number in base n (for reasons unrelated to this question) and stumbled upon a bug (or rather unexpected behavior): for it == 1000 these two functions give different results.
n1(1000) = 3,
n2(1000) = 4.
Checking values before conversion to int resulted in:
n1_double(1000) = 3.9999999999999996,
n2_double(1000) = 4.0
I understand that some floating-point arithmetic magic is involved, but what is especially weird to me is that for 100, 10000 and the other inputs I checked, n1 == n2.
What is special about it == 1000? How can I ensure that log gives me the intended result (4, not 3.99..)? Right now I can't even figure out which cases I need to double-check, since it is not all powers of 10, it is specifically 1000 (and probably some other numbers).
I looked into implementation of log() and log10() and log is implemented as
if (base <= 0.0 || base == 1.0) return Double.NaN
return nativeMath.log(x) / nativeMath.log(base) //log() here is a natural logarithm
while log10 is implemented as
return nativeMath.log10(x)
I suspect the division in the first case is the cause of the error, but I can't figure out why it causes an error only in specific cases.
I also found this question:
Python math.log and math.log10 giving different results
But I already know that one is more precise than the other. However, there is no analogue of log10 for an arbitrary base n, so I'm curious WHY it is specifically 1000 that goes wrong.
PS: I understand there are methods of calculating the length of a number without floating-point arithmetic and base-n logarithms, but at this point it is a matter of scientific curiosity.
but I can't figure out why it causes an error only in specific cases.
return nativeMath.log(x) / nativeMath.log(base)
//log() here is a natural logarithm
Consider x = 1000 and nativeMath.log(x). The natural logarithm is not exactly representable. It is near
6.90775527898213_681... (Double answer)
6.90775527898213_705... (closer answer)
Consider base = 10 and nativeMath.log(base). The natural logarithm is not exactly representable. It is near
2.302585092994045_901... (Double)
2.302585092994045_684... (closer answer)
The only exactly correct nativeMath.log(x) for a finite x is when x == 1.0.
The quotient of the division of 6.90775527898213681... / 2.302585092994045901... is not exactly representable. It is near 2.9999999999999995559...
The conversion of the quotient to text is not exact.
So we have 4 computation errors with the system giving us a close (rounded) result instead at each step.
Sometimes these rounding errors cancel out in a way we find acceptable and the value of "3.0" is reported. Sometimes not.
Computed with higher-precision math, it is easy to see that the double log(1000) was less than the higher-precision answer and that the double log(10) was more. These two round-off errors in opposite directions both pushed the quotient down, leaving it 1 ULP lower than hoped.
When log(x, 10) is computed for other powers of 10, and log(x) is slightly more than the higher-precision answer, I'd expect the quotient to end up with a 1 ULP error less often. Perhaps it will be 50/50 across all powers of 10.
log10(x) is designed to compute the logarithm in a different fashion, exploiting that the base is 10.0 and certainly exact for powers-of-10.
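The same effect can be observed outside Kotlin. The Python sketch below prints the quotient of two natural logarithms next to the dedicated log10, and adds a hypothetical helper digit_count that sidesteps floating point by using only integer division. The helper is an illustration, not something from the Kotlin standard library, and the printed floating-point values may depend on the platform's math library.

```python
import math


def digit_count(n, base=10):
    """Number of digits of a non-negative integer n in the given base (integer math only)."""
    if n < 0 or base < 2:
        raise ValueError("expected n >= 0 and base >= 2")
    count = 1
    while n >= base:
        n //= base
        count += 1
    return count


# Quotient of two rounded natural logs: may land just below the exact value.
print(math.log(1000) / math.log(10))
# Dedicated base-10 logarithm.
print(math.log10(1000))
# Integer-only digit count.
print(digit_count(1000))  # 4
```

Because digit_count never leaves integer arithmetic, it works for any base without needing a per-base equivalent of log10.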

what is the regular expression for the automaton?

How do I find the regular expression for this automaton?
There are basically three methods. Two are given in the book by Hopcroft and Ullman.
1) Calculate R_ij recursively for all states i,j going through only k intermediate states.
2) Calculate regular expression by eliminating states between start state and a single final state. This is simpler than the first method.
3) Analyse and try to understand the DFA itself.
If we eliminate the middle state we have following
a_02 = (aa+ba)*(ab+bb) : Regular expression from state 0 to 2 without using arrow from state 2 to 0.
a_00 = (aa+ba)* : Regular expression from state 0 to 0 without using arrow from state 2 to 0.
We can now use the expression given in the textbook, but at this stage we can also analyse the automaton directly and come up with a solution. Every return from state 2 to state 0 reads one symbol (a+b), so finally R_02 becomes
((aa+ba)*(ab+bb)(a+b))*(aa+ba)*(ab+bb)
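One way to gain confidence in such a result is to compare it with the automaton on all short words. The figure is not reproduced here, so the transition table in this Python sketch is an assumption reconstructed from the expressions above: state 0 goes to the middle state 1 on a or b, state 1 goes back to 0 on a and on to 2 on b, and state 2 returns to 0 on a or b. The names DELTA, dfa_accepts and mismatches are hypothetical.

```python
import re
from itertools import product

# Assumed DFA (reconstructed from the expressions, not from the original figure):
# state 0 is the start state, state 2 is the only accepting state.
DELTA = {
    (0, "a"): 1, (0, "b"): 1,
    (1, "a"): 0, (1, "b"): 2,
    (2, "a"): 0, (2, "b"): 0,
}

# R_02 from the text, with '+' written as '|' for Python's re module.
R_02 = re.compile(r"((aa|ba)*(ab|bb)(a|b))*(aa|ba)*(ab|bb)")


def dfa_accepts(word):
    """Run the assumed DFA on word and report whether it ends in state 2."""
    state = 0
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == 2


def mismatches(max_len):
    """Return all words over {a, b} up to max_len where the DFA and R_02 disagree."""
    bad = []
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            if dfa_accepts(word) != (R_02.fullmatch(word) is not None):
                bad.append(word)
    return bad


print(mismatches(8))
```

An empty list means the expression and the assumed automaton agree on every word of up to eight symbols. That is a sanity check, not a proof.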

is it possible to separate the concept of precedence and association in yacc

I would like to have a clear example of precedence and one of associativity in yacc, but I am still having trouble separating these two concepts.
Perhaps this is because I associate both concepts with math and mathematical operations. These are two old examples I built:
Associativity (*) is used to specify the kind of association to be applied (left,right, non assoc....)
In fact
%left '+' '*'
instructs yacc that plus and multiplication are left-associative. So far, so good (not exactly, but it serves the purpose of the example).
Precedence (**) is used to give precedence to one operator over another.
%left '+'
%left '*'
the multiplication has higher precedence than plus operation.
So we got the wanted parsing action for E+E*E
E+(E*E) in case of (**)
(E+E)*E in case of only (*) --> this is clearly wrong - but it's fine for the example
So the question is, can I separate clearly associativity from precedence without using the concept of associativity?
Even non-associativity implies knowledge of associativity, so how, if at all, can I talk about them separately?
No. In a parser definition, associativity is just a small detail within the precedence algorithm.
To understand that, it's important to understand what precedence actually means, in parsing terms.
A left-to-right shift-reduce parser has a stack and an input stream. Initially, the stack is empty, and the input stream contains the input to be parsed. The SR parser repeatedly does one of the following two actions until the stack consists only of the start symbol and the input stream is empty (in which case the parse has succeeded), or neither action is possible (in which case the parse has failed):
reduce the production whose right-hand side is on the top of the stack by popping the right-hand side off of the stack and pushing the left-hand side non-terminal;
shift one input symbol from the input onto the stack.
It's an important feature of this framework that reductions can only occur when the production's right-hand side is on the top of the stack.
The shift action is always possible unless the input stream is exhausted, but a reduce action can only be taken if the top of the stack precisely matches the right-hand side of some production.
Different ways of building SR parsers will involve different mechanisms for deciding which action to take in any given stack configuration. One such mechanism is the precedence algorithm. Some very simple languages can be SR parsed only with the precedence algorithm. In other cases, it can be used as an auxiliary decision algorithm in order to resolve ambiguous grammar specifications; this is the use case for precedence in yacc-derived parser generators.
For precedence to work, it is necessary that at most one reduction action be possible in any stack configuration, which means that there cannot be two productions with the same right-hand side. [Note 1]
Given that there is at most one possible reduction action and at most one possible shift action (since the next input symbol, if any, is given), the only issue is deciding whether to shift or reduce. The precedence algorithm involves a precedence function PREC(A→α, a) ⇒ { SHIFT, REDUCE }, whose arguments are a production A→α and a terminal symbol a, which are mapped onto either SHIFT or REDUCE.
Although the precedence relationship is usually written as though it were a comparison, it is not a normal comparison operator because the two arguments are from different domains. It always involves a production and a terminal.
In simple cases, however, it is possible to implement PREC using numeric comparisons. To do that, we define two functions which map productions and terminals, respectively, onto integers: f(A→α) and g(a). We use those to compute PREC:
PREC(A→α, a) ≡
REDUCE if f(A→α) > g(a)
SHIFT if f(A→α) < g(a)
[Note 2]
In any event, the precedence algorithm for a given stack configuration is:
Identify the production P (=A→α) of the possible reduce action, if any.
If only a shift or only a reduce is possible, do that. Otherwise, if both a reduce and a shift are possible, compute PREC(P, input) and reduce using P if the result is REDUCE; otherwise, shift input.
Now that might seem confusing, since most descriptions of precedence relations describe them as though they compared terminals, rather than a production with a terminal. That's because it is normal to "name" each production using the last terminal in the production. Usually, that is unambiguous, because of the restriction on production right-hand sides: since any two right-hand sides must differ, it is likely that all production right-hand sides have different terminal symbols. [Note 3]
Although that short-hand allows us to say, for example, that "* has higher precedence than +" instead of the somewhat more cumbersome "the production E→E*E has precedence over the terminal +", it is important to remember that the latter statement is what we really mean.
Precedence also applies to single operators. With most operators, we prefer to group from left to right, so that E-E-E should be parsed as though it had been written (E-E)-E. However, some operators like exponentiation group to the right, meaning that E**E**E should be parsed as E**(E**E). This is simple to define using the PREC function; for a left-grouping operator ⊕, we'll have:
PREC(E→E⊕E, ⊕) ≡ REDUCE
while a right-grouping operator ⊗ would have
PREC(E→E⊗E, ⊗) ≡ SHIFT
That's clear when we use the actual arguments to PREC, but it becomes confusing when we use the shorthand notation, which leaves us trying to say that ⊕ has higher precedence than ⊕ while ⊗ has lower precedence than ⊗. To avoid the ambiguity and still let us get away with the shorthand, we describe ⊕ as "left-associative" (%left) and ⊗ as "right-associative" (%right). But the implementation is simply an application of the normal precedence algorithm.
As an example, consider the simple expression language:
E → E + E
E → E * E
E → E ** E
E → id
Here we expect * to bind more tightly than + with ** binding tightest; the first two group to the left while exponentiation groups to the right. To achieve that, we can assign f and g functions as follows:
Production      f(Production)    Terminal    g(Terminal)
E → E + E       2                +           1
E → E * E       4                *           3
E → E ** E      5                **          6
E → id          8                id          7
Yacc-generated grammars don't use precedence to decide when to reduce the E→id production, but the above will work since the grammar can be parsed completely using only the precedence algorithm.
Parentheses can easily be added; I'll leave that as an exercise.
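To make the f/g comparison concrete, here is a small Python sketch of the decision loop for the operator productions. It is an illustration under assumptions, not how yacc is implemented: id tokens are shifted as if they had already been reduced to E (so the E → id row is not needed), the tree is a nested tuple, and the names F, G, is_operator and parse are hypothetical.

```python
# f(production) and g(terminal) from the table; a production is named by its operator.
F = {"+": 2, "*": 4, "**": 5}
G = {"+": 1, "*": 3, "**": 6}


def is_operator(item):
    return isinstance(item, str) and item in F


def parse(tokens):
    """Return a nested (op, left, right) tuple for a list of id and operator tokens."""
    stack, pos = [], 0
    while True:
        lookahead = tokens[pos] if pos < len(tokens) else None
        # E -> E op E is reducible when the stack ends in: operand, operator, operand.
        can_reduce = (len(stack) >= 3 and is_operator(stack[-2])
                      and not is_operator(stack[-1])
                      and not is_operator(stack[-3]))
        if can_reduce and lookahead is not None and lookahead not in G:
            raise SyntaxError("operator expected before %r" % (lookahead,))
        if can_reduce and (lookahead is None or F[stack[-2]] > G[lookahead]):
            # PREC says REDUCE (or the input is exhausted).
            right, op, left = stack.pop(), stack.pop(), stack.pop()
            stack.append((op, left, right))
        elif lookahead is not None:
            # PREC says SHIFT (or no reduction is possible).
            stack.append(lookahead)
            pos += 1
        else:
            break
    if len(stack) != 1 or is_operator(stack[0]):
        raise SyntaxError("malformed expression")
    return stack[0]


print(parse("a + b + c".split()))
print(parse("a ** b ** c".split()))
print(parse("a + b * c".split()))
```

Note that the left grouping of + and the right grouping of ** both come out of the same comparison f > g. There is no separate associativity step.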
Notes
[Note 1] There might be some other mechanism to decide between reduction actions, so the restriction is only absolute for a parser which only uses precedence. There might also be some other mechanism to restrict possible shift actions. For example, for a shift to be feasible, the tokens on the top of the stack need to eventually be reduced, which means that some suffix of the stack must be a prefix of the right-hand side of some production. Similarly, a reduction is only feasible if, post-reduce, some suffix of the stack is the prefix of the right-hand side of some production.
[Note 2] You'll see formulations using < and ≥ (or ≤ and >), but to avoid confusion, I'm assuming that the ranges of f and g are different sets of integers. Since the functions are arbitrary, this does not restrict generality.
[Note 3] That's not always the case. For example, languages which allow - to be either a unary or a binary operator will have productions with right-hand sides - E and E - E. Yacc-derived parser generators use the %prec TERMINAL declaration to associate a production with a terminal other than the default.
This is all very confused.
Associativity ... Is used to give precedence to one operator over another
No. Absolutely not. Associativity is used to determine which order two adjacent instances of the same operator are evaluated in. (E+E)+E or E+(E+E). All arithmetic operators except exponentiation are left-associative in mathematics.
%left '+' '*'
This says that + and * are both left-associative and have the same precedence, because they are both on the same line. And it is therefore wrong.
can I separate clearly associativity from precedence without using the concept of associativity
I'm sorry but this is just meaningless.

Compiler Type Promotion of Right Hand Side expressions automatically in an Assignment Statement

Why does a compiler not type promote all evaluations of expressions in the right hand side of an assignment expression to at least the left hand sides type level?
e.g.
"double x = (88.0 - 32) * 5 / 9" converts to Celsius from Fahrenheit correctly but...
"double x = (88.0 - 32) * (5 / 9)" will not.
My question is not why the second example does not return the desired result. My question is why does the compiler not type promote the evaluation of (5/9) to that of a double.
Why does a compiler not type promote all evaluations of expressions in
the right hand side of an assignment expression to at least the left
hand sides type level?
Very good question. Actually, let's suppose for a moment that the compiler does this automatically. Now, taking your example:
double x = 88.0 - 32 * 5 / 9
Now the RHS of this assignment can be converted to double in several ways. Here are some of them:
88.0 - 32 * (double)(5 / 9)
88.0 - 32 * 5 / 9 // default rule
88.0 - (double)(32 * 5) / 9
Individually casting to double every token that is not already a double.
Several other ways.
This turns into a combinatorial problem: "In how many ways can a given expression be converted to double (or whatever the target type is)?"
Compiler designers are unlikely to take on the burden of converting every token to the widest type (double here), given the extra memory this would use. It would also be an arbitrary choice, since users can get the result they want by giving the compiler explicit hints, that is, by writing the casts themselves.
Converting everything automatically will not always yield the result the user wants. The opposite approach, which compilers use today, serves much better: promote only where the operand types require it. The current rules take some extra effort from the programmer, but they work reliably.
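The two expressions from the question can be tried out in a language-neutral way. In the Python sketch below, the // operator on ints stands in for C-style integer division (an assumption that holds for positive operands), and the variable names are hypothetical. The point it illustrates is that each subexpression is typed from its own operands, before the assignment target is ever considered.

```python
# Python's // on ints stands in for C-style integer division (positive operands).
promoted = (88.0 - 32) * 5 / 9      # every step already involves a float
truncated = (88.0 - 32) * (5 // 9)  # 5 // 9 is evaluated on ints first and gives 0

print(promoted)
print(truncated)
```

The second expression loses its information inside the parentheses. Converting the final result to double afterwards cannot bring it back.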

How different programming languages handle division by 0?

Perhaps this is the wrong sort of question to ask here but I am curious. I know that many languages will simply explode and fail when asked to divide by 0, but are there any programming languages that can intelligently handle this impossible sum - and if so, what do they do? Do they keep processing, treating 350/0 as 350, or stop execution, or what?
The little-known Java programming language gives the special constant Double.POSITIVE_INFINITY or Double.NEGATIVE_INFINITY (depending on the numerator) when you divide by zero in an IEEE floating-point context. Integer division by zero is undefined, and results in an ArithmeticException being thrown, which is quite different from your scenario of "explosion and failure".
The INTERCAL standard library returns #0 on divide by zero
From Wikipedia:
The infinities of the extended real number line can be represented in IEEE floating point datatypes, just like ordinary floating point values like 1, 1.5 etc. They are not error values in any way, though they are often (but not always, as it depends on the rounding) used as replacement values when there is an overflow. Upon a divide by zero exception, a positive or negative infinity is returned as an exact result.
In Java, division by zero in a floating-point context produces the special value Double.POSITIVE_INFINITY or Double.NEGATIVE_INFINITY.
I'd be surprised if any language returns 350 if you do 350/0. Just two examples: Java throws an exception that can be caught. In C and C++, integer division by zero is undefined behavior; in practice it usually crashes (on POSIX systems typically by raising the SIGFPE signal, which can be caught).
In Delphi, it either produces a compile-time error (if you divide by a constant 0) or a catchable runtime error if it happens at runtime.
It's the same for C and C++.
In PHP you will get a warning:
Warning: Division by zero in <file.php> on line X
So, in PHP, for something like:
$i = 123 / 0;
$i will be set to nothing. BUT $i is not === NULL and isset($i) returns true and is_string($i) returns false.
Python (at least version 2, I don't have 3) throws a ZeroDivisionError, which can be caught.
# Python 2 syntax (print statement)
num = 42
try:
    for divisor in (1, 0):
        ans = num / divisor
        print ans
except ZeroDivisionError:
    print "Trying to divide by 0!"
prints out:
42
Trying to divide by 0!
Most SQL implementations raise a "division by zero" error, but MySQL just returns NULL
IEEE floating point defines special values such as NaN. Once a NaN appears, further arithmetic on it generally yields NaN again, so it carries through to the end of the computation. Integers are different: in Java, for example, an exception is thrown.
In Pony, division by 0 is 0, but I have yet to find a language where 0/0 is 1.
I'm working with polyhedra and trying to choose a language that likes inf.
The total edges for a polyhedron {a,b} where a is edges per polygon and b is edges per corner is
E = 1/(1/a + 1/b - 1/2)
If E is negative, the tiling has negative curvature, but if E is infinity (1/0), it tiles the plane. Examples: {3,6} {4,4}
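As a sketch of how this formula could be handled in a language that raises on division by zero, the Python example below computes the denominator exactly with fractions.Fraction and returns math.inf when it is zero. The function name edge_count is hypothetical. Mapping a zero denominator to positive infinity is a choice made for this illustration, not something Python's / operator does by itself.

```python
import math
from fractions import Fraction


def edge_count(a, b):
    """E = 1/(1/a + 1/b - 1/2) for {a,b}; math.inf when the denominator is exactly zero."""
    denom = Fraction(1, a) + Fraction(1, b) - Fraction(1, 2)
    if denom == 0:
        return math.inf
    return float(1 / denom)


for a, b in [(4, 3), (3, 6), (4, 4), (3, 7)]:
    print((a, b), edge_count(a, b))
```

Using exact fractions avoids the case where floating-point rounding leaves a denominator that is tiny but non-zero, which would give a huge finite E instead of infinity.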