How to modify unary operators in OCaml? - operators

In OCaml, it is very easy to access to a binary operator, such as "+":
# (+);;
( + ) : int -> int -> int
And modify them as I wish:
# let (+) = (+.);;
( + ) : float -> float -> float
In the Pervasives documentation, it says that (~-) is the unary operator corresponding to (-), meaning that ~-5 returns - :int = -5.
It is easy to modify (~-) as well:
let (~-) = (~-.);;
(~-) : float -> float
Fortunately, OCaml allows the user to use (-) as an alias for (~-):
Suppose we have defined
foo : int -> int -> int
We can call
foo 1 (-1);;
which is way better than
foo 1 (~-1);;
Well, the problem is, when I change (~-) definition, it doesn't affect the unary operator (-)...
let (~-) x = 5;;
~-2;;
- : int = 5
-2;;
- : int = -2
Any idea how to modify the unary (-) as well ?

As you said, unary (-) is a shortcut for (~-). Actually, your change has affected unary (-); for example, you have many ways to use (-) as you want after overriding (~-):
# - (2+0);;
- : int = 0
# let a = 2;;
val a : int = 2
# -a;;
- : int = 0
So it works when you pass an expression to (-). In case you call -2, it will be parsed as a value, not a function application. It makes sense to follow the normal convention of negative numbers.
BTW, I don't recommend you to use this way of overriding operators. Since you have change (-) for any datatype having that operator, it may lead to confusion and strange bugs.

Here's a code extract from the compiler that seems to handle this case. Note that it only works for floating point constants. It's not a general feature of the - operator.
signed_constant:
constant { $1 }
| MINUS INT { Const_int(- $2) }
| MINUS FLOAT { Const_float("-" ^ $2) }
| MINUS INT32 { Const_int32(Int32.neg $2) }
| MINUS INT64 { Const_int64(Int64.neg $2) }
(You'll find this code in parser.mly.)
I don't think there's a way to get what you want without hacking on the compiler at this spot.

Related

Simple arithmetic parser created with Kotlin and better-parse fails

Please have a look at the following parser I implemented in Kotlin using the parser combinator library better-parse. It parses—should parse, rather—simple arithmetic expressions:
object CalcGrammar: Grammar<Int>() {
val num by regexToken("""\d+""")
val pls by literalToken("+")
val min by literalToken("-")
val mul by literalToken("*")
val div by literalToken("/")
val lpr by literalToken("(")
val rpr by literalToken(")")
val wsp by regexToken("\\s+", ignore = true)
// expr ::= term + expr | term - expr | term
// term ::= fact * term | fact / term | fact
// fact ::= (expr) | -fact | int
val fact: Parser<Int> by
skip(lpr) and parser(::expr) and skip(rpr) or
(skip(min) and parser(::fact) map { -it }) or
(num map { it.text.toInt() })
val term by
leftAssociative(fact, mul) { a, _, b -> a * b } or
leftAssociative(fact, div) { a, _, b -> a / b } or
fact
val expr by
leftAssociative(term, pls) { a, _, b -> a + b } or
leftAssociative(term, min) { a, _, b -> a - b } or
term
override val rootParser by expr
}
However, when I parse -2 + 4 - 5 + 6, I get this ParseException:
Could not parse input: UnparsedRemainder(startsWith=min#8 for "-" at 7 (1:8))
At first, I thought the issue was that I have not codified the self-recursion in the expr productions, i.e., the code does not accurately represent the grammar:
// Reference to `expr` is missing on the right-hand side...
val expr = ... leftAssociative(term, min) ...
// ...although the corresponding production defines it.
// expr ::= ... term - expr ...
But then I noticed that the official example for an arithmetic parser provided as part of the library's documentation—which turns out to be almost identical afaict—also omits this.
What did I do wrong if not this? And how can I make it work?
I am not entirely familiar with this library, but it seems your translation from grammar to code is too literal for the library's syntax and the library actually implicitly handles much of what you are explicitly writing, and it seems that as a part of this "ease of use" it breaks what is seemingly correct code.
For starters, you will find that your code behaves the exact same way whether or not you chain or term on to the end of expr, and or fact on to the end of term.
Based on this I was able to come to the conclusion that these ors are not working as expected when chaining together the leftAssociatives, in fact you would run in to the same problem had you attempted to parse a division. I believe it is for this reason that the example you provided a link to combines addition and subtraction (as with multiplication and division) into a single, more dynamic, leftAssociative call. And if you copy that same work to your own code, it runs flawlessly:
object CalcGrammar: Grammar<Int>() {
val num by regexToken("""\d+""")
val pls by literalToken("+")
val min by literalToken("-")
val mul by literalToken("*")
val div by literalToken("/")
val lpr by literalToken("(")
val rpr by literalToken(")")
val wsp by regexToken("\\s+", ignore = true)
// expr ::= term + expr | term - expr | term
// term ::= fact * term | fact / term | fact
// fact ::= (expr) | -fact | int
val fact: Parser<Int> by
skip(lpr) and parser(::expr) and skip(rpr) or
(skip(min) and parser(::fact) map { -it }) or
(num map { it.text.toInt() })
private val term by
leftAssociative(fact, div or mul use { type }) { a, op, b ->
if (op == div) a / b else a * b
}
private val expr by
leftAssociative(term, pls or min use { type }) { a, op, b ->
if (op == pls) a + b else a - b
}
override val rootParser by expr
}

Rational numbers in Raku

I am using Raku for some computations, because it has nice numeric types. However, I have an issue with using '.raku'
say (1/6+1/6).raku
#<1/3>
We obtain this. However,
say (1/10+1/10).raku
#0.2
Is it a bug? I expected <1/5>. What happens?
In Raku, 0.2 constructs a Rat, and thus produces the very same result as writing 1/5 (which will be constant folded) or <1/5> (the literal form). You only get floating point in the case of specifying an exponent (for example, 2e-1).
The .raku (formerly known as the .perl) method's job is to produce something that will roundtrip and produce the same value if EVAL'd. In the case of 1/5, that can be exactly represented as a decimal, so it will produce 0.2. It only resorts to the fractional representation when a decimal form would not round-trip.
You can always recover the numerator and denominator using the .numerator and .denominator methods to format as you wish. Additionally .nude method returns a list of the numerator and denominator, which one can join with a / if wanted:
say (1/6+1/6).nude.join("/"); # 1/3
say (1/10+1/10).nude.join("/"); # 1/5
Hi #milou123 I was also a bit surprised that raku reverts to decimal representation - I can see that some contexts - such as teaching fractional arithmetic would benefit from having a "keep as rat" mode. Having said that, ultimately it makes sense that there is only one way to .raku something and that decimal is the default representation.
Of course, with raku, you can also just change the language a bit. In this case, I have invented a new '→' postfix operator...
multi postfix:<→> ( Rat:D $r ) { $r.nude.join("/") }
say (1/5+1/5)→; # 2/5
I am not smart enough to work out if the built in 'raku' method can be overridden in a similar way, would be keen to see advice on how to do that concisely...
try this in Julia:
julia> 1 // 10 + 1 // 10
1//5
julia> typeof(1 // 10 + 1 // 10)
Rational{Int64}
julia> 1 // 2 + 1 // 3
5//6
julia> typeof(1 // 2 + 1 // 3)
Rational{Int64}
in the Rat.pm6 implemention, we can only call .raku method on Rat type to get the expected format:
multi method raku(Rat:D: --> Str:D) {
if $!denominator == 1 {
$!numerator ~ '.0'
}
else {
my $d = $!denominator;
unless $d == 0 {
$d = $d div 5 while $d %% 5;
$d = $d div 2 while $d %% 2;
}
if $d == 1 and (my $b := self.base(10,*)).Numeric === self {
$b;
}
else {
'<' ~ $!numerator ~ '/' ~ $!denominator ~ '>'
}
}
}

What is the point of coercions like Int(Cool)?

The Perl 6 Web site on functions says
Coercion types can help you to have a specific type inside a routine, but accept wider input. When the routine is called, the argument is automatically converted to the narrower type.
sub double(Int(Cool) $x) {
2 * $x
}
say double '21'; # 42
say double Any; # Type check failed in binding $x; expected 'Cool' but got 'Any'
Here the Int is the target type to which the argument will be coerced, and Cool is the type that the routine accepts as input.
But what is the point for the sub? Isn't $x just an Int? Why would you restrict the caller to implement Cool for the argument?
I'm doubly confused by the example because Int already is Cool. So I did an example where the types don't share a hierarchy:
class Foo { method foomethod { say 'foomethod' } }
class Bar {}
class Quux is Foo {
# class Quux { # compile error
method Bar { Bar.new }
}
sub foo(Bar(Foo) $c) {
say $c.WHAT; # (Bar)
# $c.foomethod # fails if uncommented: Method 'foomethod' not found for invocant of class 'Bar'
}
foo(Quux.new)
Here the invocant of foo is restricted to provide a Foo that can be converted to a Bar but foo cannot even call a method of Foo on $c because its type is Bar. So why would foo care that the to-be-coerced type is a Foo in the first place?
Could someone shed some light on this? Links to appropriate documentation and parts of the spec are appreciated as well. I couldn't find anything useful there.
Update Having reviewed this answer today I've concluded I had completely misunderstood what #musiKk was getting at. This was revealed most clearly in #darch's question and #musiKk's response:
#darch: Or is your question why one might prefer Int(Cool) over Int(Any)? If that's the case, that would be the question to ask.
#musiKk: That is exactly my question. :)
Reviewing the many other answers I see none have addressed it the way I now think it warrants addressing.
I might be wrong of course so what I've decided to do is leave the original question as is, in particular leaving the title as is, and leave this answer as it was, and instead write a new answer addressing #darch's reformulation.
Specify parameter type, with no coercion: Int $x
We could declare:
sub double (Int $x) { ... } # Accept only Int. (No coercion.)
Then this would work:
double(42);
But unfortunately typing 42 in response to this:
double(prompt('')); # `prompt` returns the string the user types
causes the double call to fail with Type check failed in binding $x; expected Int but got Str ("42") because 42, while looking like a number, is technically a string of type Str, and we've asked for no coercion.
Specify parameter type, with blanket coercion: Int() $x
We can introduce blanket coercion of Any value in the sub's signature:
sub double (Int(Any) $x) { ... } # Take Any value. Coerce to an Int.
Or:
sub double (Int() $x) { ... } # Same -- `Int()` coerces from Any.
Now, if you type 42 when prompted by the double(prompt('')); statement, the run-time type-check failure no longer applies and instead the run-time attempts to coerce the string to an Int. If the user types a well-formed number the code just works. If they type 123abc the coercion will fail at run-time with a nice error message:
Cannot convert string to number: trailing characters after number in '123⏏abc'
One problem with blanket coercion of Any value is that code like this:
class City { ... } # City has no Int coercion
my City $city;
double($city);
fails at run-time with the message: "Method 'Int' not found for invocant of class 'City'".
Specify parameter type, with coercion from Cool values: Int(Cool) $x
We can choose a point of balance between no coercion and blanket coercion of Any value.
The best class to coerce from is often the Cool class, because Cool values are guaranteed to either coerce nicely to other basic types or generate a nice error message:
# Accept argument of type Cool or a subclass and coerce to Int:
sub double (Int(Cool) $x) { ... }
With this definition, the following:
double(42);
double(prompt(''));
works as nicely as it can, and:
double($city);
fails with "Type check failed in binding $x; expected Cool but got City (City)" which is arguably a little better diagnostically for the programmer than "Method 'Int' not found for invocant of class 'City'".
why would foo care that the to-be-coerced type is a Foo in the first place?
Hopefully it's now obvious that the only reason it's worth limiting the coerce-from-type to Foo is because that's a type expected to successfully coerce to a Bar value (or, perhaps, fail with a friendly message).
Could someone shed some light on this? Links to appropriate documentation and parts of the spec are appreciated as well. I couldn't find anything useful there.
The document you originally quoted is pretty much all there is for enduser doc. Hopefully it makes sense now and you're all set. If not please comment and we'll go from there.
What this does is accept a value that is a subtype of Cool, and tries to transform it into an Int. At that point it is an Int no matter what it was before.
So
sub double ( Int(Cool) $n ) { $n * 2 }
can really be thought of as ( I think this is how it was actually implemented in Rakudo )
# Int is a subtype of Cool otherwise it would be Any or Mu
proto sub double ( Cool $n ) {*}
# this has the interior parts that you write
multi sub double ( Int $n ) { $n * 2 }
# this is what the compiler writes for you
multi sub double ( Cool $n ) {
# calls the other multi since it is now an Int
samewith Int($n);
}
So this accepts any of Int, Str, Rat, FatRat, Num, Array, Hash, etc. and tries to convert it into an Int before calling &infix:<*> with it, and 2.
say double ' 5 '; # 25
say double 2.5; # 4
say double [0,0,0]; # 6
say double { a => 0, b => 0 }; # 4
You might restrict it to a Cool instead of Any as all Cool values are essentially required to provide a coercion to Int.
( :( Int(Any) $ ) can be shortened to just :( Int() $ ) )
The reason you might do this is that you need it to be an Int inside the sub because you are calling other code that does different things with different types.
sub example ( Int(Cool) $n ) returns Int {
other-multi( $n ) * $n;
}
multi sub other-multi ( Int $ ) { 10 }
multi sub other-multi ( Any $ ) { 1 }
say example 5; # 50
say example 4.5; # 40
In this particular case you could have written it as one of these
sub example ( Cool $n ) returns Int {
other-multi( Int($n) ) * Int($n);
}
sub example ( Cool $n ) returns Int {
my $temp = Int($n);
other-multi( $temp ) * $temp;
}
sub example ( Cool $n is copy ) returns Int {
$n = Int($n);
other-multi( $n ) * $n;
}
None of them are as clear as the one that uses the signature to coerce it for you.
Normally for such a simple function you can use one of these and it will probably do what you want.
my &double = * * 2; # WhateverCode
my &double = * × 2; # ditto
my &double = { $_ * 2 }; # bare block
my &double = { $^n * 2 }; # block with positional placeholder
my &double = -> $n { $n * 2 }; # pointy block
my &double = sub ( $n ) { $n * 2 } # anon sub
my &double = anon sub double ( $n ) { $n * 2 } # anon sub with name
my &double = &infix:<*>.assuming(*,2); # curried
my &double = &infix:<*>.assuming(2);
sub double ( $n ) { $n * 2 } # same as :( Any $n )
Am I missing something? I'm not a Perl 6 expert, but it appears the syntax allows one to specify independently both what input types are permissible and how the input will be presented to the function.
Restricting the allowable input is useful because it means the code will result in an error, rather than a silent (useless) type conversion when the function is called with a nonsensical parameter.
I don't think an example where the two types are not in a hierarchical relationship makes sense.
Per comments on the original question, a better version of #musiKk's question "What is the point of coercions like Int(Cool)?" turned out to be:
Why might one prefer Int(Cool) over Int(Any)?
A corollary, which I'll also address in this answer, is:
Why might one prefer Int(Any) over Int(Cool)?
First, a list of various related options:
sub _Int_strong (Int $) {} # Argument must be Int
sub _Int_cool (Int(Cool) $) {} # Argument must be Cool; Int invoked
sub _Int_weak (Int(Any) $) {} # Argument must be Any; Int invoked
sub _Int_weak2 (Int() $) {} # same
sub _Any (Any $) {} # Argument must be Any
sub _Any2 ( $) {} # same
sub _Mu (Mu $) {} # Weakest typing - just memory safe (Mu)
_Int_strong val; # Fails to bind if val is not an Int
_Int_cool val; # Fails to bind if val is not Cool. Int invoked.
_Int_weak val; # Fails to bind if val is not Any. Int invoked.
_Any val; # Fails to bind if val is Mu
_Mu val; # Will always bind. If val is a native value, boxes it.
Why might one prefer Int(Cool) over Int(Any)?
Because Int(Cool) is slightly stronger typing. The argument must be of type Cool rather than the broader Any and:
Static analysis will reject binding code written to pass an argument that isn't Cool to a routine whose corresponding parameter has the type constraint Int(Cool). If static analysis shows there is no other routine candidate able to accept the call then the compiler will reject it at compile time. This is one of the meanings of "strong typing" explained in the last section of this answer.
If a value is Cool then it is guaranteed to have a well behaved .Int conversion method. So it will not yield a Method not found error at run-time and can be relied on to provide a good error message if it fails to produce a converted to integer value.
Why might one prefer Int(Any) over Int(Cool)?
Because Int(Any) is slightly weaker typing in that the argument can be of any regular type and P6 will just try and make it work:
.Int will be called on an argument that's passed to a routine whose corresponding parameter has the type constraint Int(...) no matter what the ... is. Provided the passed argument has an .Int method the call and subsequent conversion has a chance of succeeding.
If the .Int fails then the error message will be whatever the .Int method produces. If the argument is actually Cool then the .Int method will produce a good error message if it fails to convert to an Int. Otherwise the .Int method is presumably not a built in one and the result will be pot luck.
Why Foo(Bar) in the first place?
And what's all this about weak and strong typing?
An Int(...) constraint on a function parameter is going to result in either:
A failure to type check; or
An.Int conversion of the corresponding argument that forces it to its integer value -- or fails, leaving the corresponding parameter containing a Failure.
Using Wikipedia definitions as they were at the time of writing this answer (2019) this type checking and attempted conversion will be:
strong typing in the sense that a type constraint like Int(...) is "use of programming language types in order to both capture invariants of the code, and ensure its correctness, and definitely exclude certain classes of programming errors";
Currently weak typing in Rakudo in the sense that Rakudo does not check the ... in Int(...) at compile time even though in theory it could. That is, sub double (Int $x) {}; double Date; yields a compile time error (Calling double(Date) will never work) whereas sub double (Int(Cool) $x) {}; double Date; yields a run time error (Type check failed in binding).
type conversion;
weak typing in the sense that it's implicit type conversion in the sense that the compiler will handle the .Int coercion as part of carrying out the call;
explicit type conversion in the sense that the Int(...) constraint is explicitly directing the compiler to do the conversion as part of binding a call;
checked explicit type conversion -- P6 only does type safe conversions/coercions.
I believe the answer is as simple as you may not want to restrict the argument to Int even though you will be treating it as Int within the sub. say for some reason you want to be able to multiply an Array by a Hash, but fail if the args can't be treated as Int (i.e. is not Cool).
my #a = 1,2,3;
my %h = 'a' => 1, 'b' => 2;
say #a.Int; # 3 (List types coerced to the equivalent of .elems when treated as Int)
say %h.Int; # 2
sub m1(Int $x, Int $y) {return $x * $y}
say m1(3,2); # 6
say m1(#a,%h); # does not match
sub m2(Int(Cool) $x, Int(Cool) $y) {return $x * $y}
say m2('3',2); # 6
say m2(#a,%h); # 6
say m2('foo',2); # does not match
of course, you could also do this without the signature because the math operation will coerce the type automatically:
sub m3($x,$y) {return $x * $y}
say m3(#a,%h); # 6
however, this defers your type check to the inside of the sub, which kind of defeats the purpose of a signature and prevents you from making the sub a multi
All subtypes of Cool will be (as Cool requires them to) coerced to an Int. So if an operator or routine internal to your sub only works with Int arguments, you don't have to add an extra statement/expression converting to an Int nor does that operator/routine's code need to account for other subtypes of Cool. It enforces that the argument will be an Int inside of your sub wherever you use it.
Your example is backwards:
class Foo { method foomethod { say 'foomethod' } }
class Bar {}
class Quux is Bar {
method Foo { Foo.new }
}
sub foo(Foo(Bar) $c) {
#= converts $c of type Bar to type Foo
#= returns result of foomethod
say $c.WHAT; #-> (Foo)
$c.foomethod #-> foomethod
}
foo(Quux.new)

antlr rule boolean parameter showing up in syntactic predicate code one level higher, causing compilation errors

I have a grammar that can parse expressions like 1+2-4 or 1+x-y, creating an appropriate structure on the fly which later, given a Map<String, Integer> with appropriate content, can be evaluated numerically (after parsing is complete, i.e. for x or y only known later).
Inside the grammar, there are also places where an expression that can be evaluated on the spot, i.e. does not contain variables, should occur. I figured I could parse these with the same logic, adding a boolean parameter variablesAllowed to the rule, like so:
grammar MiniExprParser;
INT : ('0'..'9')+;
ID : ('a'..'z'| 'A'..'Z')('a'..'z'| 'A'..'Z'| '0'..'9')*;
PLUS : '+';
MINUS : '-';
numexpr returns [Double val]:
expr[false] {$val = /* some evaluation code */ 0.;};
varexpr /*...*/:
expr[true] {/*...*/};
expr[boolean varsAllowed] /*...*/:
e=atomNode[varsAllowed] {/*...*/}
(PLUS e2=atomNode[varsAllowed] {/*...*/}
|MINUS e2=atomNode[varsAllowed] {/*...*/}
)* ;
atomNode[boolean varsAllowed] /*...*/:
(n=INT {/*...*/})
|{varsAllowed}?=> ID {/*...*/}
;
result:
(numexpr) => numexpr {System.out.println("Numeric result: " + $numexpr.val);}
|varexpr {System.out.println("Variable expression: " + $varexpr.text);};
However, the generated Java code does not compile. In the part apparently responsible for the final rule's syntactic predicate, varsAllowed occurs even although the variable is never defined at this level.
/* ... */
else if ( (LA3_0==ID) && ((varsAllowed))) {
int LA3_2 = input.LA(2);
if ( ((synpred1_MiniExprParser()&&(varsAllowed))) ) {
alt3=1;
}
else if ( ((varsAllowed)) ) {
alt3=2;
}
/* ... */
Am I using it wrong? (I am using Eclipse' AntlrIDE 2.1.2 with Antlr 3.5.2.)
This problem is part of the hoisting process the parser uses for prediction. I encountered the same problem and ended up with a member var (or static var for the C target) instead of a parameter.

Shorthand for all bools YES or all bools NO?

Often in my code I need to check whether the state of x amount of bools are all true OR all bools are false. So I do:
BOOL first, second, third;
if((first && second && third) || (!first && !second && !third))
//do something
Being a lazy programmer, I want to know if there is some mathematical shorthand for this kind of query, instead of having to type out this whole thing every time?
The shorthand for all bools the same is testing for (pairwise) equality:
(first==second && second==third)
Of course you can expand this to any number of booleans, having N-1 equality checks joined with the and operator.
If this is something you frequently require then you're better off using an integer and reading bits individually.
For instance, instead of:
BOOL x; // not this
BOOL y; // not this
BOOL z; // not this
...and instead of bit fields (because their layout is implementation-defined):
unsigned int x : 1; // not this
unsigned int y : 1; // not this
unsigned int z : 1; // not this
...use a single field such as:
unsigned int flags; // do this
...and assign every value to a bit; for example:
enum { // do this
FLAG_X = (1 << 0),
FLAG_Y = (1 << 1),
FLAG_Z = (1 << 2),
ALL_FLAGS = 0x07 // "all bits are on"
};
Then, to test "all false" you simply say "if (!flags)" and to test "all true" you simply say "if (flags == ALL_FLAGS)" where ALL_FLAGS is a number that sets all valid bits to 1. Other bitwise operators can be used to set or test individual bits as needed.
Note that this technique has an upper limit of 32 Boolean values before you have to do more (e.g. create an additional integer field to store more bits).
Check if the sum is 0 or equal to the number of bools:
((first + second + third) % 3 == 0)
This works for any number of arguments.
(But don't take this answer serious and do it for real.)
When speaking about predicates, you can usually simplify the logic by using two variables for the quantification operations - universal quantification (for all) and existential quantification (there exists).
BOOL allValues = (value1 && value2 && value3);
BOOL anyValue = (value1 || value2 || value3);
if (allValues || !anyValue) {
... do something
}
This would also work if you have a lot of boolean values in an array - you could create a for cycle evaluating the two variables.