OCaml newbie here.
I am trying to figure out how to deal with integer overflow in OCaml when converting from float to int.
I was hoping to use try ... with ... or compare it to nan (since at actually has to return nan when float is too large?), but looks like it does not throw any errors.
And even more surprisingly for very large floats int_of_float simply returns 0.
utop # 0 = int_of_float 9999999999999999999.0;;
- : bool = true
utop # int_of_float 9999999999999999999.0;;
- : int = 0
How do I handle float to int conversion properly? (and more generally int overflow?)
Indeed, OCaml's manual indicates that float_of_int's "result is unspecified if the argument is nan or falls outside the range of representable integers."
A possibility is to check beforehand whether your float will fit or not, and return an option or raise an exception, as in e.g.
let safe_int_of_float f =
if classify_float f = FP_nan then None
else if f >= float_of_int max_int then None
else if f <= float_of_int min_int then None
else Some (int_of_float f)
The following module will do the job.
(*
This module defines the range of integers that can be represented
interchangeably with the float or with int type.
Note that this range depends on size of the int type, which depends
on the compilation target.
*)
(*
OCaml ints have 31 bits or 63 bits.
OCaml floats are double-precision floats (IEEE-754 binary64).
*)
let int_range_min_float = max (-.2.**53.) (float min_int)
let int_range_max_float = min (2.**53.) (float max_int)
let exact_int_of_float x =
if x >= int_range_min_float && x <= int_range_max_float then
truncate x
else
invalid_arg "exact_int_of_float: out of range"
let int_range_min_int = exact_int_of_float int_range_min_float
let int_range_max_int = exact_int_of_float int_range_max_float
let exact_float_of_int x =
if x >= int_range_min_int && x <= int_range_max_int then
float x
else
invalid_arg "exact_float_of_int: out of range"
let test () =
let imin, imax = int_range_min_int, int_range_max_int in
let fmin, fmax = int_range_min_float, int_range_max_float in
assert (exact_int_of_float 1. = 1);
assert (exact_int_of_float (-1.) = -1);
assert (fmin < 0.);
assert (fmax > 0.);
assert (imin < 0);
assert (imax > 0);
assert (float (truncate fmin) = fmin);
assert (float (truncate fmax) = fmax);
assert (truncate (float imin) = imin);
assert (truncate (float imax) = imax);
assert (float (imin + 1) = float imin +. 1.);
assert (float (imin + 2) = float imin +. 2.);
assert (float (imax - 1) = float imax -. 1.);
assert (float (imax - 2) = float imax -. 2.)
Some algorithms (allocate a binary tree...) need to compute a base 2 exponential. How to compute it for this native type?
newtype {:nativeType "uint"} u32 =
x: nat | 0 <= x < 2147483648
This is an obvious try:
function pow2(n: u32): (r: u32)
requires n < 10
{
if n == 0 then 1 else 2 * pow2(n - 1)
}
It fails because Dafny doubts that the product stays below u32's max value. How to prove that it's value is below 2**10?
In this case, it is more convenient to first define the unbounded version of the function, and then prove a lemma showing that when n < 10 (or n < 32, even) it is in bounds.
function pow2(n: nat): int
{
if n == 0 then 1 else 2 * pow2(n - 1)
}
lemma pow2Bounds(n: nat)
requires n < 32
ensures 0 <= pow2(n) < 0x100000000
{ /* omitted here; two proofs given below */ }
function pow2u32(n: u32): u32
requires n < 32
{
pow2Bounds(n as nat);
pow2(n as nat) as u32
}
Intuitively, we might expect the lemma to go through automatically, because there are only a small number of cases to consider: n = 0, n = 1, ... n = 31. But Dafny will not perform such case analysis automatically. Instead, we have a couple of options.
First proof
First, we can prove a more general property, which, by the magic of inductive reasoning, is easier to prove, despite being stronger than what we need.
lemma pow2Monotone(a: nat, b: nat)
requires a < b
ensures pow2(a) < pow2(b)
{} // Dafny is able to prove this automatically by induction.
The lemma then follows.
lemma pow2Bounds(n: nat)
requires n < 32
ensures 0 <= pow2(n) < 0x100000000
{
pow2Monotone(n, 32);
}
Second proof
Another way to prove it is to tell Dafny it should unroll pow2 up to 32 times, using a :fuel attribute. These 32 unrollings are essentially the same as asking Dafny to do case analysis on each possible value. Dafny can then complete the proof without additional help.
lemma {:fuel pow2,31,32} pow2Bounds(n: nat)
requires n < 32
ensures 0 <= pow2(n) < 0x100000000
{}
The :fuel attribute is (lightly) documented in the Dafny Reference Manual in Section 24.
A bit of a cheat, but with so narrow a domain, this works very well.
const pow2: seq<u32> :=
[0x1, 0x2, 0x4, 0x8, 0x10, 0x20];
lemma pow2_exponential(n: u32)
ensures n == 0 ==> pow2[n] == 1
ensures 0 < n < 6 ==> pow2[n] == 2 * pow2[n - 1]
{}
I'm using ghci and I'm having a problem with a function for getting the factors of a number.
The code I would like to work is:
let factors n = [x | x <- [1..truncate (n/2)], mod n x == 0]
It doesn't complain when I then hit enter, but as soon as I try to use it (with 66 in this case) I get this error message:
Ambiguous type variable 't0' in the constraints:
(Integral t0)
arising from a use of 'factors' at <interactive>:30:1-10
(Num t0) arising from the literal '66' at <interactive>:30:12-13
(RealFrac t0)
arising from a use of 'factors' at <interactive:30:1-10
Probable fix: add a type signature that fixes these type variable(s)
In the expression: factors 66
In the equation for 'it': it = factors 66
The following code works perfectly:
let factorsOfSixtySix = [x | x <- [1..truncate (66/2)], mod 66 x == 0]
I'm new to haskell, and after looking up types and typeclasses, I'm still not sure what I'm meant to do.
Use div for integer division instead:
let factors n = [x | x <- [1.. n `div` 2], mod n x == 0]
The problem in your code is that / requires a RealFrac type for n while mod an Integral one. This is fine during definition, but then you can not choose a type which fits both constraints.
Another option could be to truncate n before using mod, but is more cumbersome. After all, you do not wish to call factors 6.5, do you? ;-)
let factors n = [x | x <- [1..truncate (n/2)], mod (truncate n) x == 0]
If you put a type annotation on this top-level bind (idiomatic Haskell), you get different, possibly more useful error messages.
GHCi> let factors n = [x | x <- [1..truncate (n/2)], mod n x == 0]
GHCi> :t factors
factors :: (Integral t, RealFrac t) => t -> [t]
GHCi> let { factors :: Double -> [Double]; factors n = [x | x <- [1..truncate (n/2)], mod n x == 0]; }
<interactive>:30:64:
No instance for (Integral Double) arising from a use of `truncate'
Possible fix: add an instance declaration for (Integral Double)
In the expression: truncate (n / 2)
In the expression: [1 .. truncate (n / 2)]
In a stmt of a list comprehension: x <- [1 .. truncate (n / 2)]
GHCi> let { factors :: Integer -> [Integer]; factors n = [x | x <- [1..truncate (n/2)], mod n x == 0]; }
<interactive>:31:66:
No instance for (RealFrac Integer) arising from a use of `truncate'
Possible fix: add an instance declaration for (RealFrac Integer)
In the expression: truncate (n / 2)
In the expression: [1 .. truncate (n / 2)]
In a stmt of a list comprehension: x <- [1 .. truncate (n / 2)]
<interactive>:31:77:
No instance for (Fractional Integer) arising from a use of `/'
Possible fix: add an instance declaration for (Fractional Integer)
In the first argument of `truncate', namely `(n / 2)'
In the expression: truncate (n / 2)
In the expression: [1 .. truncate (n / 2)]
I am new to Haskell so please forgive my courage to come up with an answer here but recently i have done this as follows;
factors :: Int -> [Int]
factors n = f' ++ [n `div` x | x <- tail f', x /= exc]
where lim = truncate (sqrt (fromIntegral n))
exc = ceiling (sqrt (fromIntegral n))
f' = [x | x <- [1..lim], n `mod` x == 0]
I believe it's more efficient. You will notice if you do like;
sum (factors 33550336)
A friend and I are going back and forth with brain-teasers and I have no idea how to solve this one. My assumption is that it's possible with some bitwise operators, but not sure.
In C, with bitwise operators:
#include<stdio.h>
int add(int x, int y) {
int a, b;
do {
a = x & y;
b = x ^ y;
x = a << 1;
y = b;
} while (a);
return b;
}
int main( void ){
printf( "2 + 3 = %d", add(2,3));
return 0;
}
XOR (x ^ y) is addition without carry. (x & y) is the carry-out from each bit. (x & y) << 1 is the carry-in to each bit.
The loop keeps adding the carries until the carry is zero for all bits.
int add(int a, int b) {
const char *c=0;
return &(&c[a])[b];
}
No + right?
int add(int a, int b)
{
return -(-a) - (-b);
}
CMS's add() function is beautiful. It should not be sullied by unary negation (a non-bitwise operation, tantamount to using addition: -y==(~y)+1). So here's a subtraction function using the same bitwise-only design:
int sub(int x, int y) {
unsigned a, b;
do {
a = ~x & y;
b = x ^ y;
x = b;
y = a << 1;
} while (a);
return b;
}
Define "best". Here's a python version:
len(range(x)+range(y))
The + performs list concatenation, not addition.
Java solution with bitwise operators:
// Recursive solution
public static int addR(int x, int y) {
if (y == 0) return x;
int sum = x ^ y; //SUM of two integer is X XOR Y
int carry = (x & y) << 1; //CARRY of two integer is X AND Y
return addR(sum, carry);
}
//Iterative solution
public static int addI(int x, int y) {
while (y != 0) {
int carry = (x & y); //CARRY is AND of two bits
x = x ^ y; //SUM of two bits is X XOR Y
y = carry << 1; //shifts carry to 1 bit to calculate sum
}
return x;
}
Cheat. You could negate the number and subtract it from the first :)
Failing that, look up how a binary adder works. :)
EDIT: Ah, saw your comment after I posted.
Details of binary addition are here.
Note, this would be for an adder known as a ripple-carry adder, which works, but does not perform optimally. Most binary adders built into hardware are a form of fast adder such as a carry-look-ahead adder.
My ripple-carry adder works for both unsigned and 2's complement integers if you set carry_in to 0, and 1's complement integers if carry_in is set to 1. I also added flags to show underflow or overflow on the addition.
#define BIT_LEN 32
#define ADD_OK 0
#define ADD_UNDERFLOW 1
#define ADD_OVERFLOW 2
int ripple_add(int a, int b, char carry_in, char* flags) {
int result = 0;
int current_bit_position = 0;
char a_bit = 0, b_bit = 0, result_bit = 0;
while ((a || b) && current_bit_position < BIT_LEN) {
a_bit = a & 1;
b_bit = b & 1;
result_bit = (a_bit ^ b_bit ^ carry_in);
result |= result_bit << current_bit_position++;
carry_in = (a_bit & b_bit) | (a_bit & carry_in) | (b_bit & carry_in);
a >>= 1;
b >>= 1;
}
if (current_bit_position < BIT_LEN) {
*flags = ADD_OK;
}
else if (a_bit & b_bit & ~result_bit) {
*flags = ADD_UNDERFLOW;
}
else if (~a_bit & ~b_bit & result_bit) {
*flags = ADD_OVERFLOW;
}
else {
*flags = ADD_OK;
}
return result;
}
Go based solution
func add(a int, b int) int {
for {
carry := (a & b) << 1
a = a ^ b
b = carry
if b == 0 {
break
}
}
return a
}
same solution can be implemented in Python as follows, but there is some problem about number represent in Python, Python has more than 32 bits for integers. so we will use a mask to obtain the last 32 bits.
Eg: if we don't use mask we won't get the result for numbers (-1,1)
def add(a,b):
mask = 0xffffffff
while b & mask:
carry = a & b
a = a ^ b
b = carry << 1
return (a & mask)
Why not just incremet the first number as often, as the second number?
The reason ADD is implememted in assembler as a single instruction, rather than as some combination of bitwise operations, is that it is hard to do. You have to worry about the carries from a given low order bit to the next higher order bit. This is stuff that the machines do in hardware fast, but that even with C, you can't do in software fast.
Here's a portable one-line ternary and recursive solution.
int add(int x, int y) {
return y == 0 ? x : add(x ^ y, (x & y) << 1);
}
I saw this as problem 18.1 in the coding interview.
My python solution:
def foo(a, b):
"""iterate through a and b, count iteration via a list, check len"""
x = []
for i in range(a):
x.append(a)
for i in range(b):
x.append(b)
print len(x)
This method uses iteration, so the time complexity isn't optimal.
I believe the best way is to work at a lower level with bitwise operations.
In python using bitwise operators:
def sum_no_arithmetic_operators(x,y):
while True:
carry = x & y
x = x ^ y
y = carry << 1
if y == 0:
break
return x
Adding two integers is not that difficult; there are many examples of binary addition online.
A more challenging problem is floating point numbers! There's an example at http://pages.cs.wisc.edu/~smoler/x86text/lect.notes/arith.flpt.html
Was working on this problem myself in C# and couldn't get all test cases to pass. I then ran across this.
Here is an implementation in C# 6:
public int Sum(int a, int b) => b != 0 ? Sum(a ^ b, (a & b) << 1) : a;
Implemented in same way as we might do binary addition on paper.
int add(int x, int y)
{
int t1_set, t2_set;
int carry = 0;
int result = 0;
int mask = 0x1;
while (mask != 0) {
t1_set = x & mask;
t2_set = y & mask;
if (carry) {
if (!t1_set && !t2_set) {
carry = 0;
result |= mask;
} else if (t1_set && t2_set) {
result |= mask;
}
} else {
if ((t1_set && !t2_set) || (!t1_set && t2_set)) {
result |= mask;
} else if (t1_set && t2_set) {
carry = 1;
}
}
mask <<= 1;
}
return (result);
}
Improved for speed would be below::
int add_better (int x, int y)
{
int b1_set, b2_set;
int mask = 0x1;
int result = 0;
int carry = 0;
while (mask != 0) {
b1_set = x & mask ? 1 : 0;
b2_set = y & mask ? 1 : 0;
if ( (b1_set ^ b2_set) ^ carry)
result |= mask;
carry = (b1_set & b2_set) | (b1_set & carry) | (b2_set & carry);
mask <<= 1;
}
return (result);
}
It is my implementation on Python. It works well, when we know the number of bytes(or bits).
def summ(a, b):
#for 4 bytes(or 4*8 bits)
max_num = 0xFFFFFFFF
while a != 0:
a, b = ((a & b) << 1), (a ^ b)
if a > max_num:
b = (b&max_num)
break
return b
You can do it using bit-shifting and the AND operation.
#include <stdio.h>
int main()
{
unsigned int x = 3, y = 1, sum, carry;
sum = x ^ y; // Ex - OR x and y
carry = x & y; // AND x and y
while (carry != 0) {
carry = carry << 1; // left shift the carry
x = sum; // initialize x as sum
y = carry; // initialize y as carry
sum = x ^ y; // sum is calculated
carry = x & y; /* carry is calculated, the loop condition is
evaluated and the process is repeated until
carry is equal to 0.
*/
}
printf("%d\n", sum); // the program will print 4
return 0;
}
The most voted answer will not work if the inputs are of opposite sign. The following however will. I have cheated at one place, but only to keep the code a bit clean. Any suggestions for improvement welcome
def add(x, y):
if (x >= 0 and y >= 0) or (x < 0 and y < 0):
return _add(x, y)
else:
return __add(x, y)
def _add(x, y):
if y == 0:
return x
else:
return _add((x ^ y), ((x & y) << 1))
def __add(x, y):
if x < 0 < y:
x = _add(~x, 1)
if x > y:
diff = -sub(x, y)
else:
diff = sub(y, x)
return diff
elif y < 0 < x:
y = _add(~y, 1)
if y > x:
diff = -sub(y, x)
else:
diff = sub(y, x)
return diff
else:
raise ValueError("Invalid Input")
def sub(x, y):
if y > x:
raise ValueError('y must be less than x')
while y > 0:
b = ~x & y
x ^= y
y = b << 1
return x
Here is the solution in C++, you can find it on my github here: https://github.com/CrispenGari/Add-Without-Integers-without-operators/blob/master/main.cpp
int add(int a, int b){
while(b!=0){
int sum = a^b; // add without carrying
int carry = (a&b)<<1; // carrying without adding
a= sum;
b= carry;
}
return a;
}
// the function can be writen as follows :
int add(int a, int b){
if(b==0){
return a; // any number plus 0 = that number simple!
}
int sum = a ^ b;// adding without carrying;
int carry = (a & b)<<1; // carry, without adding
return add(sum, carry);
}
This can be done using Half Adder.
Half Adder is method to find sum of numbers with single bit.
A B SUM CARRY A & B A ^ B
0 0 0 0 0 0
0 1 1 0 0 1
1 0 1 0 0 1
1 1 0 1 0 0
We can observe here that SUM = A ^ B and CARRY = A & B
We know CARRY is always added at 1 left position from where it was
generated.
so now add ( CARRY << 1 ) in SUM, and repeat this process until we get
Carry 0.
int Addition( int a, int b)
{
if(B==0)
return A;
Addition( A ^ B, (A & B) <<1 )
}
let's add 7 (0111) and 3 (0011) answer will be 10 (1010)
A = 0100 and B = 0110
A = 0010 and B = 1000
A = 1010 and B = 0000
final answer is A.
I implemented this in Swift, I am sure someone will benefit from
var a = 3
var b = 5
var sum = 0
var carry = 0
while (b != 0) {
sum = a ^ b
carry = a & b
a = sum
b = carry << 1
}
print (sum)
You can do it iteratively or recursively. Recursive:-
public int getSum(int a, int b) {
return (b==0) ? a : getSum(a^b, (a&b)<<1);
}
Iterative:-
public int getSum(int a, int b) {
int c=0;
while(b!=0) {
c=a&b;
a=a^b;
b=c<<1;
}
return a;
}
time complexity - O(log b)
space complexity - O(1)
for further clarifications if not clear, refer leetcode or geekForGeeks explanations.
I'll interpret this question as forbidding the +,-,* operators but not ++ or -- since the question specified operator and not character (and also because that's more interesting).
A reasonable solution using the increment operator is as follows:
int add(int a, int b) {
if (b == 0)
return a;
if (b > 0)
return add(++a, --b);
else
return add(--a, ++b);
}
This function recursively nudges b towards 0, while giving a the same amount to keep the sum the same.
As an additional challenge, let's get rid of the second if block to avoid a conditional jump. This time we'll need to use some bitwise operators:
int add(int a, int b) {
if(!b)
return a;
int gt = (b > 0);
int m = -1 << (gt << 4) << (gt << 4);
return (++a & --b & 0)
| add( (~m & a--) | (m & --a),
(~m & b++) | (m & ++b)
);
}
The function trace is identical; a and b are nudged between each add call just like before.
However, some bitwise magic is employed to drop the if statement while continuing to not use +,-,*:
A mask m is set to 0xFFFFFFFF (-1 in signed decimal) if b is positive, or 0x00000000 if b is negative.
The reason for shifting the mask left by 16 twice instead a single shift left by 32 is because shifting by >= the size of the value is undefined behavior.
The final return takes a bit of thought to fully appreciate:
Consider this technique to avoid a branch when deciding between two values. Of the values, one is multiplied by the boolean while the other is multiplied by the inverse, and the results are summed like so:
double naiveFoodPrice(int ownPetBool) {
if(ownPetBool)
return 23.75;
else
return 10.50;
}
double conditionlessFoodPrice(int ownPetBool) {
double result = ownPetBool*23.75 + (!ownPetBool)*10.50;
}
This technique works great in most cases. For us, the addition operator can easily be substituted for the bitwise or | operator without changing the behavior.
The multiplication operator is also not allowed for this problem. This is the reason for our earlier mask value - a bitwise and & with the mask will achieve the same effect as multiplying by the original boolean.
The nature of the unary increment and decrement operators halts our progress.
Normally, we would easily be able to choose between an a which was incremented by 1 and an a which was decremented by 1.
However, because the increment and decrement operators modify their operand, our conditionless code will end up always performing both operations - meaning that the values of a and b will be tainted before we finish using them.
One way around this is to simply create new variables which each contain the original values of a and b, allowing a clean slate for each operation. I consider this boring, so instead we will adjust a and b in a way that does not affect the rest of the code (++a & --b & 0) in order to make full use of the differences between x++ and ++x.
We can now get both possible values for a and b, as the unary operators modifying the operands' values now works in our favor. Our techniques from earlier help us choose the correct versions of each, and we now have a working add function. :)
Python codes:
(1)
add = lambda a,b : -(-a)-(-b)
use lambda function with '-' operator
(2)
add= lambda a,b : len(list(map(lambda x:x,(i for i in range(-a,b)))))