I implemented the powf(float x, float y) math function. This function is a binary floating-point operation. I need to test it for correctness, but the test can't iterate over all floating-point inputs. What should I do?
Consider 2 questions:
How do I test binary floating point math functions?
Break FP values into groups:
Largest set: normal values, including ± values near 1.0 and near the extremes, as well as randomly selected ones.
Subnormals
Zeros: +0.0, -0.0
NaNs
Use combinations built from 100,000s of sample values from the first set (including ±min, ±1.0, ±max), 1,000s from the second set (including ±min, ±max), and -0.0, +0.0, -NaN, +NaN.
Add additional tests for the function's particular edge cases.
How do I test powf()?
How: Test powf() against pow() for result correctness.
Values to test against: powf() has many concerns.
* pow(x, y) functions are notoriously difficult to code well. Small errors in intermediate sub-calculations propagate into large final errors.
* pow() is expected to produce integral results for integral arguments. E.g. pow(2, y) is expected to be exact for all in-range results, and pow(10, y) is expected to be within 0.5 units in the last place for all in-range y.
* pow() is also expected to produce exact integer results for negative x with integer y.
There is little need to test every x, y combination. Consider that every combination with x < 0 and non-integer y yields NaN.
z = powf(x, y) readily underflows to 0.0. Testing of x, y values whose result is near z == 0 needs some attention.
z = powf(x, y) readily overflows to ∞. Testing of x, y values whose result is near z == FLT_MAX needs more attention, as a slight error yields FLT_MAX vs. INF. Since overflow is so rampant with powf(x, y), this reduces the number of combinations needed: it is the edge that is important, and larger values need only light testing.
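A minimal sketch of this sampling-and-comparison strategy, written in Python/NumPy for brevity. It assumes your powf can be called from Python (e.g. via ctypes); here np.power on float32 merely stands in for it, and float64 pow serves as the higher-precision reference.

import numpy as np

rng = np.random.default_rng(0)

def sample_float32(n):
    # Normals spread across the exponent range, plus fixed edge values.
    normals = np.float32(rng.uniform(1, 2, n) * 2.0 ** rng.integers(-126, 127, n))
    edges = np.float32([1.0, -1.0, np.finfo(np.float32).tiny, -np.finfo(np.float32).tiny,
                        np.finfo(np.float32).max, -np.finfo(np.float32).max])
    # Subnormals, signed zeros and NaNs.
    subnormals = np.float32(rng.integers(1, 2 ** 23, n // 100) * 2.0 ** -149)
    specials = np.float32([0.0, -0.0, np.nan, -np.nan])
    return np.concatenate([normals, edges, subnormals, specials])

xs = sample_float32(100_000)
ys = sample_float32(1_000)
with np.errstate(all="ignore"):                        # overflow/invalid are expected here
    for y in ys:
        got = np.power(xs, y)                          # <-- replace with your powf
        ref = np.power(np.float64(xs), np.float64(y))  # double-precision reference
        ulp = np.abs(np.float64(got) - ref) / np.spacing(np.abs(got))
        finite = np.isfinite(ref) & np.isfinite(got)
        # Flag finite results whose error exceeds your ULP budget; separately check
        # that overflow, underflow, zero and NaN cases agree with the reference.
        assert not np.any(finite & (ulp > 4.0))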
Related
This is a question about floating-point analysis and numerical stability. Say I have two [d x 1] vectors a and x and a scalar b such that a.T @ x < b (where @ denotes a dot product).
I additionally have a unit [d x 1] vector d. I want to derive the maximum scalar s so that a.T @ (x + s * d) < b. Without floating-point errors this is trivial:
s = (b - a.T @ x) / (a.T @ d).
But with floating-point errors, this s is not guaranteed to satisfy a.T @ (x + s * d) < b.
Currently my solution is to use a stabilized division, which helps:
s = sign(a.T @ x) * sign(a.T @ d) * exp(log(abs(a.T @ x) + eps) - log(abs(a.T @ d) + eps)).
But this s still does not always satisfy the inequality. I can check how much this fails by:
diff = a.T @ (x + s * d) - b
And then "push" that diff back through: (x + s * d - a.T @ (diff + eps2)). Even with both the stable division and pushing the diff back, sometimes the solution fails to satisfy the inequality. So these attempts at a solution are both hacky and they do not actually work. I think there is probably some way to do this that would work and be guaranteed to minimally satisfy the inequality under floating-point imprecision, but I'm not sure what it is. The solution needs to be very efficient because this operation will be run trillions of times.
Edit: Here is an example in numpy of this issue coming into play, because a commenter had some trouble replicating this problem.
np.random.seed(1)
p, n = 10, 1
k = 3
x = np.random.normal(size=(p, n))
d = np.random.normal(size=(p, n))
d /= np.sum(d, axis=0)
a, b = np.hstack([np.zeros(p - k), np.ones(k)]), 1
s = (b - a.T @ x) / (a.T @ d)
Running this code gives a case where a.T @ (s * d + x) > b, failing to satisfy the constraint. Instead we have:
>>> diff = a.T @ (x + s * d) - b
>>> diff
array([8.8817842e-16])
The question is how to avoid this violation of the constraint.
The problem you are dealing with appears to be mainly a rounding issue, not really a numerical-stability issue. When a floating-point operation is performed, the result has to be rounded to fit in the standard floating-point representation. The IEEE-754 standard specifies multiple rounding modes; the default is typically round-to-nearest.
This means (b - a.T @ x) / (a.T @ d) and a.T @ (x + s * d) can each be rounded to the previous or next floating-point value. As a result, a slight imprecision is introduced in the computation. This imprecision is typically 1 unit in the last place (ULP); 1 ULP corresponds roughly to a relative error of 1.1e-16 for double-precision numbers.
In practice, every operation can introduce rounding, not just the expression as a whole, so the error is typically a few ULP. For operations like addition, rounding tends to mitigate the error, while for others, like subtraction of nearly equal values, the error can increase dramatically. In your case, the error appears to be due only to the accumulation of small errors from each operation.
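As a quick illustration of these magnitudes in NumPy:

import numpy as np

eps = np.finfo(np.float64).eps   # 2.220446049250313e-16: spacing between 1.0 and the next double
print(eps / 2)                   # ~1.1e-16: worst-case relative error of a single round-to-nearest
print(np.spacing(1e6))           # 1 ULP at 1e6: the absolute spacing grows with the magnitude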
The floating-point rounding mode of processors can be controlled from low-level languages. NumPy also provides a way to find the next/previous floating-point value. Based on this, you can round parts of the expression up or down so that s ends up smaller than the theoretical target value. That being said, this is not so easy, since some of the computed values may be negative, which flips the direction in which they need to be rounded. One can round positive and negative values differently, but the resulting code will certainly not be efficient in the end.
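For example, using the variables from the question's example, a cheap conservative tweak is to step s down a few ULP with np.nextafter. This is only a sketch: it assumes a.T @ d > 0 (so that decreasing s can only help the inequality), and it is still not a hard guarantee.

import numpy as np

s = (b - a.T @ x) / (a.T @ d)
# Step s down by a few ULP as a safety margin; flip the direction if a.T @ d < 0.
for _ in range(4):
    s = np.nextafter(s, -np.inf)
# Re-check a.T @ (x + s * d) < b afterwards and widen the margin if it still fails.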
An alternative solution is to compute a theoretical error bound and subtract it from s. That being said, this error depends on the computed values and on the actual summation algorithm used (e.g. naive sum, pairwise, Kahan, etc.). For example, the naive and pairwise algorithms (the latter used by NumPy) are sensitive to the spread of the input values: the wider the spread, the bigger the resulting error. This solution only works if you know the distribution of the input values and/or their bounds exactly. Another issue is that such bounds tend to over-estimate the actual error, giving only a rough estimate rather than a tight value.
Another alternative is to rewrite the expression by replacing s with s + h or s * h and solving for h based on the already-computed s and the other parameters. This method is a bit like a predictor-corrector. Note that h itself may not be precise, also due to floating-point errors.
With the absolute correction method we get:
h_abs = (b - a @ (x + s * d)) / (a @ d)
s += h_abs
With the relative correction method we get:
h_rel = (b - a @ x) / (a @ (s * d))
s *= h_rel
Here are the resulting differences for the initial method and the two corrected methods:
Initial method: 8.8817842e-16 (8 ULP)
Absolute method: -8.8817842e-16 (8 ULP)
Relative method: -8.8817842e-16 (8 ULP)
I am not sure either of the two methods is guaranteed to fulfil the requirement, but a more robust approach could be to select the smaller of the two s values. At least the results are quite encouraging, since the requirement is fulfilled by both methods for the provided inputs.
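Taking the smaller of the two corrected values might look like this (a sketch, continuing from the question's example with a, x, d and b as defined there; it is still not a hard guarantee):

import numpy as np

s0 = (b - a @ x) / (a @ d)                      # initial estimate
s_abs = s0 + (b - a @ (x + s0 * d)) / (a @ d)   # absolute correction
s_rel = s0 * (b - a @ x) / (a @ (s0 * d))       # relative correction
s = np.minimum(s_abs, s_rel)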
A good way to generate more precise reference results is to use the decimal module, which provides arbitrary precision at the expense of much slower execution. This is particularly useful for comparing practical results with more precise ones.
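For instance, a hypothetical scalar example showing how decimal exposes the rounding error hidden in an ordinary float computation:

from decimal import Decimal, getcontext

getcontext().prec = 50                 # work with 50 significant digits

num = 0.1 + 0.2                        # already rounded once by float arithmetic
den = 0.3
print(num / den)                       # 1.0000000000000002 in float64
print(Decimal(num) / Decimal(den))     # exact quotient of the two stored doubles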
Finally, a last solution is to increase/decrease s one ULP at a time until the best result is found. Depending on the summation algorithm used and on the inputs, the results can change, and the exact expression used to compute the difference also matters. Moreover, the result is certainly not monotonic because of the way floating-point arithmetic behaves, which means s may need to be moved by many ULP before the optimization succeeds. This solution is not very efficient (at least, unless big steps are used).
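A minimal sketch of that last idea, again assuming a @ d > 0 so that decreasing s always moves the left-hand side in the right direction:

import numpy as np

def nudge_down(a, x, d, b, s, max_steps=64):
    # Decrease s one ULP at a time until a @ (x + s * d) < b, or give up.
    for _ in range(max_steps):
        if np.all(a @ (x + s * d) < b):
            return s
        s = np.nextafter(s, -np.inf)
    return s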
Goal
I want to apply "relative" rounding to the elements of a numpy array. Relative rounding here means rounding to a given number of significant figures, where I do not care whether these are decimal or binary figures.
Suppose we are given two arrays a and b so that some elements are close to each other. That is,
np.isclose(a, b, tolerance)
has some True entries for a given relative tolerance. Suppose that we know that all entries that are not equal within the tolerance differ by a relative difference of at least 100*tolerance. I want to obtain some arrays a2 and b2 so that
np.all(np.isclose(a, b, tolerance) == (a2 == b2))
My idea is to round the arrays to an appropriate significant digit:
a2 = relative_rounding(a, precision)
b2 = relative_rounding(b, precision)
However, whether the numbers are rounded or floored does not matter, as long as the goal is achieved.
An example:
a = np.array([1.234567891234, 2234.56789123, 32.3456789123])
b = np.array([1.234567895678, 2234.56789456, 42.3456789456])
# desired output
a2 = np.array([1.2345679, 2234.5679, 32.345679])
b2 = np.array([1.2345679, 2234.5679, 42.345679])
Motivation
The purpose of this exercise is to allow me to work with clearly defined results of binary operations so that little errors do not matter. For example, I want that the result of np.unique is not affected by imprecisions of floating point operations.
You may suppose that the error introduced by the floating point operations is known/can be bounded.
Question
I am aware of similar questions concerning rounding to a given number of significant figures with numpy and of the respective solutions. Though those answers may be sufficient for my purposes, I think there should be a simpler and more efficient solution to this problem: since floating-point numbers have "relative precision" built in, it should be possible to just set the n least significant bits of the mantissa to 0. This should be even more efficient than the usual rounding procedure. However, I do not know how to implement it with numpy. It is essential that the solution is vectorized and more efficient than the naive way. Is there a direct way of manipulating the binary representation of an array in numpy?
This is impossible, except for special cases such as a precision of zero (isclose becomes equivalent to ==) or infinity (all numbers are close to each other).
numpy.isclose is not transitive. We may have np.isclose(x, y, precision) and np.isclose(y, z, precision) but not np.isclose(x, z, precision). (For example, 10 and 11 are within 10% of each other, and 11 and 12 are within 10% of each other, but 10 and 12 are not within 10% of each other.)
Given the above isclose relations for x, y, and z, the requested property would require that x2 == y2 and y2 == z2 be true but that x2 == z2 be false. However, == is transitive, so x2 == y2 and y2 == z2 implies x2 == z2. Thus, the requested function requires that x2 == z2 be both true and false, and hence it is impossible.
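As an aside, the direct bit manipulation the question asks about is possible with an integer view of the array, even though, per the argument above, no such rounding can make == reproduce isclose in general. A sketch (it truncates the mantissa rather than rounding, and ignores NaN/infinity handling):

import numpy as np

def truncate_mantissa(a, n_bits):
    # Zero out the n_bits least significant mantissa bits of a float64 array.
    out = np.asarray(a, dtype=np.float64).copy()
    bits = out.view(np.uint64)
    bits &= ~np.uint64((1 << n_bits) - 1)   # clear the low mantissa bits in place
    return out

Each mantissa bit is worth about 0.3 decimal digits, so n_bits = 26 keeps roughly 8 significant decimal figures.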
While I am trying to solve this problem in a context where numpy is used heavily (and therefore an elegant numpy-based solution would be particularly welcome), the fundamental problem has nothing to do with numpy (or even Python) as such.
The task is to create an automated test for an algorithm which is supposed to produce points distributed on a grid whose pitch is specified as an input to the algorithm. The absolute positions of the points do not matter, but their relative positions do. For example, following
collection_of_points = algorithm(data, pitch=[1.3, 1.5, 2])
collection_of_points should contain only points whose x-coordinates differ by multiples of 1.3, whose y-coordinates differ by multiples of 1.5 and whose z-coordinates differ by multiples of 2.
The test should verify that this condition is satisfied.
One thing that I have tried, which doesn't seem too ugly, but doesn't work is
points = algo(data, pitch=requested_pitch)
for p1, p2 in itertools.combinations(points, 2):
    distance_between_points = np.array(p2) - np.array(p1)
    assert np.allclose(distance_between_points % requested_pitch, 0)
[ Aside for those unfamiliar with python or numpy:
itertools.combinations(points, 2) is a simple way of iterating through all pairs of points
Arithmetic operations on np.arrays are performed elementwise, so np.array([5,6,7]) % np.array([2,3,4]) evaluates to np.array([1, 0, 3]) via np.array([5%2, 6%3, 7%4])
np.allclose checks whether all corresponding elements of the two input arrays are approximately equal; numpy automatically broadcasts the 0 passed as the second argument to an all-zero array of the correct size
]
To see why the idea shown above fails, consider a desired pitch of 3 and two points which are separated by 8.999999 in the relevant dimension. 8.999999 % 3 is around 2.999999, which is nowhere near the required 0.
In all of this, I can't help feeling that I'm missing something obvious or that I'm re-inventing some wheel.
Can you suggest an elegant way of writing such a check?
Change your assertion to:
np.all(np.logical_or(np.isclose(x % y, 0), np.isclose((x % y) - y, 0)))
If you want to make it more readable, you should functionalize the statement. Something like:
def is_multiple(x, y, rtol=1e-05, atol=1e-08):
    """
    Test if x is a multiple of y.
    """
    remainder = x % y
    is_zero = np.isclose(remainder, 0., rtol, atol)
    is_y = np.isclose(remainder, y, rtol, atol)
    return np.logical_or(is_zero, is_y)
And then:
assert np.all(is_multiple(distance_between_points, requested_pitch))
I am converting a FORTRAN code that maintains at least 16 decimal digits of precision.
I am facing a division-by-zero problem in VBA Excel, but if I try the code below on this online compiler, I do not get a division-by-zero error.
Any help is appreciated. Thanks in advance.
This is the fortran Code
program sum
IMPLICIT DOUBLE PRECISION (A-H,O-Z)
x = 3.14159265358979
y = 1.24325643454325
z = (x*y)/dtan(0.0D0)
print *, datan(.04D0*z)
end program sum
This is the VBA Code
Public Function dosomething()
    Dim X As Double
    Dim Y As Double
    Dim Z As Double
    X = 3.14159265358979
    Y = 1.24325643454325
    Z = (X * Y) / Tan(0#)
End Function
Without going into too many details, Fortran can support real values (floats) of +/-Infinity and NaN, depending on how the value is calculated. For example, your original post contained two uninitialized variables which you then used to calculate (v1 * v2)/dtan(0.0d0). Since uninitialized vars are often (but not always) set to 0, this calculation becomes 0.0/0.0, which is mathematically undefined, and the result is NaN. [1]
Now, if the numerator is positive, z=(x*y)/dtan(0.0D0) results in z=Infinity, regardless of what x and y are. If your system cannot represent Infinity, then it uses "a very large number". That is evidently the case with VBA.
Finally, you calculate datan(.04D0*z). Mathematically, this is arctangent(Infinity)=PI/2. And again, the correctly computed Fortran results match this, returning a double-precision value of 1.57079632679490. [2]
Now, I don't know much about VBA, but it does not seem to support +/-Infinity or NaN. If a "very large number" results in significant error compared to what you are expecting in your final result, then it appears there are workarounds as described at this SO question.
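For a quick check of the IEEE behaviour described above (shown in Python here, since any IEEE-754 environment behaves the same way):

import math
import numpy as np

z = np.float64(3.14159265358979 * 1.24325643454325) / np.float64(0.0)
print(z)                      # inf (NumPy warns about the divide by zero but carries on)
print(math.atan(math.inf))    # 1.5707963267948966, i.e. pi/2, matching the Fortran output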
[1] Note that in Fortran with double precision you should get dtan(0.0d0) = 0.000000000000000E+000.
[2] In order to maintain double precision in the Fortran x and y variables, you must append d0 to the literals. Otherwise they will be single-precision values by default and store only the first ~7 significant figures from your original assignment; it is up to the compiler what the remaining digits of the double-precision value become (usually just garbage).
Z = (X * Y) / Tan(0#)
The type hint on the 0 literal is superfluous; the Tan function takes a Double and returns a Double. But Tan(0) returns 0, so you are dividing by 0.
Seems your online Fortran compiler is doing something funky.
Regarding the comment "it should not be zero tho, it should be tan(1E-16)":
No. That's mathematically wrong, VBA is doing it right. If you need your VBA code to be just as broken as that Fortran, then you need to handle the situation explicitly:
Z = (X * Y) / Tan(1E-16)
But just know that this is mathematically wrong. I've no idea how the Fortran code manages to output 1.5707963267948966. This VBA code outputs 3.90580528128931E+16.
edit
So based on the answers so far (thanks for taking your time) I'm getting the sense that I'm probably NOT looking for a Normal Distribution function. Perhaps I'll try to re-describe what I'm looking to do.
Let's say I have an object that returns a number from 0 to 10, and that number controls "speed". However, instead of 10 being the top speed, I need 5 to be the top speed, and anything lower or higher slows down accordingly (with easing, hence the bell curve).
I hope that's clearer ;/
-original question
These are the times I wish I remembered something from math class.
I'm trying to figure out how to write a function in Obj-C where I define the boundaries, e.g. (0 - 10), and then for a given x, what is y? ... where x runs something like 0,1,2,3,4,5,6,7,8,9,10 and y runs 0,1,2,3,4,5,4,3,2,1,0, but on a curve
Something like the attached image.
I tried googling for Normal Distribution but it's way over my head. I was hoping to find some site that lists useful algorithms like these, but wasn't very successful.
So can anyone help me out here? And if there are some good sites that show useful mathematical functions, I'd love to check them out.
TIA!!!
-added
I'm not looking for a random number; I'm looking for, e.g.: if x=0, y should be 0; if x=5, y should be 5; if x=10, y should be 0... and all those other not-so-obvious in-between numbers.
(Image: http://dizy.cc/slider.gif)
Okay, your edit really clarifies things. You're not looking for anything to do with the normal distribution, just a nice smooth little ramp function. The one Paul provides will do nicely, but is tricky to modify for other values. It can be made a little more flexible (my code examples are in Python, which should be very easy to translate to any other language):
def quarticRamp(x, b=10, peak=5):
    if not 0 <= x <= b:
        raise ValueError  # or return 0
    return peak*x*x*(x-b)*(x-b)*16/(b*b*b*b)
Parameter b is the upper bound for the region you want to have a slope on (10, in your example), and peak is how high you want it to go (5, in the example).
Personally I like a quadratic spline approach, which is marginally cheaper computationally and has a different curve to it (this curve is really nice to use in a couple of special applications that don't happen to matter at all for you):
def quadraticSplineRamp(x, a=0, b=10, peak=5):
    if not a <= x <= b:
        raise ValueError  # or return 0
    if x > (b+a)/2:
        x = a + b - x
    z = 2*(x-a)/b
    if z > 0.5:
        return peak * (1 - 2*(z-1)*(z-1))
    else:
        return peak * (2*z*z)
This is similar to the other function, but takes a lower bound a (0 in your example). The logic is a little more complex because it's a somewhat-optimized implementation of a piecewise function.
The two curves have slightly different shapes; you probably don't care what the exact shape is, and so could pick either. There are an infinite number of ramp functions meeting your criteria; these are two simple ones, but they can get as baroque as you want.
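For example, evaluating the quartic version over the whole range reproduces the 0 → 5 → 0 shape from the question:

for x in range(11):
    print(x, round(quarticRamp(x), 3))
# prints 0 0.0, 1 0.648, 2 2.048, ..., 5 5.0, ..., 9 0.648, 10 0.0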
The thing you want to plot is the probability density function (pdf) of the normal distribution. You can find it on the mighty Wikipedia.
Luckily, the pdf for a normal distribution is not difficult to implement - some of the other related functions are considerably worse because they require the error function.
To get a plot like you showed, you want a mean of 5 and a standard deviation of about 1.5. The median is obviously the centre, and figuring out an appropriate standard deviation given the left & right boundaries isn't particularly difficult.
A function to calculate the y value of the pdf given the x coordinate, standard deviation and mean might look something like:
#include <math.h>

double normal_pdf(double x, double mean, double std_dev) {
    return 1.0 / (sqrt(2 * M_PI) * std_dev) *
           exp(-(x - mean) * (x - mean) / (2 * std_dev * std_dev));
}
A normal distribution is never equal to 0.
Please make sure that what you want to plot is indeed a normal distribution.
If you're only looking for this bell shape (with the tangent and everything), you can use the following formula:
x^2*(x-10)^2 for x between 0 and 10
0 elsewhere
(Divide by 125 if you need the peak to be at 5.)
double bell(double x) {
    if ((x < 10) && (x > 0))
        return x * x * (x - 10.) * (x - 10.) / 125.;
    else
        return 0.;
}
Well, there's good old Wikipedia, of course. And Mathworld.
What you want is a random number generator for "generating normally distributed random deviates". Since Objective C can call regular C libraries, you either need a C-callable library like the GNU Scientific Library, or for this, you can write it yourself following the description here.
Try simulating rolls of dice by generating random numbers between 1 and 6. If you add up the rolls from 5 independent dice rolls, you'll get a surprisingly good approximation to the normal distribution. You can roll more dice if you'd like and you'll get a better approximation.
Here's an article that explains why this works. It's probably more mathematical detail than you want, but you could show it to someone to justify your approach.
If what you want is the value of the probability density function, p(x), of a normal (Gaussian) distribution of mean mu and standard deviation sigma at x, the formula is
p(x) = exp( -((x-mu)^2)/(2*sigma^2) ) / (sigma * sqrt(2*pi))
where pi is the area of a circle divided by the square of its radius (approximately 3.14159...). Using the C standard library math.h, this is:
#include <math.h>

double normal_pdf(double x, double mu, double sigma) {
    double n = sigma * sqrt(2 * M_PI);                      // normalization factor
    double p = exp(-pow(x - mu, 2) / (2 * pow(sigma, 2)));  // unnormalized pdf
    return p / n;
}
Of course, you can do the same in Objective-C.
For reference, see the Wikipedia or MathWorld articles.
It sounds like you want to write a function that yields a curve of a specific shape. Something like y = f(x), for x in [0:10]. You have a constraint on the max value of y, and a general idea of what you want the curve to look like (somewhat bell-shaped, y=0 at the edges of the x range, y=5 when x=5). So roughly, you would call your function iteratively with the x range, with a step that gives you enough points to make your curve look nice.
So you really don't need random numbers, and this has nothing to do with probability unless you want it to (as in, you want your curve to look like the outline of a normal distribution or something along those lines).
If you have a clear idea of what function will yield your desired curve, the code is trivial - a function to compute f(x) and a for loop to call it the desired number of times for the desired values of x. Plot the x,y pairs and you're done. So that's your algorithm - call a function in a for loop.
The contents of the routine implementing the function will depend on the specifics of what you want the curve to look like. If you need help on functions that might return a curve resembling your sample, I would direct you to the reading material in the other answers. :) However, I suspect that this is actually an assignment of some sort, and that you have been given a function already. If you are actually doing this on your own to learn, then I again echo the other reading suggestions.
y = -1 * abs(x - 5) + 5