arccos((sin(x)^2)+cos(x)^2) is not 0 for certain angles - numeric

I am struggling with a simple calculation problem.
By theory for an arbitrary angle:
If
is
If I implement this in python or Matlab:
import numpy as np
alpha = -89.999961
alpha_rad = np.deg2rad(alpha)
result = np.arccos(np.sin(alpha_rad)**2 + np.cos(alpha_rad)**2)
print('%.16f' % result)
leads to
0.0000000149011612
whereas
alpha = -89.9999601
leads to
0.0000000000000000
It is also practically 0 using -89.999962°, but it is again 1.49011612e-08 for alpha = -89.9999°
Does anybody know the reason for this and which angles will lead to results bigger than 0. I am not an big expert in numerical mathematics, but the spacing of floating numbers is much smaller (2.220446049250313e-16). I want to multiply the result with a large number so it would be great, if the result is 0 in terms of floating numbers spacing.
Any help and explanation is very welcome!

It's the same in Java too and in other programming languages. The fractional part of a floating point number is finite so there are computational errors in resolving certain values. So in some cases you will need to round to get the expected result.
Here is a Java example of the same issue.
double[] angles = { 23.4, 22, 78.3, 92.4
};
for (double a : angles) {
double val = Math.sin(a) * Math.sin(a) + Math.cos(a) * Math.cos(a);
System.out.println(Math.acos(val) + " " + a);
System.out.println("-------------------------------");
}
}
If you search for subjects like dealing with floating point errors you may get some insight into this as well as how to handle it.

solves the problem for arbitrary angles. Cosines is known for numerical problems (e.g. https://www.nayuki.io/page/numerically-stable-law-of-cosines)

Related

numpy.cov or numpy.linalg.eigvals gives wrong results

I have high (100) dimensional data. I want to get the eigenvectors of the covariance matrix of the data.
Cov = numpy.cov(data)
EVs = numpy.linalg.eigvals(Cov)
I get a vector containing some eigenvalues which are complex numbers. This is mathematically impossible. Granted, the imaginary parts of the complex numbers are very small but it still causes issues later on. Is this a numerical issue? If so, does the issue lie with cov, eigvals function or both?
To give more color on that, I did the same calculation in Mathematica which gives, of course, a correct result. Turns out there are some eigenvalues which are very close to zero but not quiet zero and numpy gets all of these wrong (magnitude wise and it makes some of them into complex numbers)
I was facing a similar issue: np.linalg.eigvals was returning a complex vector in which the imaginary part was quasi-zero everywhere.
Using np.linalg.eigvalsh instead fixed it for me.
I don't know the exact reason, but most probably it is a numerical issue and eigvalsh seems to handle it whereas eigvals doesn't. Note that the ordering of the actual eigenvalues may differ.
The following snippet illustrates the fix:
import numpy as np
from numpy.linalg import eigvalsh, eigvals
D = 10
MUL = 100
EPS = 1e-8
x = np.random.rand(1, D) * MUL
x -= x.mean()
S = np.matmul(x.T, x) + I
# adding epsilon*I avoids negative eigenvalues due to numerical error
# since the matrix is actually positive semidef. (useful for cholesky etc)
S += np.eye(D, dtype=np.float64) * EPS
print(sorted(eigvalsh(S)))
print(sorted(eigvals(S)))

np.fft.fft off by a factor of 1000 (fitting an powerspectrum)

I'm trying to make a powerspectrum from an experimental dataset which I am reading in, and then to fit it to an theoretical curve. Now everything is working fine and I'm not getting errors, except for the fact that my curve keeps differing by a factor of 1000 from the data and I have absolutely no idea what the problem could be. I've asked a few people, but to no avail. (I hope that you guys will be able to help)
Anyways, I'm pretty sure that its not the units, as they were tripple checked by me and 2 others. Basically, I need to fit a powerspectrum to an equation by using the least squares method.
I can't post the whole code, as its rather long and a bit messy, but this is the fourier part, I added comments to all arrays and vars which have not been declared in the code)
#Calculate stuff
Nm = 10**-6 #micro to meter
KbT = 4.10E-21 #Joule
T = 297. #K
l = zvalue*Nm #meter
meany = np.mean(cleandatay*Nm) #meter (cleandata is the array that I read in from a cvs at the start.)
SDy = sum((cleandatay*Nm - meany)**2)/len(cleandatay) #meter^2
FmArray[0][i] = ((KbT*l)/SDy) #N
#print FmArray[0][i]
print float((i*100/len(filelist)))#how many % done?
#fourier
dt = cleant[1]-cleant[0] #timestep
N = len(cleandatay) #Same for cleant, its the corresponding time to cleandatay
Here is where the fourier part starts, I take the fft and turn it into a powerspectrum. Then I calculate the corresponding freq steps with the array freqs
fouriery = np.fft.fft((cleandatay*(10**-6)))
fourierpower = (np.abs(fouriery))**2
fourierpower = fourierpower[1:N/2] #remove 0th datapoint and /2 (remove negative freqs)
fourierpower = fourierpower*dt #*dt to account for steps
freqs = (1.+np.arange((N/2)-1.))/50.
#Least squares method
eta = 8.9E-4 #pa*s
Rbead = 0.5E-6#meter
constant = 2*KbT/(3*eta*pi*Rbead)
omega = 2*pi*freqs #rad/s
Wcarray = 2.*pi*np.arange(0,30, 0.02003) #0.02 = 30/len(freqs)
ChiSq = np.zeros(len(Wcarray))
for k in range(0, len(Wcarray)):
Py = (constant / (Wcarray[k]**2 + omega**2))
ChiSq[k] = sum((fourierpower - Py)**2)
pylab.loglog(omega, Py)
print k*100/len(Wcarray)
index = np.where(ChiSq == min(ChiSq))
cutoffw = Wcarray[index]
Pygoed = (constant / (Wcarray[index]**2 + omega**2))
print cutoffw
print constant
print min(ChiSq)
pylab.loglog(omega,ChiSq)
So I have no idea what could be going wrong, I think its the fft, as nothing else can really go wrong.
Below is the pic I get when I plot all the fit lines against the spectrum, as you can see it is off by about 1000 (actually exactly 1000, as this leaves a least square residue of 10^-22, but I can't just randomly multiply without knowing why)
Just to elaborate on the picture. The green dots are the fft spectrum, the lines are the fits, the red dot is where it thinks the cutoff frequency is, and the blue line is the chi-squared fit, looking for the lowest value.
Take a look at the documentation for the FFT that you are using. Many FFTs introduce a scaling factor that is usually N * result (number of samples). Multiplying by 1/N will scale the results back in line. (You said that the result is 1000 too high....could it be that you are using a 1024 size FFT?)
Your library FFT routine might include a scale factor of 1/sqrt(n).
Check the documentation for the fft you used, as the proportion of the scale factor allocated between the fft and the ifft is arbitrary.

What causes a nan error result?

I have this chunk of code which runs through a loop. The lower case x var always prints correctly. The upper case X var sometimes prints correctly, sometimes prints nan or junk. Why?
N.B. The data is always identical.
Link to FFT
Link to FFT example usage
Link to my other SO question which shows how this is being used. BOUNTY OF 200 points!
double (*x)[2];
double (*X)[2];
x = malloc(2 * 512 * sizeof(double));
X = malloc(2 * 512 * sizeof(double));
for (j = 0; j < 10; j++){
(*x)[j] = // values inserted from method argument.;
}
fft(512, x, X);
for (j = 0; j < 512; j++){
if (i==512*20) {
NSLog(#"PRE POST %f - %f",(*x)[j], (*X)[j]);
}
}
free(x);
free(X);
In floating point arithmetic, there are several operations that will result in a NaN error. Wikipedia points out these operations as resulting in a NaN:
The divisions 0/0 and ±∞/±∞
The multiplications 0×±∞ and ±∞×0
The additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
(These are called indeterminate forms.)
Check your code to see if you're performing any operations that can't have a numeric answer.
As for the 'junk' results, they may be the result of messed up memory allocation, but you haven't given much detail so I can't be sure.
In other languages which I have worked in, "NaN" (not a number) is what you get when you divide a 0.0 by 0.0. I don't know Objective-C, but it's likely the same.
As for what is causing nan to be stored in X... you'll have to show us the body of fft before anyone can answer that. You said you think it might be a memory/pointer bug, because it's not consistent. I just looked up how NaN is represented in the IEE 7754 floating point format (which your platform is likely using) -- basically, several of the high-order bits (which normally hold the exponent of a floating-point number) all have to be filled with 1s.
If you do have a memory corruption bug, which is causing junk to be stored into X, then if those particular bits happened to all be 1s, that would cause the number to print as "nan".
Again, please show the body of fft, so someone can try to help you further.
I tried running this - initialised the data using (*x)[j] = j and removed the i==512*20 printing condition. All values came back fine. I also tried with random input data - still good. What is the nature of your input data?
(I'll look at your other question as well)
edit: I should point out I filled 512 values of the x array - your loop above only fills 10, so much of the input array is uninitialised.

sqrtf(6)*sqrtf(6) != 6 in xcode

I'm trying to create a software that shows the angle between two vectors and it's not working when then are equal to (1,1,2), hence the modulus of this vector is sqrtf(6) which is rouding to 2.449490 and it should be 2.44948974278318.
Is there a way to increase precision of this operation?
In the next steps of my software I make this operation:
float angle = acos(dot/(modulus1*modulus2));
If modulus1 == modulus 2, then modulus1*modulus2 = dot, but it's not happening with some values.
I hope I made myself clear.
Thanks in advance,
Gruber
You can use double if you want greater precision. However, note that the == operation on floating point numbers never work the way they do with integral types. Use an epsilon to adjust for minor differences.

Normal Distribution function

edit
So based on the answers so far (thanks for taking your time) I'm getting the sense that I'm probably NOT looking for a Normal Distribution function. Perhaps I'll try to re-describe what I'm looking to do.
Lets say I have an object that returns a number of 0 to 10. And that number controls "speed". However instead of 10 being the top speed, I need 5 to be the top speed, and anything lower or higher would slow down accordingly. (with easing, thus the bell curve)
I hope that's clearer ;/
-original question
These are the times I wish I remembered something from math class.
I'm trying to figure out how to write a function in obj-C where I define the boundries, ex (0 - 10) and then if x = foo y = ? .... where x runs something like 0,1,2,3,4,5,6,7,8,9,10 and y runs 0,1,2,3,4,5,4,3,2,1,0 but only on a curve
Something like the attached image.
I tried googling for Normal Distribution but its way over my head. I was hoping to find some site that lists some useful algorithms like these but wasn't very successful.
So can anyone help me out here ? And if there is some good sites which shows useful mathematical functions, I'd love to check them out.
TIA!!!
-added
I'm not looking for a random number, I'm looking for.. ex: if x=0 y should be 0, if x=5 y should be 5, if x=10 y should be 0.... and all those other not so obvious in between numbers
alt text http://dizy.cc/slider.gif
Okay, your edit really clarifies things. You're not looking for anything to do with the normal distribution, just a nice smooth little ramp function. The one Paul provides will do nicely, but is tricky to modify for other values. It can be made a little more flexible (my code examples are in Python, which should be very easy to translate to any other language):
def quarticRamp(x, b=10, peak=5):
if not 0 <= x <= b:
raise ValueError #or return 0
return peak*x*x*(x-b)*(x-b)*16/(b*b*b*b)
Parameter b is the upper bound for the region you want to have a slope on (10, in your example), and peak is how high you want it to go (5, in the example).
Personally I like a quadratic spline approach, which is marginally cheaper computationally and has a different curve to it (this curve is really nice to use in a couple of special applications that don't happen to matter at all for you):
def quadraticSplineRamp(x, a=0, b=10, peak=5):
if not a <= x <= b:
raise ValueError #or return 0
if x > (b+a)/2:
x = a + b - x
z = 2*(x-a)/b
if z > 0.5:
return peak * (1 - 2*(z-1)*(z-1))
else:
return peak * (2*z*z)
This is similar to the other function, but takes a lower bound a (0 in your example). The logic is a little more complex because it's a somewhat-optimized implementation of a piecewise function.
The two curves have slightly different shapes; you probably don't care what the exact shape is, and so could pick either. There are an infinite number of ramp functions meeting your criteria; these are two simple ones, but they can get as baroque as you want.
The thing you want to plot is the probability density function (pdf) of the normal distribution. You can find it on the mighty Wikipedia.
Luckily, the pdf for a normal distribution is not difficult to implement - some of the other related functions are considerably worse because they require the error function.
To get a plot like you showed, you want a mean of 5 and a standard deviation of about 1.5. The median is obviously the centre, and figuring out an appropriate standard deviation given the left & right boundaries isn't particularly difficult.
A function to calculate the y value of the pdf given the x coordinate, standard deviation and mean might look something like:
double normal_pdf(double x, double mean, double std_dev) {
return( 1.0/(sqrt(2*PI)*std_dev) *
exp(-(x-mean)*(x-mean)/(2*std_dev*std_dev)) );
}
A normal distribution is never equal to 0.
Please make sure that what you want to plot is indeed a
normal distribution.
If you're only looking for this bell shape (with the tangent and everything)
you can use the following formula:
x^2*(x-10)^2 for x between 0 and 10
0 elsewhere
(Divide by 125 if you need to have your peek on 5.)
double bell(double x) {
if ((x < 10) && (x>0))
return x*x*(x-10.)*(x-10.)/125.;
else
return 0.;
}
Well, there's good old Wikipedia, of course. And Mathworld.
What you want is a random number generator for "generating normally distributed random deviates". Since Objective C can call regular C libraries, you either need a C-callable library like the GNU Scientific Library, or for this, you can write it yourself following the description here.
Try simulating rolls of dice by generating random numbers between 1 and 6. If you add up the rolls from 5 independent dice rolls, you'll get a surprisingly good approximation to the normal distribution. You can roll more dice if you'd like and you'll get a better approximation.
Here's an article that explains why this works. It's probably more mathematical detail than you want, but you could show it to someone to justify your approach.
If what you want is the value of the probability density function, p(x), of a normal (Gaussian) distribution of mean mu and standard deviation sigma at x, the formula is
p(x) = exp( ((x-mu)^2)/(2*sigma^2) ) / (sigma * 2 * sqrt(pi))
where pi is the area of a circle divided by the square of its radius (approximately 3.14159...). Using the C standard library math.h, this is:
#include <math>
double normal_pdf(double x, double mu, double sigma) {
double n = sigma * 2 * sqrt(M_PI); //normalization factor
p = exp( -pow(x-mu, 2) / (2 * pow(sigma, 2)) ); // unnormalized pdf
return p / n;
}
Of course, you can do the same in Objective-C.
For reference, see the Wikipedia or MathWorld articles.
It sounds like you want to write a function that yields a curve of a specific shape. Something like y = f(x), for x in [0:10]. You have a constraint on the max value of y, and a general idea of what you want the curve to look like (somewhat bell-shaped, y=0 at the edges of the x range, y=5 when x=5). So roughly, you would call your function iteratively with the x range, with a step that gives you enough points to make your curve look nice.
So you really don't need random numbers, and this has nothing to do with probability unless you want it to (as in, you want your curve to look like a the outline of a normal distribution or something along those lines).
If you have a clear idea of what function will yield your desired curve, the code is trivial - a function to compute f(x) and a for loop to call it the desired number of times for the desired values of x. Plot the x,y pairs and you're done. So that's your algorithm - call a function in a for loop.
The contents of the routine implementing the function will depend on the specifics of what you want the curve to look like. If you need help on functions that might return a curve resembling your sample, I would direct you to the reading material in the other answers. :) However, I suspect that this is actually an assignment of some sort, and that you have been given a function already. If you are actually doing this on your own to learn, then I again echo the other reading suggestions.
y=-1*abs(x-5)+5