When I decrease the value of a coefficient in my code, something stops working. Could a division by zero happen without an error message? Can this be solved by increasing the number of significant digits?
How can I increase the number of significant digits in numpy? Thank you.
NumPy does not support arbitrary precision (see here); the scalar types it provides are all fixed-width.
Consider using the fractions module or another library with arbitrary-precision arithmetic.
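For example, here is a minimal sketch (not from the original answer) of how the fractions module keeps results exact where binary floats drift:

from fractions import Fraction

# Binary floating point cannot represent 0.1 exactly, so repeated
# addition drifts away from the mathematical result.
print(sum(0.1 for _ in range(10)))               # 0.9999999999999999

# Fraction stores exact rationals, so the same sum is exactly 1.
print(sum(Fraction(1, 10) for _ in range(10)))   # 1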
At the risk of using the wrong SE...
My question concerns the rcond parameter in numpy.linalg.lstsq(a, b, rcond). I know it's used to define the cutoff for small singular values in the singular value decomposition when numpy computes the pseudo-inverse of a.
But why is it advantageous to set "small" singular values (values below the cutoff) to zero rather than just keeping them as small numbers?
PS: Admittedly, I don't know exactly how the pseudo-inverse is computed and exactly what role SVD plays in that.
Thanks!
To determine the rank of a system, you need to compare against zero, but as always with floating-point arithmetic you should compare against some finite tolerance instead: hitting exactly 0 essentially never happens when adding and subtracting messy numbers.
The (reciprocal) condition number allows you to specify such a limit in a way that is independent of units and scale in your matrix.
You may also opt out of this feature in lstsq by specifying rcond=None if you know for sure your problem can't run the risk of being under-determined.
The future default sounds like a wise choice on NumPy's part:
FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions
because if you hit this limit, you are just fitting against rounding errors of floating point arithmetic instead of the "true" mathematical system that you hope to model.
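As an illustration (a minimal sketch, not from the original answer; the matrix below is made up), here is how the cutoff changes the result for a nearly rank-deficient system:

import numpy as np

# A toy system whose second column differs from the first only by a
# perturbation near machine precision, so one singular value is tiny.
a = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-15],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

for rcond in (None, 1e-17):
    x, residuals, rank, sv = np.linalg.lstsq(a, b, rcond=rcond)
    print(rcond, rank, sv, x)

# With rcond=None the tiny singular value falls below the cutoff and is
# treated as zero (effective rank 1, small minimum-norm coefficients).
# With rcond=1e-17 the tiny value is kept and the solver ends up fitting
# rounding noise, typically producing huge coefficients.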
First, sorry if this question looks "silly", because I'm new to MPFR, LOL.
I have two mpfr_t variables with a precision of 1024 bits, and they have the values 0.2 and 0.06 stored in them.
But when I add these variables, things go wrong and the result (which is also an mpfr_t variable) comes out as 0.2599999...
This is strange because the MPFR library should maintain the precision (shouldn't it?).
Could you please help me with this? Thanks so much, so much in advance.
MPFR numbers are represented in binary (base 2). In this system, the only numbers that can be represented exactly have the form N·2^k, where N and k are integers. Neither 0.2 = 1/5 nor 0.06 = 3/50 has this form, so they are approximated with some small error. When you add these variables, you are seeing a consequence of this error (there may also be another rounding error in the addition itself, since in binary these numbers have many nonzero digits, unlike in decimal).
This is the same issue as the one described in: Is floating point math broken?
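You can see the same representation error with ordinary Python doubles (a small illustration, not part of the original answer); Decimal exposes the exact binary value that is actually stored:

from decimal import Decimal

# Neither 0.2 nor 0.06 has the form N*2^k, so the stored binary values
# only approximate the decimal literals.
print(Decimal(0.2))   # 0.200000000000000011102230246251565404236316680908203125
print(Decimal(0.06))  # 0.059999999999999997779553950749686919152736663818359375

# The classic example from "Is floating point math broken?":
print(0.1 + 0.2)      # 0.30000000000000004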
EDIT:
To answer the question in the comment "Is there a way to avoid this situation?": no, there is no way to avoid this situation in practice, except in very specific cases. For instance, if all your numbers (inputs and the results of each intermediate operation) are decimal numbers, representable with a small enough number of digits, you can use decimal arithmetic (but MPFR can't do that). Computer algebra systems may help in some cases. There's also iRRAM... I'll come back to it later.
However, there are solutions that attempt to hide issues with numerical errors. You need to estimate the maximum possible error on a computed value. With an error analysis, you can obtain rigorous bounds, but this may be difficult or take time to do. Note that rigorous bounds are pessimistic in general, but if you use arbitrary precision (e.g. with MPFR), this is less of an issue. The analysis can be done dynamically with interval arithmetic (still pessimistic, often even more so). But perhaps a simple estimate is sufficient for you. Once you have an estimate of the maximum error:
For the output, choose the number of displayed digits so that the error is less than the weight of the last displayed digit.
For discontinuous functions (e.g. equality test, floor, ceil): if the distance between the computed value and a discontinuity point is less than the maximum error, assume that the actual value is equal to the discontinuity point. Note that this is just a heuristic, but if it fails (this may remain unnoticed and will probably invalidate your estimate), this means that you have not done your computations with enough precision.
Note: MPFR won't do that for you. But you can write code to take these rules into account.
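As a rough Python sketch of that rule (the error bound below is made up; with MPFR you would apply the same idea to mpfr_t values using your own estimate):

MAX_ERR = 1e-12   # your estimate of the maximum total error (made up here)

def probably_equal(x, y, err=MAX_ERR):
    # Treat two computed values as equal when they are within the error bound.
    return abs(x - y) <= err

computed = 0.1 + 0.2                   # mathematically 0.3, but not exactly in binary
print(computed == 0.3)                 # False
print(probably_equal(computed, 0.3))   # True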
The iRRAM package, which is based on MPFR, can track the error in a rigorous way (like with interval arithmetic) and automatically redo all the computations in a higher precision if it notices that the accuracy is too low. However, if some mathematical result is a discontinuity point, iRRAM won't help. In particular, it cannot provide a rigorous equality test.
Finally, I suggest that you have a look at Goldberg's paper What Every Computer Scientist Should Know About Floating-Point Arithmetic, in particular the notion of cancellation.
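As a quick illustration of cancellation (again plain Python, not from the paper):

# Subtracting nearly equal numbers cancels the leading digits and leaves
# mostly rounding error ("catastrophic cancellation"): each input has a
# tiny relative error, but the difference is off by about 11%.
a = 1.0 + 1e-15   # actually stored as 1.0000000000000011...
b = 1.0
print(a - b)      # 1.1102230246251565e-15 instead of 1e-15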
For a university course in numerical analysis we are transitioning from Maple to a combination of Numpy and Sympy for various illustrations of the course material. This is because the students already learn Python the semester before.
One of the difficulties we have is in emulating fixed precision in Python. Maple allows the user to specify a decimal precision (say 10 or 20 digits) and from then on every calculation is made with that precision so you can see the effect of rounding errors. In Python we tried some ways to achieve this:
Sympy has a rounding function to a specified number of digits.
Mpmath supports custom precision.
This is however not what we're looking for. These options calculate the exact result and round the exact result to the specified number of digits. We are looking for a solution that does every intermediate calculation in the specified precision. Something that can show, for example, the rounding errors that can happen when dividing two very small numbers.
The best solution so far seems to be the custom data types in Numpy. Using float16, float32 and float64 we were able to at least give an indication of what could go wrong. The problem here is that we always need to use arrays of one element and that we are limited to these three data types.
Does anything more flexible exist for our purpose? Or is the very thing we're looking for hidden somewhere in the mpmath documentation? Of course there are workarounds by wrapping every element of a calculation in a rounding function but this obscures the code to the students.
You can use decimal. There are several ways to use it, for example via localcontext or getcontext.
Example with getcontext from documentation:
>>> from decimal import *
>>> getcontext().prec = 6
>>> Decimal(1) / Decimal(7)
Decimal('0.142857')
Example of localcontext usage:
>>> from decimal import Decimal, localcontext
>>> with localcontext() as ctx:
... ctx.prec = 4
... print(Decimal(1) / Decimal(3))
...
0.3333
To reduce typing you can abbreviate the constructor (example from documentation):
>>> import decimal
>>> D = decimal.Decimal
>>> D('1.23') + D('3.45')
Decimal('4.68')
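Since the question is specifically about rounding every intermediate calculation, here is a small sketch (not from the original answer) showing that each Decimal operation is rounded to the context precision as it is performed:
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 4
>>> third = Decimal(1) / Decimal(3)   # already rounded to 4 digits
>>> third
Decimal('0.3333')
>>> third * 3                         # the intermediate rounding error shows up
Decimal('0.9999')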
I have been getting seemingly unacceptably high inaccuracies when computing matrix inverses (solving a linear system) in numpy.
Is this a normal level of inaccuracy?
How can I improve the accuracy of this computation?
Also, is there a way to solve this system more efficiently in numpy or scipy (scipy.linalg.cho_solve seemed promising but does not do what I want)?
In the code below, cholM is a 128 x 128 matrix. The matrix data is too large to include here but is located on pastebin: cholM.txt.
Also, the original vector, ovec, is being randomly selected, so for different ovec's the accuracy varies, but, for most cases, the error still seems unacceptably high.
Edit: Solving the system using the singular value decomposition produces significantly lower error than the other methods.
import numpy.random as rnd
import numpy.linalg as lin
import numpy as np

# Load the 128 x 128 matrix and build a test system with a known solution.
cholM = np.loadtxt('cholM.txt')
dims = len(cholM)
print('Dimensions', dims)
ovec = rnd.normal(size=dims)        # the "true" solution vector
rvec = np.dot(cholM.T, ovec)        # right-hand side

# Method 1: explicit inverse.
invCholM = lin.inv(cholM.T)
svec = np.dot(invCholM, rvec)

# Method 2: direct solve.
svec1 = lin.solve(cholM.T, rvec)

# Method 3: manual back substitution, treating cholM.T as upper triangular.
def back_substitute(M, v):
    r = np.zeros(len(v))
    k = len(v) - 1
    r[k] = v[k] / M[k, k]
    for k in range(len(v) - 2, -1, -1):
        r[k] = (v[k] - np.dot(M[k, k+1:], r[k+1:])) / M[k, k]
    return r

svec2 = back_substitute(cholM.T, rvec)

# Method 4: apply the pseudo-inverse of cholM.T built from the SVD
# (lin.svd returns V already transposed).
u, s, v = lin.svd(cholM)
svec3 = np.dot(u, np.dot(np.diag(1. / s), np.dot(v, rvec)))

# Compare each recovered solution against the original vector.
for k in range(dims):
    print('%20.3f%20.3f%20.3f%20.3f' % (ovec[k] - svec[k], ovec[k] - svec1[k],
                                        ovec[k] - svec2[k], ovec[k] - svec3[k]))
assert np.all(np.abs(ovec - svec) < 1e-5)
assert np.all(np.abs(ovec - svec1) < 1e-5)
As noted by @Craig J Copi and @pv, the condition number of the cholM matrix is high, around 10^16, indicating that much greater numerical precision may be required to achieve higher accuracy in the inverse.
Condition number can be determined by the ratio of maximum singular value to minimum singular value. In this instance, this ratio is not the same as the ratio of eigenvalues.
http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html
We could find the solution vector using a matrix inverse:
...
However, it is better to use the linalg.solve command which can be faster and more numerically stable
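For reference, here is a small sketch (not from the original answer) of how to check the condition number in NumPy for the matrix from the question:

import numpy as np

cholM = np.loadtxt('cholM.txt')                 # matrix from the question
s = np.linalg.svd(cholM, compute_uv=False)      # singular values only
print(s.max() / s.min())                        # ratio of extreme singular values
print(np.linalg.cond(cholM))                    # the same quantity via linalg.cond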
Edit: from Steve Lord on the MATLAB newsgroup:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/63130
Why are you inverting? If you're inverting to solve a system, don't -- generally you would want to use backslash instead.
However, for a system with a condition number around 1e17 (condition numbers must be greater than or equal to 1, so I assume that the 1e-17 figure in your post is the reciprocal condition number from RCOND) you're not going to get a very accurate result in any case.
How can I do accurate decimal arithmetic, given that using floats is not reliable?
I still want to return the answer to a textField.
You can either use ints (e.g. with "cents" as the unit instead of "dollars"), or use NSDecimalNumber.
Store your number multiplied by some power of ten of your choice, chosen by the amount of precision you need to the right of the decimal point. We'll call these scaled numbers. Convert entered values that need your precision into scaled numbers. Addition is easy: just add. Multiplication is slightly harder: multiply the two scaled numbers, then divide by your scale factor. Division is similar: multiply the dividend by the scale factor first, then divide, so you don't lose the fractional digits. The only inconvenience is that you'll have to write your own numeric input/output conversion routines.
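Here is a minimal Python sketch of the scaled-number idea (the original question is about Objective-C and a textField, so this only illustrates the arithmetic; the scale factor of 100, i.e. two decimal places, is an assumption for the example, and only positive values are handled):

SCALE = 100  # two digits to the right of the decimal point (e.g. cents)

def to_scaled(text):
    # Parse "12.34" into a scaled integer without going through float.
    whole, _, frac = text.partition('.')
    frac = (frac + '00')[:2]          # pad or truncate to two digits
    return int(whole) * SCALE + int(frac)

def to_text(scaled):
    # Convert a scaled integer back into text for display.
    return '%d.%02d' % divmod(scaled, SCALE)

a = to_scaled('12.34')                # 1234
b = to_scaled('0.66')                 # 66

print(to_text(a + b))                 # addition: just add                  -> 13.00
print(to_text(a * b // SCALE))        # multiplication: divide by the scale -> 8.14
print(to_text(a * SCALE // b))        # division: scale the dividend first  -> 18.69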