Pseudo-inverse via signular value decomposition in numpy.linalg.lstsq - numpy

at the risk of using the wrong SE...
My question concerns the rcond parameter in numpy.linalg.lstsq(a, b, rcond). I know it's used to define the cutoff for small singular values in the singular value decomposition when numpy computed the pseudo-inverse of a.
But why is it advantageous to set "small" singular values (values below the cutoff) to zero rather than just keeping them as small numbers?
PS: Admittedly, I don't know exactly how the pseudo-inverse is computed and exactly what role SVD plays in that.
Thanks!

To determine the rank of a system, you could need compare against zero, but as always with floating point arithmetic we should compare against some finite range. Hitting exactly 0 never happens when adding and subtracting messy numbers.
The (reciprocal) condition number allows you to specify such a limit in a way that is independent of units and scale in your matrix.
You may also opt of of this feature in lstsq by specifying rcond=None if you know for sure your problem can't run the risk of being under-determined.
The future default sounds like a wise choice from numpy
FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions
because if you hit this limit, you are just fitting against rounding errors of floating point arithmetic instead of the "true" mathematical system that you hope to model.

Related

Can't correctly factorize a polinomial whose coefficients have decimals (non-integer), how can maxima do it?

how are you?.
I'm trying to create a function that calculates the z-transform of a transfer function using the residues method but for that, I need the factors of the characteristic equation and the powers of the factors, so, in order to do that I tried to factorize polynomials with non-integer coefficients but after trying everything that I read I couldn't factorize make maxima to factorize those polynomials the way I need it.
For giving an example, I have this characteristic equation: "s·(s^2+0.1·s)", the factors should be "s^2" and "s + 0.1" but maxima allways gives me "(s^2·(10·s + 1))/10".
Why I'm signalling this?, well, as I learned that maxima treates the outputs equation as list so I can have its dimension and separate the factos by its positions in the list to measure the powers of the factors and do what I need, but like maxima gives me the result that is shown above then the dimension of the list is different and it will make my function to work differently and possibly have errors.
The result that is shown is given by maxima no matter if I use factor, gfactor, or expand or whatever other function that I found and I know that result is because maxima are rationalizing the polynomial before working with it but I don't need that behavior, I only need the pure factors, so, how can I have the result that I want?.
Thanks in advance for the help.

MPFR - Loss precision after addition

First, sorry if this question looks "silly", because I'm new to MPFR, LOL.
I have two mpfr_t variables with precision of 1024, and they have the value of 0.2 and 0.06 stored in them.
But when I add these variables, things goes wrong and the result (which is also a mpfr_t variable) has the value of 0.2599999...
This is strange because the MPFR library should maintain the precision (isn't it?).
Could you please help me with this? Thanks so much, so much in advance.
MPFR numbers are represented in binary (base 2). In this system, the only numbers that can be represented exactly have the form N·2k, where N and k are integers. Neither 0.2 = 1/5 nor 0.06 = 3/50 have this form, so that they are approximated with some small error. When you add these variables, you are seeing a consequence of this error (there may be also another error in the addition operation since in binary these numbers have many nonzero digits, unlike in decimal).
This is the same issue as the one described in: Is floating point math broken?
EDIT:
To answer the question in comment "Is there a way to avoid this situation?", no, there is no way to avoid this situation in practice, except in very specific cases. For instance, if all your numbers (inputs and results of each intermediate operations) are decimal numbers, representable with a small enough number of digits, you can use a decimal arithmetic (but MPFR can't do that). Computer algebra systems may help in some cases. There's also iRRAM... I'll come back to it later.
However, there are solutions to attempt to hide issues with numerical errors. You need to estimate the maximum possible error on a computed value. With an error analysis, you can obtain rigorous bounds, but this may be difficult or take time to do. Note that rigorous bounds are pessimistic in general, but if you use arbitrary precision (e.g. with MPFR), this is less an issue. The analysis can be done dynamically with interval arithmetic (still pessimistic, even worse). But perhaps a simple estimate is sufficient for you. Once you have an estimate of the maximum error:
For the output, choose the number of displayed digits so that the error is less than the weight of the last displayed digit.
For discontinuous functions (e.g. equality test, floor, ceil): if the distance between the computed value and a discontinuity point is less than the maximum error, assume that the actual value is equal to the discontinuity point. Note that this is just a heuristic, but if it fails (this may remain unnoticed and will probably invalidate your estimate), this means that you have not done your computations with enough precision.
Note: MPFR won't do that for you. But you can write code to take these rules into account.
The iRRAM package, which is based on MPFR, can track the error in a rigorous way (like with interval arithmetic) and automatically redo all the computations in a higher precision if it notices that the accuracy is too low. However, if some mathematical result is a discontinuity point, iRRAM won't help. In particular, it cannot provide a rigorous equality test.
Finally, I suggest that you have a look at Goldberg's paper What Every Computer Scientist Should Know About Floating-Point Arithmetic, in particular the notion of cancellation.

Numerical Accuracy: to scale or not?

I am working on a n-body gravitational simulator that takes input and produces output in metric MKS units. This involves dealing with some very large numbers (like solar masses expressed in kilograms, semimajor axes of planetary orbits expressed in meters, and timescales of years expressed in seconds), which get multiplied by some very small numbers (notably, the gravitational constant, which is 6.67384e-11 in MKS units), and also the occasional very small number getting added to or subtracted from a very large number (mainly when summing up pairwise accelerations), which gets me concerned about the effects of rounding errors.
I've already taken the step of replacing all masses m by Gm (premultiplying by the gravitational constant), which significantly reduces the total number of multiplies, and makes the mass numbers much smaller, and that seems to have had a positive effect on both efficiency and accuracy, as judged by how well the simulator conserves energy.
I am wondering, however: is potentially it worth trying to do some internal re-scaling into different units to further minimize floating point errors? And if so, what kind of range (for double-precision floats) should I be trying to get my numbers centered on for maximum accuracy?
In general if you want precise results in physical based rendering you don't want to use floats or doubles since they have massive rounding problems and thus introduce errors in your simulation.
If you need or want to stick with floats/double you probably should rescale around zero. The reason is that often floating point representations have a higher "density" of values around this point and tend to have fewer on the min/max sides. Image example from google
I would suggest that you change all values to integer based number variables. This erases rounding errors (over/underflow can still happen!) and speeds up the calculation process by an order of magnitude because normal CPUs work faster with integer operations. In case of GPU its basically the same but thats another story all by its own...
But before you take such an effort to further improve your accuracy i would strongly advise an arbitrary precision number library. This may come with an performance loss but should be way easier and yield better results than a rescaling of your values.
Most of the numerical mathematicians come across this problem.
At first let me remind you that you can not deal with numbers (or phsycal values) smaller than the machine epsilon for each calculation. Unfortunately the epsilon depends around which number you are analyzing. You can try eps(a) for any value of a in MATLAB, as far as I remember eps(1.0)~=2.3e-16 and eps(0)~1e-298.
That's why in numerical methods you avoid calculations using very different scaled numbers. Because one is just an ignored (smaller than its epsilon) by the other value and rounding errors are inevitable.
But what else people do? If they encounter such physical problems, before coding, mathematicians analyse the problem theoritically, they make simplifications to use similarly scaled numbers.

approximating log10[x^k0 + k1]

Greetings. I'm trying to approximate the function
Log10[x^k0 + k1], where .21 < k0 < 21, 0 < k1 < ~2000, and x is integer < 2^14.
k0 & k1 are constant. For practical purposes, you can assume k0 = 2.12, k1 = 2660. The desired accuracy is 5*10^-4 relative error.
This function is virtually identical to Log[x], except near 0, where it differs a lot.
I already have came up with a SIMD implementation that is ~1.15x faster than a simple lookup table, but would like to improve it if possible, which I think is very hard due to lack of efficient instructions.
My SIMD implementation uses 16bit fixed point arithmetic to evaluate a 3rd degree polynomial (I use least squares fit). The polynomial uses different coefficients for different input ranges. There are 8 ranges, and range i spans (64)2^i to (64)2^(i + 1).
The rational behind this is the derivatives of Log[x] drop rapidly with x, meaning a polynomial will fit it more accurately since polynomials are an exact fit for functions that have a derivative of 0 beyond a certain order.
SIMD table lookups are done very efficiently with a single _mm_shuffle_epi8(). I use SSE's float to int conversion to get the exponent and significand used for the fixed point approximation. I also software pipelined the loop to get ~1.25x speedup, so further code optimizations are probably unlikely.
What I'm asking is if there's a more efficient approximation at a higher level?
For example:
Can this function be decomposed into functions with a limited domain like
log2((2^x) * significand) = x + log2(significand)
hence eliminating the need to deal with different ranges (table lookups). The main problem I think is adding the k1 term kills all those nice log properties that we know and love, making it not possible. Or is it?
Iterative method? don't think so because the Newton method for log[x] is already a complicated expression
Exploiting locality of neighboring pixels? - if the range of the 8 inputs fall in the same approximation range, then I can look up a single coefficient, instead of looking up separate coefficients for each element. Thus, I can use this as a fast common case, and use a slower, general code path when it isn't. But for my data, the range needs to be ~2000 before this property hold 70% of the time, which doesn't seem to make this method competitive.
Please, give me some opinion, especially if you're an applied mathematician, even if you say it can't be done. Thanks.
You should be able to improve on least-squares fitting by using Chebyshev approximation. (The idea is, you're looking for the approximation whose worst-case deviation in a range is least; least-squares instead looks for the one whose summed squared difference is least.) I would guess this doesn't make a huge difference for your problem, but I'm not sure -- hopefully it could reduce the number of ranges you need to split into, somewhat.
If there's already a fast implementation of log(x), maybe compute P(x) * log(x) where P(x) is a polynomial chosen by Chebyshev approximation. (Instead of trying to do the whole function as a polynomial approx -- to need less range-reduction.)
I'm an amateur here -- just dipping my toe in as there aren't a lot of answers already.
One observation:
You can find an expression for how large x needs to be as a function of k0 and k1, such that the term x^k0 dominates k1 enough for the approximation:
x^k0 +k1 ~= x^k0, allowing you to approximately evaluate the function as
k0*Log(x).
This would take care of all x's above some value.
I recently read how the sRGB model compresses physical tri stimulus values into stored RGB values.
It basically is very similar to the function I try to approximate, except that it's defined piece wise:
k0 x, x < 0.0031308
k1 x^0.417 - k2 otherwise
I was told the constant addition in Log[x^k0 + k1] was to make the beginning of the function more linear. But that can easily be achieved with a piece wise approximation. That would make the approximation a lot more "uniform" - with only 2 approximation ranges. This should be cheaper to compute due to no longer needing to compute an approximation range index (integer log) and doing SIMD coefficient lookup.
For now, I conclude this will be the best approach, even though it doesn't approximate the function precisely. The hard part will be proposing this change and convincing people to use it.

Need help generating discrete random numbers from distribution

I searched the site but did not find exactly what I was looking for... I wanted to generate a discrete random number from normal distribution.
For example, if I have a range from a minimum of 4 and a maximum of 10 and an average of 7. What code or function call ( Objective C preferred ) would I need to return a number in that range. Naturally, due to normal distribution more numbers returned would center round the average of 7.
As a second example, can the bell curve/distribution be skewed toward one end of the other? Lets say I need to generate a random number with a range of minimum of 4 and maximum of 10, and I want the majority of the numbers returned to center around the number 8 with a natural fall of based on a skewed bell curve.
Any help is greatly appreciated....
Anthony
What do you need this for? Can you do it the craps player's way?
Generate two random integers in the range of 2 to 5 (inclusive, of course) and add them together. Or flip a coin (0,1) six times and add 4 to the result.
Summing multiple dice produces a normal distribution (a "bell curve"), while eliminating high or low throws can be used to skew the distribution in various ways.
The key is you are going for discrete numbers (and I hope you mean integers by that). Multiple dice throws famously generate a normal distribution. In fact, I think that's how we were first introduced to the Gaussian curve in school.
Of course the more throws, the more closely you approximate the bell curve. Rolling a single die gives a flat line. Rolling two dice just creates a ramp up and down that isn't terribly close to a bell. Six coin flips gets you closer.
So consider this...
If I understand your question correctly, you only have seven possible outcomes--the integers (4,5,6,7,8,9,10). You can set up an array of seven probabilities to approximate any distribution you like.
Many frameworks and libraries have this built-in.
Also, just like TokenMacGuy said a normal distribution isn't characterized by the interval it's defined on, but rather by two parameters: Mean μ and standard deviation σ. With both these parameters you can confine a certain quantile of the distribution to a certain interval, so that 95 % of all points fall in that interval. But resticting it completely to any interval other than (−∞, ∞) is impossible.
There are several methods to generate normal-distributed values from uniform random values (which is what most random or pseudorandom number generators are generating:
The Box-Muller transform is probably the easiest although not exactly fast to compute. Depending on the number of numbers you need, it should be sufficient, though and definitely very easy to write.
Another option is Marsaglia's Polar method which is usually faster1.
A third method is the Ziggurat algorithm which is considerably faster to compute but much more complex to program. In applications that really use a lot of random numbers it may be the best choice, though.
As a general advice, though: Don't write it yourself if you have access to a library that generates normal-distributed random numbers for you already.
For skewing your distribution I'd just use a regular normal distribution, choosing μ and σ appropriately for one side of your curve and then determine on which side of your wanted mean a point fell, stretching it appropriately to fit your desired distribution.
For generating only integers I'd suggest you just round towards the nearest integer when the random number happens to fall within your desired interval and reject it if it doesn't (drawing a new random number then). This way you won't artificially skew the distribution (such as you would if you were clamping the values at 4 or 10, respectively).
1 In testing with deliberately bad random number generators (yes, worse than RANDU) I've noticed that the polar method results in an endless loop, rejecting every sample. Won't happen with random numbers that fulfill the usual statistic expectations to them, though.
Yes, there are sophisticated mathematical solutions, but for "simple but practical" I'd go with Nosredna's comment. For a simple Java solution:
Random random=new Random();
public int bell7()
{
int n=4;
for (int x=0;x<6;++x)
n+=random.nextInt(2);
return n;
}
If you're not a Java person, Random.nextInt(n) returns a random integer between 0 and n-1. I think the rest should be similar to what you'd see in any programming language.
If the range was large, then instead of nextInt(2)'s I'd use a bigger number in there so there would be fewer iterations through the loop, depending on frequency of call and performance requirements.
Dan Dyer and Jay are exactly right. What you really want is a binomial distribution, not a normal distribution. The shape of a binomial distribution looks a lot like a normal distribution, but it is discrete and bounded whereas a normal distribution is continuous and unbounded.
Jay's code generates a binomial distribution with 6 trials and a 50% probability of success on each trial. If you want to "skew" your distribution, simply change the line that decides whether to add 1 to n so that the probability is something other than 50%.
The normal distribution is not described by its endpoints. Normally it's described by it's mean (which you have given to be 7) and its standard deviation. An important feature of this is that it is possible to get a value far outside the expected range from this distribution, although that will be vanishingly rare, the further you get from the mean.
The usual means for getting a value from a distribution is to generate a random value from a uniform distribution, which is quite easily done with, for example, rand(), and then use that as an argument to a cumulative distribution function, which maps probabilities to upper bounds. For the standard distribution, this function is
F(x) = 0.5 - 0.5*erf( (x-μ)/(σ * sqrt(2.0)))
where erf() is the error function which may be described by a taylor series:
erf(z) = 2.0/sqrt(2.0) * Σ∞n=0 ((-1)nz2n + 1)/(n!(2n + 1))
I'll leave it as an excercise to translate this into C.
If you prefer not to engage in the exercise, you might consider using the Gnu Scientific Library, which among many other features, has a technique to generate random numbers in one of many common distributions, of which the Gaussian Distribution (hint) is one.
Obviously, all of these functions return floating point values. You will have to use some rounding strategy to convert to a discrete value. A useful (but naive) approach is to simply downcast to integer.