I am attempting to solve a system of non-linear equations of the form below, using numpy:
a(y-2.7)(1-exp(-a*z)) = (x-2.7)(1-exp(-z))
b(w-2.7)(1-exp(-b*z)) = (x-2.7)(1-exp(-z))
c(w-2.7)(1-exp(-b*z)) = (y-2.7)(1-exp(-a*z))
d([y+w]/2-2.7)(1-exp(-d*z)) = (x-2.7)(1-exp(-z))
Obviously there are as many equations as unknowns in the system. The values a, b, c, d are constants for the system above. This is the simplest case; other systems will have more equations.
The solutions to these equations have a similar order of magnitude, so I am aware that the Levenberg-Marquardt algorithm can be used to solve the system given a set of initial values for the unknowns. I am sure that scipy.optimize can be used, with initial values of all 1s for the unknowns w, x, y, z.
SciPy has a set of nonlinear solvers that would probably work for your application. They are part of scipy.optimize and are specifically designed for nonlinear systems. The documentation can be found here. For an in-depth discussion of how to use the solvers, see the previous S.O. discussion on the topic here.
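For example, a minimal sketch using scipy.optimize.root for the system above; the values of the constants a, b, c, d below are made-up placeholders, and method='lm' selects MINPACK's Levenberg-Marquardt implementation:

import numpy as np
from scipy.optimize import root

# placeholder constants; substitute your own values
a, b, c, d = 0.5, 1.2, 0.8, 1.5

def equations(vars):
    w, x, y, z = vars
    # each equation is rewritten as lhs - rhs = 0
    return [
        a * (y - 2.7) * (1 - np.exp(-a * z)) - (x - 2.7) * (1 - np.exp(-z)),
        b * (w - 2.7) * (1 - np.exp(-b * z)) - (x - 2.7) * (1 - np.exp(-z)),
        c * (w - 2.7) * (1 - np.exp(-b * z)) - (y - 2.7) * (1 - np.exp(-a * z)),
        d * ((y + w) / 2 - 2.7) * (1 - np.exp(-d * z)) - (x - 2.7) * (1 - np.exp(-z)),
    ]

# initial guess of all ones, as suggested in the question
sol = root(equations, [1.0, 1.0, 1.0, 1.0], method='lm')
print(sol.x)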
I am working on replicating a paper titled “Improving Mean Variance Optimization through Sparse Hedging Restriction”. The authors’ idea is to use the Graphical Lasso algorithm to introduce some bias into the estimation of the inverse of the sample covariance matrix. The graphical lasso algorithm works perfectly fine in R, but when I use Python on the same data with the same parameters I get two kinds of errors:
If I use coordinate descent (cd) mode as the solver, I get a floating point error: FloatingPointError: Non SPD result: the system is too ill-conditioned for this solver. (The thing that bugs me is that I tried this solver on a simulated positive definite matrix and it gave me the same error.)
If I use Least Angle Regression (LARS) mode (which is less stable but recommended for ill-conditioned matrices), I get an overflow error: OverflowError: int too large to convert to float.
To my knowledge, unlike C++ and other languages, Python integers are not restricted by an upper maximum (besides the capacity of the machine itself), whereas floats are. I think this might be the source of the latter problem. (I have also heard in the past that R is much more robust in dealing with ill-conditioned matrices.) I would be glad to hear your experience with graph lasso in R or Python.
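For example, the float bound shows up when converting an arbitrarily large Python int:

# Python ints are unbounded, but floats top out around 1.8e308,
# so this raises: OverflowError: int too large to convert to float
float(10 ** 400)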
Below is a little Python snippet that simulates this problem in a few lines. Any input would be greatly appreciated.
Thank you all,
Skander
from sklearn.covariance import graph_lasso
from sklearn.datasets import make_spd_matrix

# simulate a symmetric positive definite matrix and run graphical lasso on it;
# this call reproduces the error described above
symmetric_PD_mx = make_spd_matrix(100)
glout = graph_lasso(emp_cov=symmetric_PD_mx, alpha=0.01, mode="lars")
Since the SVD is not unique (pairs of left and right singular vectors can have their signs flipped simultaneously), I was wondering to what extent the U and V matrices returned by scipy.linalg.svd() are 'deterministic' / always the same?
I tried it a few times with a random array on my machine and it seems to always return the same thing (fortunately), but could that vary across machines?
SciPy and NumPy both compute the SVD by outsourcing to the LAPACK _gesdd routine. Any deterministic implementation of this routine will produce the same results every time on a given machine with a given LAPACK implementation, but as far as I know there is no guarantee that different LAPACK implementations (i.e. NETLIB vs MKL, OSX vs Windows, etc.) will use the same convention. If your application depends on some convention for resolving the sign ambiguity, it would be safest to ensure it yourself in some sort of post-processing of the singular vectors; one useful approach is given in Resolving the Sign Ambiguity in the Singular Value Decomposition (pdf).
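For example, a minimal sketch of one possible post-processing convention (make the largest-magnitude entry of each left singular vector positive, and flip the matching right singular vector so the product is unchanged); this is just an illustration, not the method from the linked paper:

import numpy as np
from scipy.linalg import svd

def fix_svd_signs(U, s, Vt):
    # index of the largest-magnitude entry in each column of U
    idx = np.argmax(np.abs(U), axis=0)
    signs = np.sign(U[idx, np.arange(U.shape[1])])
    signs[signs == 0] = 1.0          # guard against exact zeros
    U = U * signs                    # flip columns of U
    Vt = Vt * signs[:, np.newaxis]   # flip the matching rows of Vt
    return U, s, Vt

A = np.random.rand(5, 3)
U, s, Vt = svd(A, full_matrices=False)
U, s, Vt = fix_svd_signs(U, s, Vt)
assert np.allclose(U @ np.diag(s) @ Vt, A)   # reconstruction is unchanged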
I am working with two-dimensional NumPy arrays for Extreme Learning Machines. One of my arrays, H, is random, and I want to compute its pseudoinverse.
If I use scipy.linalg.pinv2 everything runs smoothly. However, if I use scipy.linalg.pinv, problems sometimes arise (30-40% of the time).
The reason I am using pinv2 is that I read (here: http://vene.ro/blog/inverses-pseudoinverses-numerical-issues-speed-symmetry.html ) that pinv2 performs better on "tall" and on "wide" arrays.
The problem is that, if H has a column j of all 1s, pinv(H) has huge coefficients at row j.
This is in turn a problem because, in such cases, np.dot(pinv(H), Y) contains some nan values (Y is an array of small integers).
Now, I am not into linear algebra and numeric computation enough to understand if this is a bug or some precision related property of the two functions. I would like you to answer this question so that, if it's the case, I can file a bug report (honestly, at the moment I would not even know what to write).
I saved the arrays with np.savetxt(fn, a, '%.2e', ';'): please, see https://dl.dropboxusercontent.com/u/48242012/example.tar.gz to find them.
Any help is appreciated. In the provided file, you can see in pinv(H).csv that rows 14, 33, 55, 56 and 99 have huge values, while in pinv2(H) the same rows have more reasonable values.
In short, the two functions implement two different ways to calculate the pseudoinverse matrix:
scipy.linalg.pinv uses least squares, which may be quite compute intensive and take up a lot of memory.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.pinv.html#scipy.linalg.pinv
scipy.linalg.pinv2 uses SVD (singular value decomposition), which should run with a smaller memory footprint in most cases.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.pinv2.html#scipy.linalg.pinv2
numpy.linalg.pinv also implements this method.
As these are two different evaluation methods, the resulting matrices will not be the same. Each method has its own advantages and disadvantages, and it is not always easy to determine which one should be used without deeply understanding the data and what the pseudoinverse will be used for. I'd simply suggest some trial-and-error and use the one which gives you the best results for your classifier.
Note that in some cases these functions cannot converge to a solution, and will then raise a scipy.linalg.LinAlgError. In that case you may try to use the other pinv implementation, which may greatly reduce the number of errors you receive.
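For example, a hedged sketch of that fallback on the pre-1.7.0 API, where pinv (least squares) and pinv2 (SVD) are still distinct:

import numpy as np
from scipy import linalg

H = np.random.rand(500, 100)
try:
    H_pinv = linalg.pinv(H)     # least-squares based pseudoinverse
except linalg.LinAlgError:
    H_pinv = linalg.pinv2(H)    # fall back to the SVD-based implementation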
Starting from SciPy 1.7.0, pinv2 is deprecated, and pinv has also been switched to an SVD-based solution:
DeprecationWarning: scipy.linalg.pinv2 is deprecated since SciPy 1.7.0, use scipy.linalg.pinv instead
That means numpy.linalg.pinv, scipy.linalg.pinv and scipy.linalg.pinv2 now all compute equivalent solutions. They are also roughly equally fast, with SciPy being slightly faster.
import numpy as np
import scipy.linalg  # import the submodule explicitly

arr = np.random.rand(1000, 2000)
res1 = np.linalg.pinv(arr)
res2 = scipy.linalg.pinv(arr)
res3 = scipy.linalg.pinv2(arr)   # emits a DeprecationWarning on SciPy >= 1.7.0

# all three SVD-based results agree to at least 10 decimal places
np.testing.assert_array_almost_equal(res1, res2, decimal=10)
np.testing.assert_array_almost_equal(res1, res3, decimal=10)
I have some data in a pandas dataframe (although pandas is not the point of this question). As an experiment I made column ZR as column Z divided by column R. As a first step using scikit learn I wanted to see if I could predict ZR from the other columns (which should be possible as I just made it from R and Z). My steps have been.
import numpy as np
from sklearn import preprocessing, linear_model

# 'results' is the existing DataFrame; ZR was constructed as Z / R
columns = ['R', 'T', 'V', 'X', 'Z']
for c in columns:
    results[c] = preprocessing.scale(results[c])
results['ZR'] = preprocessing.scale(results['ZR'])

labels = results['ZR'].values
features = results[columns].values
# print(labels)
# print(features)

regr = linear_model.LinearRegression()
regr.fit(features, labels)
print(regr.coef_)
print(np.mean((regr.predict(features) - labels) ** 2))
This gives
[ 0.36472515 -0.79579885 -0.16316067 0.67995378 0.59256197]
0.458552051342
I think the preprocessing is wrong, as it destroys the Z/R relationship. What's the right way to preprocess in this situation?
Is there some way to get near 100% accuracy? Linear regression is the wrong tool, as the relationship is non-linear.
The five features are highly correlated in my data. Is non-negative least squares implemented in scikit-learn? (I can see it mentioned in the mailing list but not in the docs.) My aim would be to get as many coefficients set to zero as possible.
You should easily be able to get a decent fit using random forest regression, without any preprocessing, since it is a nonlinear method:
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(n_estimators=10, max_features=2)
model.fit(features, labels)
You can play with the parameters to get better performance.
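As a quick sanity check, you can mirror the in-sample MSE printed in the question (features and labels are the arrays built there):

import numpy as np

print(np.mean((model.predict(features) - labels) ** 2))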
The solution is not as easy as that and can be heavily influenced by your data.
If your variables R and Z are bounded (for example 0 < R < 1 and -3 < Z < 2), then you should be able to get a good estimate of the output variable using a neural network.
Using a neural network, you should be able to estimate your output even without preprocessing the data and using all the variables as input.
(Of course here you will have to solve a minimization problem.)
sklearn does not implement neural networks, so you should use pybrain or fann.
If you want to preprocess the data in order to make the minimization problem easier, you can try to extract the right features from the predictor matrix.
I do not think there are a lot of tools for non-linear feature selection. I would try to estimate the important variables from your dataset using, in this order:
1. Lasso (sketched below)
2. Sparse PCA
3. Decision trees (you can actually use them for feature selection), but I would avoid this as much as possible
If this is a toy problem, I would suggest you move towards something more standard.
You can find a lot of examples on Google.
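To illustrate the Lasso step from the list above, here is a minimal sketch; features and labels are the arrays from the question, and the alpha value is a made-up placeholder to tune:

from sklearn.linear_model import Lasso

# L1 regularization drives some coefficients to exactly zero, which matches
# the goal of getting as many coefficients set to zero as possible
lasso = Lasso(alpha=0.1)
lasso.fit(features, labels)
print(lasso.coef_)
print((lasso.coef_ == 0).sum(), "coefficients are exactly zero")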
I have a function in Python:
def f(x):
    # Actually more than this; there is no analytical expression.
    return x[0] ** 3 + x[1] ** 2 + 7
It's a scalar valued function of a vector.
How can I approximate the Jacobian and Hessian of this function in numpy or scipy numerically?
(Updated in late 2017 because there's been a lot of updates in this space.)
Your best bet is probably automatic differentiation. There are now many packages for this, because it's the standard approach in deep learning:
Autograd works transparently with most numpy code. It's pure-Python, requires almost no code changes for typical functions, and is reasonably fast.
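For example, a minimal sketch with Autograd for the toy function above (for a scalar-valued function, the Jacobian is just the gradient):

import autograd.numpy as np
from autograd import grad, hessian

def f(x):
    return x[0] ** 3 + x[1] ** 2 + 7

gradient_f = grad(f)
hessian_f = hessian(f)

x = np.array([1.0, 2.0])
print(gradient_f(x))   # [3., 4.]
print(hessian_f(x))    # [[6., 0.], [0., 2.]]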
There are many deep-learning-oriented libraries that can do this.
Some of the most popular are TensorFlow, PyTorch, Theano, Chainer, and MXNet. Each will require you to rewrite your function in their kind-of-like-numpy-but-needlessly-different API, and in return will give you GPU support and a bunch of deep learning-oriented features that you may or may not care about.
FuncDesigner is an older package I haven't used whose website is currently down.
Another option is to approximate it with finite differences, basically just evaluating (f(x + eps) - f(x - eps)) / (2 * eps) (but obviously with more effort put into it than that). This will probably be slower and less accurate than the other approaches, especially in moderately high dimensions, but is fully general and requires no code changes. numdifftools seems to be the standard Python package for this.
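For example, a minimal finite-difference sketch with numdifftools for the same toy function:

import numpy as np
import numdifftools as nd

def f(x):
    return x[0] ** 3 + x[1] ** 2 + 7

x = np.array([1.0, 2.0])
print(nd.Gradient(f)(x))   # approximately [3., 4.]
print(nd.Hessian(f)(x))    # approximately [[6., 0.], [0., 2.]]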
You could also attempt to find fully symbolic derivatives with SymPy, but this will be a relatively manual process.
Restricted to just SciPy, the most convenient way I found was scipy.misc.derivative, within the appropriate loops, with lambdas to curry the function of interest.
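For example, a minimal sketch of that approach (scipy.misc.derivative handles only one variable at a time, hence the currying; note it has been deprecated in recent SciPy releases):

from scipy.misc import derivative

def f(x):
    return x[0] ** 3 + x[1] ** 2 + 7

def partial_derivative(func, var, point):
    # curry func down to a function of the single coordinate 'var'
    args = list(point)
    def along_axis(v):
        args[var] = v
        return func(args)
    return derivative(along_axis, point[var], dx=1e-6)

x = [1.0, 2.0]
print([partial_derivative(f, i, x) for i in range(len(x))])  # approximately [3.0, 4.0]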