In Wolfram Alpha, if I enter the following function:
excessMachinesTotalPrice[p_, y_, d_, f_, w_]:= (1-d)*p/y*Total[excessMachinesDistribution[y, f, w]]
It is interpreted as:
excessMachinesTotalPrice[p_, y_, d_, f_, w_] == (1 - d)*(p/y)* Sum[excessMachinesDistribution[y, f, w], f]
This is wrong: Total has been replaced by Sum.
If I change the y argument to I (capital i):
excessMachinesTotalPrice[p_, I_, d_, f_, w_]:= (1-d)*p/I*Total[excessMachinesDistribution[I, f, w]]
I get the right outcome (the input is interpreted as specified).
Why is this?
I'd like to fit experimental data to a model and extract the optimal model parameters, i.e. the parameters that minimize the error between the model function and the experimental data. To find them, I'd like to use a gradient descent method, tensorflow, Bayesian inference, basinhopping, or something else that deals well with bad initial estimates and is robust. To speed things up, I'd like to supply the analytical gradient, for example to basinhopping. How do I do that with the basinhopping routine from scipy? In the following example code, I have some example function and I'd like to use the analytical Jacobian instead of the numerical one, but I get an error. Do I have to sum up the Jacobian components?
Example code (my actual function is much more complex)
import random
import matplotlib.pyplot as plt
import numpy as np
# symbolic math
from sympy import lambdify, symbols, cos
from sympy.tensor.array import derive_by_array
# fitting
from scipy.optimize import basinhopping
# symbolic math with sympy ---
s_lst = x, a, b, c, d = symbols('x, a, b, c, d', positive=True)
# mathematical function
y = a*x + cos(b*x)**2 * c*x**2 + d
# jacobian (derivatives with respect to the model parameters)
params = s_lst[1:]
jac_y = derive_by_array(y, params)
# translate sympy expression to python function
# function
get_y = lambdify(s_lst, y)
# jacobian (derivatives with respect to a, b, c, d)
get_jac_y = [lambdify(s_lst, element) for element in jac_y]
#print(len(get_jac_y))
# data ---
x = np.linspace(0, 1, 500)
# measurement data
a = [random.randrange(4, 6, 1) for i in range(len(x))]
b = [random.randrange(3190, 3290, 1) for i in range(len(x))]
c = [random.randrange(90, 109, 1) for i in range(len(x))]
d = [0.1*random.randrange(0, 2, 1) for i in range(len(x))]
y_measured = get_y(x, a, b, c, d)
# exemplary model data
a, b, c, d = 5, 3200, 100, 1
y_model = get_y(x, a, b, c, d)
# plot
plt.plot(x, y_measured)
plt.plot(x, y_model)
plt.title('exemplary model and measured data')
plt.show()
# functions for fitting
def func(params, args1, args2=None):
    a, b, c, d = params
    y = get_y(args1, a, b, c, d)
    if args2 is None:
        return y
    return np.sum((y - args2)**2)
# derivatives
def dfunc(params, args1, args2):
    a, b, c, d = params
    jac = [jac(args1, a, b, c, d) for jac in get_jac_y]
    # because the derivative with respect to d is one (a scalar), replace it with an array of ones
    jac[-1] = np.ones(len(args1))
    return np.asarray(jac)
# function and derivatives
def objective_func(params, args1, args2):
    f = func(params, args1, args2)
    df = dfunc(params, args1, args2)
    return f, df
# fit with basinhopping and scipy ---
# initial model parameters
x0 = [1, 2, 33, 4]
# minimization with numerical jacobian, gives a result
minimizer_kwargs = {"args":(x, y_measured), 'method':'L-BFGS-B'}
ret = basinhopping(func, x0, minimizer_kwargs=minimizer_kwargs)
# minimization with analytical jacobian, fails,
# error: failed in converting 7th argument `g' of _lbfgsb.setulb to C/Fortran array
minimizer_kwargs = {"args":(x, y_measured), 'method':'L-BFGS-B', 'jac':True}
ret = basinhopping(objective_func, x0, minimizer_kwargs=minimizer_kwargs)
If I put something like return [np.sum(j) for j in jac] in dfunc, the program runs but the fit fails. What would be the correct expression?
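For what it's worth, here is a sketch of what I think a summed, chain-rule Jacobian of the scalar squared-error objective could look like (dfunc_sum and the np.broadcast_to call are my own untested additions; get_y and get_jac_y are defined above):
# sketch: dF/dp = sum(2*(y - data)*dy/dp) for F = np.sum((y - data)**2)
def dfunc_sum(params, args1, args2):
    a, b, c, d = params
    residual = get_y(args1, a, b, c, d) - args2
    grads = []
    for jac_element in get_jac_y:
        # broadcast handles the constant derivative with respect to d
        dy_dp = np.broadcast_to(jac_element(args1, a, b, c, d), args1.shape)
        grads.append(np.sum(2.0 * residual * dy_dp))
    return np.asarray(grads)
Would passing this as 'jac': dfunc_sum in minimizer_kwargs (instead of 'jac': True) be the correct way to use it?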
I want to verify that v will give me the same result if I follow the equation (A - λI)x = 0; however, the results are all zeros. Any idea why this happens? Is there something wrong with my implementation of the equation?
def all_eigenvectors(Z, L):
    eigenvector = []
    for eigenvalue in L:
        eigenvector.append(np.linalg.solve(Z - eigenvalue * np.identity(20), [[0] for _ in range(20)]))
    return eigenvector
print(all_eigenvectors(Z, L))
w, v = np.linalg.eig(Z)
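For comparison, here is a sketch of an alternative I am considering (eigenvector_for is my own name): my understanding is that solving (Z - eigenvalue*I)x = 0 with np.linalg.solve can only return the trivial solution x = 0, so instead I would take the null-space direction from the SVD of that matrix:
# sketch: nontrivial null-space vector of (Z - eigenvalue*I) via SVD
def eigenvector_for(Z, eigenvalue):
    M = Z - eigenvalue * np.identity(Z.shape[0])
    _, _, vh = np.linalg.svd(M)
    return vh[-1]  # right singular vector for the smallest singular value
This should match the columns of v from np.linalg.eig(Z) up to sign and scale, if I am not mistaken.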
The tf.svd() function runs slowly, so I replaced it with np.linalg.svd() via tf.py_func(). However, its gradient is None.
For example, for a matrix A, the decomposition is:
A = U*S*T
To compute U, S, and T, we can use tf.svd() in TensorFlow, which means:
S = f(A)
U = f(A)
T = f(A)
In TensorFlow, we can compute S, U, T by:
S, U, T = tf.svd(A)
and get the gradient:
gradient = tf.gradients(S,[A])
However, when we replace tf.svd() with np.linalg.svd() via tf.py_func() (because tf.svd() runs very slowly), tf.gradients(S, [A]) returns None.
So how can we compute tf.gradients(S, [A]) if we replace tf.svd() with np.linalg.svd()?
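One direction I am considering is the following untested sketch (TF1-style API; svd_values is my own name): wrap np.linalg.svd in tf.py_func and attach the singular-value gradient by hand with tf.custom_gradient, using the fact that the derivative of the k-th singular value with respect to A is u_k v_k^T:
import numpy as np
import tensorflow as tf

# untested sketch: expose only the singular values S and hand-write the
# gradient dL/dA = U * diag(dL/dS) * V^T
@tf.custom_gradient
def svd_values(a):
    def np_svd(x):
        u, s, vt = np.linalg.svd(x, full_matrices=False)
        return s.astype(x.dtype), u.astype(x.dtype), vt.astype(x.dtype)
    s, u, vt = tf.py_func(np_svd, [a], [a.dtype, a.dtype, a.dtype])
    def grad(ds):
        return tf.matmul(u, tf.matmul(tf.matrix_diag(ds), vt))
    return s, grad

# intended usage:
# S = svd_values(A)
# gradient = tf.gradients(S, [A])
Is this the right way to get a usable gradient through tf.py_func, or is there a simpler way?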
I have a function f that is internally using some tf.while_loops and tf.gradients to compute the value y = f(x). Something like this:
def f( x ):
    ...
    def body( g, x ):
        # Compute the gradient here
        grad = tf.gradients( g, x )[0]
        ...
        return ...
    return tf.while_loop( cond, body, parallel_iterations=1 )
There are a few hundred lines of code, but I believe those are the important points.
Now when I evaluate f(x), I get exactly the value I expect:
y = known output of f(x)
with tf.Session() as sess:
    fx = f(x)
    print("Error = ", y - sess.run(fx, feed_dict)) # Prints 0
However, when I try to evaluate the gradient of f(x) with respect to x, that is,
grads = tf.gradients( fx, x )[0]
I get the error
AssertionError: gradients list should have been aggregated by now.
Here is the full trace:
File "C:/Dropbox/bob/tester.py", line 174, in <module>
grads = tf.gradients(y, x)[0]
File "C:\Anaconda36\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 649, in gradients
return [_GetGrad(grads, x) for x in xs]
File "C:\Anaconda36\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 649, in <listcomp>
return [_GetGrad(grads, x) for x in xs]
File "C:\Anaconda36\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 727, in _GetGrad
"gradients list should have been aggregated by now.")
AssertionError: gradients list should have been aggregated by now.
Could somebody please outline likely causes for this error? I have no idea where to even start looking for the issue...
Some observations:
Note that I have set parallel_iterations for the while loop to 1. This should mean that there are no errors due to reading and writing from multiple threads.
If I discard the while loop and just have f return body(), then the code runs:
# The following does not crash, but we removed the while_loop, so the output is incorrect
def f( x ):
    ...
    def body( g, x ):
        # Compute the gradient here
        grad = tf.gradients( g, x )[0]
        ...
        return ...
    return body(...)
Obviously, the output is incorrect, but at least the gradients are computed.
I came across a similar issue. Some patterns I noted:
If the x used in tf.gradients was used in a manner that required dimension broadcasting in body, I got this error. If I changed it to one that didn't require broadcasting, tf.gradients returned [None]. I didn't test this extensively, so this pattern may not be consistent across all examples.
Both cases (returning [None] and raising this assertion error) can be resolved by differentiating tf.identity(y) rather than just y: grads = tf.gradients(tf.identity(y), xs). I have absolutely no idea why this works.
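As a concrete illustration of that workaround, applied to the setup in the question (untested; fx and x are the names used there):
fx = f(x)                                  # f uses tf.while_loop internally
grads = tf.gradients(tf.identity(fx), [x]) # differentiate an identity copy of the output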
I have the following code:
dotp = np.dot(X[i], w)
mult = -Y[i] * dotp
lhs = Y[i] * X[i]
rhs = logistic(mult)
s += lhs * rhs
And it throws me the following error (truncated for brevity):
File "/Users/leonsas/Projects/temp/learners/learners.py", line 26, in log_likelihood_grad
s += lhs * rhs
File "/usr/local/lib/python2.7/site-packages/numpy/matrixlib/defmatrix.py", line 341, in __mul__
return N.dot(self, asmatrix(other))
ValueError: matrices are not aligned
I was expecting lhs to be a column vector and rhs to be a scalar and so that operation should work.
To debug, I printed out the dimensions:
print "lhs", np.shape(lhs)
print "rhs", rhs, np.shape(rhs)
Which outputs:
lhs (1, 18209)
rhs [[ 0.5]] (1, 1)
So it seems that they are compatible for multiplication. Any thoughts as to what I am doing wrong?
EDIT: More information on what I'm trying to do.
This code is meant to implement the gradient of the log-likelihood for estimating coefficients, s = sum_i Y[i]*X[i]*logistic(-Y[i]*z_i) - C*w, where z_i is the dot product of the weights with the x values for sample i.
My attempt at implementing this:
def log_likelihood_grad(X, Y, w, C=0.1):
    K = len(w)
    N = len(X)
    s = np.zeros(K)
    for i in range(N):
        dotp = np.dot(X[i], w)
        mult = -Y[i] * dotp
        lhs = Y[i] * X[i]
        rhs = logistic(mult)
        s += lhs * rhs
    s -= C * w
    return s
You have a matrix lhs of shape (1, 18209) and rhs of shape (1, 1) and you are trying to multiply them. Since they're of matrix type (as it seems from the stack trace), the * operator translates to dot. Matrix product is defined only for the cases where the number of columns in the first matrix and the number of rows in the second one are equal, and in your case they're not (18209 and 1). Hence the error.
How to fix it: check the maths behind the code and fix the formula. Perhaps you forgot to transpose the first matrix or something like that.
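For instance, if the intent really is to scale lhs by the scalar sitting inside the 1x1 matrix rhs (my reading of the question, not a verified fix), the update could be made elementwise like this:
# pull the scalar out of the 1x1 matrix and drop the matrix type
s += np.asarray(lhs).ravel() * rhs[0, 0]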
Vector shapes in the NumPy library look like (3,). When you try to multiply two such vectors with the np.dot(a, b) function, it gives a dimension error; the np.outer(a, b) function should be used at that point.
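A small illustration of np.outer on two (3,)-shaped vectors:
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
print(np.outer(a, b))  # (3, 3) matrix with entries a[i]*b[j]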