how to calculate the gradient in python numpy - numpy

i have to implement the Stochastic Gradient Descent in Numpy. So I've to define the gradient of this function E:
In which also f and g are defined in the image.
I've no idea of how to do this, I tried with Sympy and numdifftools but these libraries give me some errors.
How could I write the gradient of the function E?
Thank you

you mean this?
import numpy as np
# G function
def g(x):
return np.tanh(x/2)
# F function
def f(x, N, n, v, g):
sumf = 0
for j in range(1, N):
sumi = 0
for i in range(1, n):
sumi += w[j, i]*x[i] - b[j]
sumf += v[j]*g(sumi)
return sumf

Related

How do I vectorize a function in numpy with some fixed parameters?

I have written a code for approximating a function with the Bernstein polynomials ( https://en.wikipedia.org/wiki/Bernstein_polynomial )
at
https://github.com/pdenapo/metodos-numericos/blob/master/python/bernstein.py
I have a function that gives the polynomial approximating f as bernstein(f, n, p) (where f is the function that I want to approximate, n is the degree and p the point where it is evaluated.
def bernstein(f, n, p):
return np.sum(
[f(k / n) * st.binom.pmf(k, n, p) for k in np.arange(0, n + 1)])
Now I want to generate a plot of this function where f and n es fixed, and p runs though a vector generated by np.arrange
So I am vectorizing the function in the following way:
bernstein3 = lambda x: bernstein(f, 3, x)
bernstein3 = np.vectorize(bernstein3)
y3 = bernstein3(x)
plt.plot(x, y3, 'green', label='$B_3$')
It works. But I guess there must be some more elegant, or perhaps more pythonic way of doing this. Any suggestions? Many thanks
Since SciPy statistic functions are vectorized, your bernstein function can be modified in a straightforward manner to work that way:
import numpy as np
import scipy.stats
def bernstein(f, n, p):
# Vector of k values
k = np.arange(n + 1)
# Add a broadcasting dimension to p
pd = np.expand_dims(p, -1)
# Compute approximation
return np.sum(f(k / n) * scipy.stats.binom.pmf(k, n, pd), -1)
It would be used simply as this:
import numpy as np
import matplotlib.pyplot as plt
def f(x):
return np.abs(1 / 2 - x)
x = np.linspace(0, 1, 100)
y = f(x)
plt.plot(x, y, 'blue', label='f(x)')
y_approx = bernstein(f, 10, x)
plt.plot(x, y_approx, 'orange', label='f_approx(x)')
plt.show()

Levi-Civita tensor in numpy

I am looking for compact numpy code to produce the Levi-Civita tensor in any user-selected number of dimensions. Any ideas?
From the sympy tensor functions:
In [13]: tensor_functions.eval_levicivita(x,y,z)
Out[13]:
(-x + y)⋅(-x + z)⋅(-y + z)
──────────────────────────
2
def eval_levicivita(*args):
"""Evaluate Levi-Civita symbol."""
from sympy import factorial
n = len(args)
return prod(
prod(args[j] - args[i] for j in range(i + 1, n))
/ factorial(i) for i in range(n))
File: /usr/local/lib/python3.6/dist-packages/sympy/functions/special/tensor_functions.py
Type: function
For a reasonable number of dimensions the tensor size isn't that big, so I wouldn't worry about efficiency. For a start I'd try an iterative solution; it doesn't need to be fancy.
Using itertools
import numpy as np
import itertools
def levi_cevita_tensor(dim):
arr=np.zeros(tuple([dim for _ in range(dim)]))
for x in itertools.permutations(tuple(range(dim))):
mat = np.zeros((dim, dim), dtype=np.int32)
for i, j in zip(range(dim), x):
mat[i, j] = 1
arr[x]=int(np.linalg.det(mat))
return arr
https://en.wikipedia.org/wiki/Levi-Civita_symbol#Product

How and why is FFT convolution faster than direct convolution?

I read about convolutions being faster when computed into the frequency domain because it's "just" a matrix multiplication (in 2D), while in the time domain it's a lot of small matrix multiplication.
So I made this code we can see that FFT convolution is more complex than "normal" convolution.
It's clear that something is wrong in my assumptions.
What is wrong ?
from sympy import exp, log, symbols, init_printing, lambdify
init_printing(use_latex='matplotlib')
import numpy as np
import matplotlib.pyplot as plt
def _complex_mult(n):
"""Complexity of a MatMul of a 2 matrices of size (n, n)"""
# see https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm
return n**2.5
def _complex_fft(n):
"""Complexity of fft and ifft"""
# see https://en.wikipedia.org/wiki/Fast_Fourier_transform
return n*log(n)
def fft_mult_fft(n, m):
"""Complexity of a convolution in the freq space.
fft -> mult between M and kernel -> ifft
"""
return _complex_fft(n) * 2 + _complex_mult(n)
def conv(n, m):
"""Complexity of a convolution in the time space.
for every n of M, we execute a MatMul of 2 (m, m) matrices
"""
return n*_complex_mult(m)
n = symbols('n') # size of M = (n, n)
m = symbols('m') # size of kernel = (m, m)
M = np.linspace(1, 1e3+1, 1e1)
kernel_size = np.linspace(2, 7, 7-2+1)**2
fft = fft_mult_fft(n, m)
discrete = conv(n, m)
f1 = lambdify(n, fft, 'numpy')
f2 = lambdify([n, m], discrete, 'numpy')
fig, ax = plt.subplots(1, len(kernel_size), figsize=(30, 10))
f1_computed = f1(M) # independant wrt m, do not compute it at each time
for i, size in enumerate(kernel_size):
ax[i].plot(M, f1_computed, c='red', label='freq domain (fft)')
ax[i].plot(M, f2(M, size), c='blue', label='time domain (normal)')
ax[i].legend(loc='upper left')
ax[i].set_title("kernel size = {}".format(size))
ax[i].set_xlabel("Matrix size")
ax[i].set_ylabel("Complexity")
And here is the output: (click to zoom)
You are experiencing two well-known facts:
for small kernel sizes, the spatial approach is faster,
for large kernel sizes, the frequency approach can be faster.
Your kernels and images are relatively too small to observe the benefits of the FFT.
As #user545424 pointed out, the problem was that I was computing n*complexity(MatMul(kernel)) instead of n²*complexity(MatMul(kernel)) for a "normal" convolution.
I finally get this: (where n is the size of the input and m the size of the kernel)
Here is the final code and the new charts.
from sympy import exp, log, symbols, init_printing, lambdify
init_printing(use_latex='matplotlib')
import numpy as np
import matplotlib.pyplot as plt
def _complex_mult(n):
"""Complexity of a MatMul of a 2 matrices of size (n, n)"""
# see https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm
return n**2.5
def _complex_fft(n):
"""Complexity of fft and ifft"""
# see https://stackoverflow.com/questions/6514861/computational-complexity-of-the-fft-in-n-dimensions#comment37078975_6516856
return 4*(n**2)*log(n)
def fft_mult_fft(n, m):
"""Complexity of a convolution in the freq space.
fft -> mult between M and kernel -> ifft
"""
return _complex_fft(n) * 2 + _complex_mult(n)
def conv(n, m):
"""Complexity of a convolution in the time space.
for every n*n cell of M, we execute a MatMul of 2 (m, m) matrices
"""
return n*n*_complex_mult(m)
n = symbols('n') # size of M = (n, n)
m = symbols('m') # size of kernel = (m, m)
M = np.linspace(1, 1e3+1, 1e1)
kernel_size = np.linspace(2, 7, 7-2+1)**2
fft_symb = fft_mult_fft(n, m)
discrete_symb = conv(n, m)
fft_func = lambdify(n, fft_symb, 'numpy')
dicrete_func = lambdify([n, m], discrete_symb, 'numpy')
fig, ax = plt.subplots(1, len(kernel_size), figsize=(30, 10))
fig.patch.set_facecolor('grey')
for i, size in enumerate(kernel_size):
ax[i].plot(M, fft_func(M), c='red', label='freq domain (fft)')
ax[i].plot(M, dicrete_func(M, size), c='blue', label='time domain (normal)')
ax[i].legend(loc='upper left')
ax[i].set_title("kernel size = {}".format(size))
ax[i].set_xlabel("Matrix size")
ax[i].set_ylabel("Complexity")

Implementing minimization in SciPy

I am trying to implement the 'Iterative hessian Sketch' algorithm from https://arxiv.org/abs/1411.0347 page 12. However, I am struggling with step two which needs to minimize the matrix-vector function.
Imports and basic data generating function
import numpy as np
import scipy as sp
from sklearn.datasets import make_regression
from scipy.optimize import minimize
import matplotlib.pyplot as plt
%matplotlib inline
from numpy.linalg import norm
def generate_data(nsamples, nfeatures, variance=1):
'''Generates a data matrix of size (nsamples, nfeatures)
which defines a linear relationship on the variables.'''
X, y = make_regression(n_samples=nsamples, n_features=nfeatures,\
n_informative=nfeatures,noise=variance)
X[:,0] = np.ones(shape=(nsamples)) # add bias terms
return X, y
To minimize the matrix-vector function, I have tried implementing a function which computes the quanity I would like to minimise:
def f2min(x, data, target, offset):
A = data
S = np.eye(A.shape[0])
#S = gaussian_sketch(nrows=A.shape[0]//2, ncols=A.shape[0] )
y = target
xt = np.ravel(offset)
norm_val = (1/2*S.shape[0])*norm(S#A#(x-xt))**2
#inner_prod = (y - A#xt).T#A#x
return norm_val - inner_prod
I would eventually like to replace S with some random matrices which can reduce the dimensionality of the problem, however, first I need to be confident that this optimisation method is working.
def grad_f2min(x, data, target, offset):
A = data
y = target
S = np.eye(A.shape[0])
xt = np.ravel(offset)
S_A = S#A
grad = (1/S.shape[0])*S_A.T#S_A#(x-xt) - A.T#(y-A#xt)
return grad
x0 = np.zeros((X.shape[0],1))
xt = np.zeros((2,1))
x_new = np.zeros((2,1))
for it in range(1):
result = minimize(f2min, x0=xt,args=(X,y,x_new),
method='CG', jac=False )
print(result)
x_new = result.x
I don't think that this loop is correct at all because at the very least there should be some local convergence before moving on to the next step. The output is:
fun: 0.0
jac: array([ 0.00745058, 0.00774882])
message: 'Desired error not necessarily achieved due to precision loss.'
nfev: 416
nit: 0
njev: 101
status: 2
success: False
x: array([ 0., 0.])
Does anyone have an idea if:
(1) Why I'm not achieving convergence at each step
(2) I can implement step 2 in a better way?

output[0] = y0 ValueError: setting an array element with a sequence

I have been struggling for a few days with this. I'm trying t estimate the density of a piecewise gaussian function. Could anyone tell me why I'm now getting the error
TypeError: output[0] = y0
ValueError: setting an array element with a sequence.
It happens on this line:
Zero_RG = integrate.romberg(gaussian(q,x,mu,sigma), Q1, Q2).`
Here is the script:
import numpy as np
import sympy as sp
from sympy import *
from scipy import integrate
from sympy import Integral, log, exp, sqrt, pi
import matplotlib.pyplot as plt
from scipy.stats import norm, gaussian_kde
from quantecon import LAE
from sympy import symbols
var('Q1 Q2 x q sigma mu')
#q= symbols('q')
## == Define parameters == #
mu=80
sigma=20
b=0.2
Q=80
Q1=Q*(1-b)
Q2=Q*(1+b)
d = (sigma*np.sqrt(2*np.pi))
phi = norm()
n = 500
def p(x, y):
x, y = np.array(x, dtype=float), np.array(y, dtype=float)
Positive_RG = norm.pdf(x-y+Q1, mu, sigma)
print('Positive_R = ', Positive_RG)
Negative_RG = norm.pdf(x-y+Q2, mu, sigma)
print('Negative_RG = ', Negative_RG)
gaussian = lambda q,x,mu,sigma: 1/(sigma*np.sqrt(2*np.pi))*np.exp(-(x+q-mu)**2 /(2*sigma**2))
wrapped_gaussian = lambda q: gaussian(q, x, mu, sigma)
Zero_RG = integrate.romberg(wrapped_gaussian, Q1, Q2)
print('pdf',gaussian)
#Zero_RG = scipy.integrate.quad(norm.pdf(x + q, mu, sigma))
# Int_zerocase= lambda q: norm.pdf(x + q, u, sigma)
# Zero_RG = scipy.integrate.quad(Int_zerocase, Q1, Q2)
# print(Zero_RG)
if y>0.0 and x -y>=-Q1:
#print('printA', Positive_RG)
return Positive_RG
elif y<0.0 and x -y>=-Q2:
#print('printC', Negative_RG)
return Negative_RG
elif y==0.0 and x >=-Q1:
#print('printB', Zero_RG)
return Zero_RG
return 0.0
Z = phi.rvs(n)
X = np.empty(n)
for t in range(n-1):
X[t+1] = X[t] + Z[t]
#X[t+1] = np.abs(X[t]) + Z[t]
psi_est = LAE(p, X)
k_est = gaussian_kde(X)
fig, ax = plt.subplots(figsize=(10,7))
ys = np.linspace(-200.0, 200.0, 200)
ax.plot(ys, psi_est(ys), 'g-', lw=2, alpha=0.6, label='look ahead estimate')
ax.plot(ys, k_est(ys), 'k-', lw=2, alpha=0.6, label='kernel based estimate')
ax.legend(loc='upper left')
plt.show()
The docs for romberg are pretty clear that the first argument is a function of a single variable that gets integrated.
First, a minor point. Use np.exp in preference to np.e**.
In Python, the expression
gaussian = lambda q,x,mu,sigma: 1/(sigma*np.sqrt(2*np.pi))*np.exp(-(x+q-mu)**2 /(2*sigma**2))
sets gaussian to a function of four arguments. The expression gaussian(q, x, mu, sigma) is just the return value of that function.
There are two ways to pass in the required parameters to the romberg. The easiest way is to use the args parameter to pass in the three additional arguments as a tuple:
Zero_RG = integrate.romberg(gaussian, Q1, Q2, args=(x,mu,sigma))
The other way is to create a wrapper function that will pass the arguments for you:
wrapped_gaussian = lambda q: gaussian(q, x, mu, sigma)
Zero_RG = integrate.romberg(wrapped_gaussian, Q1, Q2)
I would recommend the fist approach because it uses an existing mechanism, as well as being shorter and easier to read.