In sympy 0.7.6, I had no trouble with the following code for both the modules='sympy' and the modules='numpy' options. Now with sympy v1.0, the evaluation with modules='numpy' raises a ZeroDivisionError:
import sympy
x, y = sympy.symbols(['x', 'y'])
expr = sympy.Piecewise((1/x, y < -1), (x, y <= 1), (1/x, True))
f_sympy = sympy.lambdify([x, y], expr, modules='sympy')
f_numpy = sympy.lambdify([x, y], expr, modules='numpy')
print(f_sympy(0, 1)) # performs well
print(f_numpy(0, 1)) # issue: ZeroDivisionError
It seems that with modules='numpy' the branches of the piecewise function are evaluated before the conditions are checked.
My questions are:
Is this behavior normal?
If so, why? And how can I define a piecewise expression and evaluate it as fast as with the numpy module, without the sympy.lambdify procedure?
EDIT:
Found that in my case the solution is theano:
import sympy
x, y = sympy.symbols(['x', 'y'])
f = sympy.Piecewise((1/x, y < -1), (x, y <= 1), (1/x, True))
from sympy.printing.theanocode import theano_function
f_theano = theano_function([x, y], [f])
print(f_theano(0, 1)) # OK, returns 0
I deleted my other answer (in case you already saw it). There is a much simpler solution.
The ZeroDivisionError comes because the lambdified expression produces, roughly, lambda x, y: select([less(y, -1),less_equal(y, 1),True], [1/x,x,1/x], default=nan). The problem is that passing in x = 0 results in 1/0 being evaluated by Python, which raises the error.
But NumPy is just fine with dividing by zero. It will issue a warning, but otherwise works fine (it gives inf), and in this example there is no problem, because the inf is not actually used.
So the solution is to wrap the input to lambdify as numpy arrays, that is, instead of
f_numpy(0, 1)
use
f_numpy(array(0), array(1))
There is a SymPy issue discussing this if you are interested.
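To see the mechanism in isolation, here is a minimal sketch in plain NumPy, with no SymPy involved. The select call is written by hand to mirror the expression described above; the sample values and the names conds, choices and scalar_raised are assumptions made for illustration only.

```python
import numpy as np

# Python scalars: 1/0 is evaluated eagerly and raises before any
# branch selection can happen.
try:
    1 / 0
    scalar_raised = False
except ZeroDivisionError:
    scalar_raised = True

# NumPy arrays: dividing by zero gives inf (normally with a RuntimeWarning)
# instead of raising, so every branch can be computed up front.
x = np.array([0.0, 2.0])
y = np.array([1.0, 3.0])

with np.errstate(divide="ignore"):  # silence the warning for this sketch
    conds = [y < -1, y <= 1, np.ones_like(y, dtype=bool)]
    choices = [1 / x, x, 1 / x]
    # The first condition that holds picks the value; the unused inf is dropped.
    result = np.select(conds, choices, default=np.nan)

print(scalar_raised)
print(result)
```

All branches are still computed for every element; the inf from 1/x is simply never selected where y <= 1 holds.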
I am working on a problem in which a matrix has to be mean-var normalized row-wise. It is also required that the normalization is applied after splitting each row into tiny batches.
The code seems to work with NumPy, but fails with PyTorch (which is required for training).
It seems the PyTorch and NumPy results differ. Any help will be greatly appreciated.
Example code:
import numpy as np
import torch
def normalize(x, bsize, eps=1e-6):
    nc = x.shape[1]
    if nc % bsize != 0:
        raise Exception(f'Number of columns must be a multiple of bsize')
    x = x.reshape(-1, bsize)
    m = x.mean(1).reshape(-1, 1)
    s = x.std(1).reshape(-1, 1)
    n = (x - m) / (eps + s)
    n = n.reshape(-1, nc)
    return n
# numpy
a = np.float32(np.random.randn(8, 8))
n1 = normalize(a, 4)
# torch
b = torch.tensor(a)
n2 = normalize(b, 4)
n2 = n2.numpy()
print(abs(n1-n2).max())
In the first example you are calling normalize with a, a numpy.ndarray, while in the second you call normalize with b, a torch.Tensor.
According to the documentation page of torch.std, Bessel’s correction is used by default to measure the standard deviation. As such the default behavior between numpy.ndarray.std and torch.Tensor.std is different.
If unbiased is True, Bessel’s correction will be used. Otherwise, the sample deviation is calculated, without any correction.
torch.std(input, dim, unbiased, keepdim=False, *, out=None) → Tensor
Parameters
input (Tensor) – the input tensor.
unbiased (bool) – whether to use Bessel’s correction (δN = 1).
You can try yourself:
>>> a.std(), b.std(unbiased=True), b.std(unbiased=False)
(0.8364538, tensor(0.8942), tensor(0.8365))
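The size of the discrepancy can be reproduced in NumPy alone, which makes the cause easier to see. This is a small sketch, assuming a random 8x4 input; ddof=1 is NumPy's way of requesting Bessel's correction, and the variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 4)).astype(np.float32)
n = a.shape[1]  # samples per row

std_population = a.std(axis=1)      # NumPy default: ddof=0, divide by N
std_bessel = a.std(axis=1, ddof=1)  # Bessel's correction: divide by N - 1

# The two estimates differ by a constant factor sqrt(N / (N - 1)),
# which is large for tiny batches (about 1.155 for N = 4).
ratio = std_bessel / std_population
print(ratio)
print(np.sqrt(n / (n - 1)))
```

Going by the documentation quoted above, calling x.std(1, unbiased=False) on the tensor should bring the PyTorch result in line with the NumPy default.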
We're trying to implement a piecewise function, basically around 100 polynomials with different coefficients depending on the value of x.
This will be implemented in TensorFlow or jax with JIT and be optimized for arrays of data. The question is what is probably the best way to achieve this?
One could use one hundred where calls, but that is not really optimal. Another option is tf.switch_case with tf.vectorized_map (or similar).
Are there any ideas?
If I understand correctly, I think that jax.lax.switch provides the kind of functionality you're interested in. For example:
import jax.numpy as jnp
from jax import vmap, lax
import matplotlib.pyplot as plt
def f1(x):
    return 0.0 * x
def f2(x):
    return (x - 1.0) ** 2
def f3(x):
    return 2 * x - 3
branches = (f1, f2, f3)
bounds = jnp.array([1, 2]) # boundaries between branches
x = jnp.linspace(0, 3)
index = jnp.searchsorted(bounds, x) # index in branches for each value in x
result = vmap(lambda i, x: lax.switch(i, branches, x))(index, x)
plt.plot(x, result)
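Since every branch in the question is a polynomial, a different approach from lax.switch may also be worth considering: keep the coefficients in one table, gather the row for each x, and evaluate with Horner's rule. The sketch below uses plain NumPy so it runs anywhere; piecewise_poly, bounds and coeffs are hypothetical names, and the three rows reproduce f1, f2 and f3 from the example above.

```python
import numpy as np

def piecewise_poly(x, bounds, coeffs):
    """Evaluate a piecewise polynomial at every element of a 1-D array x.

    bounds: sorted 1-D array of boundaries between pieces, length P - 1.
    coeffs: array of shape (P, D + 1), highest-degree coefficient first.
    """
    index = np.searchsorted(bounds, x)   # piece index for each x
    c = coeffs[index]                    # gathered rows, shape (len(x), D + 1)
    result = c[:, 0]
    for k in range(1, coeffs.shape[1]):  # Horner's rule: loops over the degree only
        result = result * x + c[:, k]
    return result

bounds = np.array([1.0, 2.0])
coeffs = np.array([
    [0.0, 0.0, 0.0],    # 0.0 * x
    [1.0, -2.0, 1.0],   # (x - 1.0) ** 2
    [0.0, 2.0, -3.0],   # 2 * x - 3
])
x = np.array([0.5, 1.5, 2.5])
y = piecewise_poly(x, bounds, coeffs)
print(y)
```

The work per element depends on the polynomial degree rather than on the number of pieces, so going from 3 to 100 pieces only grows the table. The same indexing and arithmetic should carry over to jax.numpy, though that has not been checked here.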
I have an equation dy/dx = x + y/5 and an initial value, y(0) = -3.
I would like to know how to plot the exact graph of this function using pyplot.
I also have a x = np.linspace(0, interval, steps+1) which I would like to use as the x axis. So I'm only looking for the y axis values.
Thanks in advance.
Just for completeness, this kind of equation can easily be integrated numerically, using scipy.integrate.odeint.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
# function dy/dx = x + y/5.
func = lambda y,x : x + y/5.
# Initial condition
y0 = -3 # at x=0
# values at which to compute the solution (needs to start at x=0)
x = np.linspace(0, 4, 101)
# solution
y = odeint(func, y0, x)
# plot the solution, note that y is a column vector
plt.plot(x, y[:,0])
plt.xlabel('x')
plt.ylabel('y')
plt.show()
Given that you need to solve the d.e. you might prefer doing this algebraically, with sympy. (Or you might not.)
Import the module and define the function and the dependent variable.
>>> from sympy import *
>>> f = Function('f')
>>> var('x')
x
Invoke the solver. Note that all terms of the d.e. must be transposed to the left of the equals sign, and that the y must be replaced by the designator for the function.
>>> dsolve(Derivative(f(x),x)-x-f(x)/5)
Eq(f(x), (C1 + 5*(-x - 5)*exp(-x/5))*exp(x/5))
As you would expect, the solution is given in terms of an arbitrary constant. We must solve for that using the initial value. We define it as a sympy variable.
>>> var('C1')
C1
Now we create an expression to represent this arbitrary constant as the left side of an equation that we can solve. We replace f(0) with its value in the initial condition. Then we substitute the value of x in that condition to get an equation in C1.
>>> expr = -3 - ( (C1 + 5*(-x - 5)*exp(-x/5))*exp(x/5) )
>>> expr.subs(x,0)
-C1 + 22
In other words, C1 = 22. Finally, we can use this value to obtain the particular solution of the differential equation.
>>> ((C1 + 5*(-x - 5)*exp(-x/5))*exp(x/5)).subs(C1,22)
((-5*x - 25)*exp(-x/5) + 22)*exp(x/5)
Because I'm absentminded and ever fearful of making egregious mistakes I check that this function satisfies the initial condition.
>>> (((-5*x - 25)*exp(-x/5) + 22)*exp(x/5)).subs(x,0)
-3
(Usually things are incorrect only when I forget to check them. Such is life.)
And I can plot this in sympy too.
>>> plot(((-5*x - 25)*exp(-x/5) + 22)*exp(x/5),(x,-1,5))
<sympy.plotting.plot.Plot object at 0x0000000008C2F780>
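To get y values on the linspace grid from the question, the particular solution can be expanded to 22*exp(x/5) - 5*x - 25 and evaluated with NumPy. A minimal sketch follows; the name exact_solution and the values chosen for interval and steps are assumptions, and the residual is only a sanity check that the closed form satisfies dy/dx = x + y/5.

```python
import numpy as np

def exact_solution(x):
    # ((-5*x - 25)*exp(-x/5) + 22)*exp(x/5), multiplied out
    return 22.0 * np.exp(x / 5.0) - 5.0 * x - 25.0

interval, steps = 4.0, 100  # assumed values, not from the question
x = np.linspace(0, interval, steps + 1)
y = exact_solution(x)

# Derivative of the closed form, compared with the right-hand side x + y/5.
dy_dx = 4.4 * np.exp(x / 5.0) - 5.0
residual = dy_dx - (x + y / 5.0)

print(y[0])
print(np.abs(residual).max())
```

The resulting x and y arrays can then be passed straight to plt.plot(x, y).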
I have a k*n matrix X, and a k*k matrix A. For each column of X, I'd like to calculate the scalar
X[:, i].T.dot(A).dot(X[:, i])
(or, mathematically, Xi' * A * Xi).
Currently, I have a for loop:
out = np.empty((n,))
for i in range(n):
    out[i] = X[:, i].T.dot(A).dot(X[:, i])
but since n is large, I'd like to do this faster if possible (i.e. using some NumPy functions instead of a loop).
This seems to do it nicely:
(X.T.dot(A)*X.T).sum(axis=1)
Edit: This is a little faster. np.einsum('...i,...i->...', X.T.dot(A), X.T). Both work better if X and A are Fortran contiguous.
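A quick check that both expressions agree with the original loop. This is a sketch on small random inputs; the sizes k and n and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 5, 7
X = rng.random((k, n))
A = rng.random((k, k))

# Reference: the loop from the question, one quadratic form per column.
loop = np.array([X[:, i].T.dot(A).dot(X[:, i]) for i in range(n)])

# Row i of X.T.dot(A) is Xi' * A; multiplying elementwise by X.T and
# summing over axis 1 finishes the product with Xi.
vectorized = (X.T.dot(A) * X.T).sum(axis=1)
via_einsum = np.einsum('...i,...i->...', X.T.dot(A), X.T)

print(np.allclose(loop, vectorized), np.allclose(loop, via_einsum))
```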
You can use numpy.einsum:
np.einsum('ji,jk,ki->i',x,a,x)
This will get the same result. Let's see if it is much faster:
Looks like dot is still the fastest option, particularly because it uses threaded BLAS, as opposed to einsum which runs on one core.
import perfplot
import numpy as np
def setup(n):
    k = n
    x = np.random.random((k, n))
    A = np.random.random((k, k))
    return x, A
def loop(data):
    x, A = data
    n = x.shape[1]
    out = np.empty(n)
    for i in range(n):
        out[i] = x[:, i].T.dot(A).dot(x[:, i])
    return out
def einsum(data):
    x, A = data
    return np.einsum('ji,jk,ki->i', x, A, x)
def dot(data):
    x, A = data
    return (x.T.dot(A)*x.T).sum(axis=1)
perfplot.show(
    setup=setup,
    kernels=[loop, einsum, dot],
    n_range=[2**k for k in range(10)],
    logx=True,
    logy=True,
    xlabel='n, k'
)
You can't do it faster unless you parallelize the whole thing: One thread per column. You'll still use loops - you can't get away from that.
Map reduce is a nice way to look at this problem: the map step multiplies, the reduce step sums.
I don't understand why ifft(fft(myFunction)) is not the same as my function. It seems to be the same shape but out by a factor of 2 (ignoring the constant y-offset). All the documentation I can see says there is some normalisation that fft doesn't do, but that ifft should take care of that. Here's some example code below - you can see where I've bodged the factor of 2 to give me the right answer. Thanks for any help - it's driving me nuts.
import numpy as np
import scipy.fftpack as fftp
import matplotlib.pyplot as plt
def fourier_series(x, y, wn, n=None):
    # get FFT
    myfft = fftp.fft(y, n)
    # kill higher freqs above wavenumber wn
    myfft[wn:] = 0
    # make new series
    y2 = fftp.ifft(myfft).real
    # find constant y offset
    myfft[1:]=0
    c = fftp.ifft(myfft)[0]
    # remove c, apply factor of 2 and re apply c
    y2 = (y2-c)*2 + c
    plt.figure(num=None)
    plt.plot(x, y, x, y2)
    plt.show()
if __name__=='__main__':
    x = np.array([float(i) for i in range(0,360)])
    y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
    fourier_series(x, y, 3, 360)
You're removing half the spectrum when you do myfft[wn:] = 0. The negative frequencies are those in the top half of the array and are required.
You have a second fudge to get your results which is taking the real part to find y2: y2 = fftp.ifft(myfft).real (fftp.ifft(myfft) has a non-negligible imaginary part due to the asymmetry in the spectrum).
Fix it with myfft[wn:-wn] = 0 instead of myfft[wn:] = 0, and remove the fudges. So the fixed code looks something like:
import numpy as np
import scipy.fftpack as fftp
import matplotlib.pyplot as plt
def fourier_series(x, y, wn, n=None):
    # get FFT
    myfft = fftp.fft(y, n)
    # kill higher freqs above wavenumber wn
    myfft[wn:-wn] = 0
    # make new series
    y2 = fftp.ifft(myfft)
    plt.figure(num=None)
    plt.plot(x, y, x, y2)
    plt.show()
if __name__=='__main__':
    x = np.array([float(i) for i in range(0,360)])
    y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
    fourier_series(x, y, 3, 360)
It's really worth paying attention to the interim arrays that you are creating when trying to do signal processing. Invariably, there are clues as to what is going wrong that should direct you to the problem. In this case, taking the real part masked the problem and made your task more difficult.
Just to add another quick point: Sometimes taking the real part of the resultant array is exactly the correct thing to do. It's often the case that you end up with an imaginary part to the signal output which is just down to numerical errors in the input to the inverse FFT. Typically this manifests itself as very small imaginary values, so taking the real part is basically the same array.
You are killing the negative frequencies between 0 and -wn.
I think what you mean to do is to set myfft to 0 for all frequencies outside [-wn, wn].
Change the following line:
myfft[wn:] = 0
to:
myfft[wn:-wn] = 0
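The two truncations can be compared side by side without plotting. This sketch uses numpy.fft in place of scipy.fftpack, purely so it has no SciPy dependency; the signal is the one from the question and the variable names are made up.

```python
import numpy as np

n, wn = 360, 3
x = np.arange(n, dtype=float)
y = np.sin(2 * np.pi / n * x) + np.sin(2 * 2 * np.pi / n * x) + 5

# One-sided truncation: the negative frequencies (top of the array) are lost.
one_sided = np.fft.fft(y)
one_sided[wn:] = 0
y_one_sided = np.fft.ifft(one_sided)

# Two-sided truncation: keep the bins at both ends, zero only the middle.
two_sided = np.fft.fft(y)
two_sided[wn:-wn] = 0
y_two_sided = np.fft.ifft(two_sided)

# With the symmetric cut the imaginary part is only rounding noise.
max_imag = np.abs(y_two_sided.imag).max()

print(np.allclose(y_two_sided.real, y), max_imag)
print(np.allclose(y_one_sided.real, (y - 5) / 2 + 5))
```

The one-sided result has exactly half the oscillating amplitude around the offset of 5, which is where the factor of 2 in the question comes from.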