Matplotlib: 'savefig' throws TypeError when 'linewidths' property is set

When the 'linewidths' property is set, calling 'savefig' throws 'TypeError: cannot perform reduce with flexible type'. Here is an MWE:
import numpy as np
import matplotlib.pyplot as plt
# Create sample data.
x = np.arange(-3.0, 3.0, 0.1)
y = np.arange(-2.0, 2.0, 0.1)
X, Y = np.meshgrid(x, y)
Z = 10.0 * (2*X - Y)
# Plot sample data.
plt.contour(X, Y, Z, colors = 'black', linewidths = '1')
plt.savefig('test.pdf')
It is not a problem with the figure rendering (calling 'plt.show()' works fine). If the linewidths property is not set, e.g. changing the second last line to
plt.contour(X, Y, Z, colors = 'black')
'savefig' works as intended. Is this a bug or have I missed something?

This is not a bug: the documentation for plt.contour() specifies that linewidths should be [ None | number | tuple of numbers ], while you provide a number as a string.
Here is my output with your code (I am using matplotlib 1.4.3).
>>> matplotlib.__version__
'1.4.3'
Your code 'works' under Python 2.7, but the linewidths parameter is effectively ignored and the plot looks the same regardless of the value (this was with linewidths='10').
In contrast on Python 3.4 I get the following error:
TypeError: unorderable types: str() > int()
Setting linewidths to an int (or a float) as follows produces the correct output and works on both Python 2.7 and Python 3.4. Again, this is with it set to 10:
plt.contour(X, Y, Z, colors = 'black', linewidths = 10)
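If the width value arrives as a string (for example from a config file), one option is simply to convert it before the call. A minimal sketch, assuming the same sample data as the MWE above and a hypothetical string value width:
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-3.0, 3.0, 0.1)
y = np.arange(-2.0, 2.0, 0.1)
X, Y = np.meshgrid(x, y)
Z = 10.0 * (2*X - Y)

width = '1'  # hypothetical: a line width that arrived as a string
plt.contour(X, Y, Z, colors='black', linewidths=float(width))
plt.savefig('test.pdf')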

Related

How to calculate the maximum value of a curve in matplotlib and attach that value (up to two digits after the decimal) to the curve with an arrow

From a data file in hdf5 format I am plotting three lines using
if nbnd < 3:
    color = f'C{nbnd}'
    ax.plot(x, y, c=color, lw=2.0, alpha=0.8,
            label=lbl[nbnd] if nbnd < 3 and i == 0 else None)
I want to get the maximum value along the Y-axis of the nbnd=2 line and write that value on top of the line (corresponding to nbnd=2) with an arrow.
A sample is shown in the plot, where the 0.032 with an arrow was added manually; I need this kind of annotation done automatically in matplotlib.
A similar accepted answer is given here: https://stackoverflow.com/questions/43374920/how-to-automatically-annotate-maximum-value-in-pyplot [https://i.stack.imgur.com/TPwp7.png]
But I could not get it done for my work.
I used
ymax = max(y)
xpos = y.index(ymax)
xmax = x[xpos]
ax4.annotate('local max', xy=(xmax, ymax), xytext=(xmax, ymax + 5),
             arrowprops=dict(facecolor='black', shrink=0.05))
It gives me below error:
ax4.annotate('local max', xy=(xmax, ymax), xytext=(xmax, ymax+5),
NameError: name 'xmax' is not defined
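For reference, a minimal sketch of the usual NumPy approach, with stand-in data in place of the hdf5 values (the question's own x and y would be used instead): np.argmax locates the peak, since list.index() does not exist on NumPy arrays.
import numpy as np
import matplotlib.pyplot as plt

# stand-in data; in the question x and y come from the hdf5 file for nbnd=2
x = np.linspace(0.0, 10.0, 200)
y = 0.032 * np.exp(-(x - 4.0)**2)

fig, ax = plt.subplots()
ax.plot(x, y, c='C2', lw=2.0, alpha=0.8)

# index of the largest y value; np.argmax works on arrays, list.index() does not
imax = np.argmax(y)
xmax, ymax = x[imax], y[imax]

# annotate the peak with its value; the text offset is arbitrary for illustration
ax.annotate(f'{ymax:.3f}', xy=(xmax, ymax), xytext=(xmax + 1.5, ymax),
            arrowprops=dict(facecolor='black', shrink=0.05))
plt.show()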

Represent a first order differential equation in numpy

I have an equation dy/dx = x + y/5 and an initial value, y(0) = -3.
I would like to know how to plot the exact graph of this function using pyplot.
I also have x = np.linspace(0, interval, steps+1), which I would like to use as the x-axis, so I'm only looking for the y-axis values.
Thanks in advance.
Just for completeness, this kind of equation can easily be integrated numerically, using scipy.integrate.odeint.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
# function dy/dx = x + y/5.
func = lambda y,x : x + y/5.
# Initial condition
y0 = -3 # at x=0
# values at which to compute the solution (needs to start at x=0)
x = np.linspace(0, 4, 101)
# solution
y = odeint(func, y0, x)
# plot the solution, note that y is a column vector
plt.plot(x, y[:,0])
plt.xlabel('x')
plt.ylabel('y')
plt.show()
Given that you need to solve the d.e. you might prefer doing this algebraically, with sympy. (Or you might not.)
Import the module and define the function and the dependent variable.
>>> from sympy import *
>>> f = Function('f')
>>> var('x')
x
Invoke the solver. Note that all terms of the d.e. must be transposed to the left of the equals sign, and that the y must be replaced by the designator for the function.
>>> dsolve(Derivative(f(x),x)-x-f(x)/5)
Eq(f(x), (C1 + 5*(-x - 5)*exp(-x/5))*exp(x/5))
As you would expect, the solution is given in terms of an arbitrary constant. We must solve for that using the initial value. We define it as a sympy variable.
>>> var('C1')
C1
Now we create an expression to represent this arbitrary constant as the left side of an equation that we can solve. We replace f(0) with its value in the initial condition. Then we substitute the value of x in that condition to get an equation in C1.
>>> expr = -3 - ( (C1 + 5*(-x - 5)*exp(-x/5))*exp(x/5) )
>>> expr.subs(x,0)
-C1 + 22
In other words, C1 = 22. Finally, we can use this value to obtain the particular solution of the differential equation.
>>> ((C1 + 5*(-x - 5)*exp(-x/5))*exp(x/5)).subs(C1,22)
((-5*x - 25)*exp(-x/5) + 22)*exp(x/5)
Because I'm absentminded and ever fearful of making egregious mistakes I check that this function satisfies the initial condition.
>>> (((-5*x - 25)*exp(-x/5) + 22)*exp(x/5)).subs(x,0)
-3
(Usually things are incorrect only when I forget to check them. Such is life.)
And I can plot this in sympy too.
>>> plot(((-5*x - 25)*exp(-x/5) + 22)*exp(x/5),(x,-1,5))
<sympy.plotting.plot.Plot object at 0x0000000008C2F780>
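To connect the two answers, here is a minimal sketch (same problem setup as above) that evaluates the exact solution on the questioner's np.linspace grid via sympy.lambdify and overlays it on the odeint result:
import numpy as np
import sympy as sp
from scipy.integrate import odeint
import matplotlib.pyplot as plt

xs = sp.Symbol('x')
# particular solution found above, with C1 = 22
exact = ((-5*xs - 25)*sp.exp(-xs/5) + 22)*sp.exp(xs/5)
exact_f = sp.lambdify(xs, exact, modules='numpy')

x = np.linspace(0, 4, 101)
y_num = odeint(lambda y, x: x + y/5., -3, x)[:, 0]  # numerical solution
y_exact = exact_f(x)                                # exact solution

plt.plot(x, y_num, label='odeint')
plt.plot(x, y_exact, '--', label='exact')
plt.legend()
plt.show()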

Python MemoryError in scipy.stats; scipy.linalg.lstsq vs. manual beta

Not sure if this question belongs here or on crossvalidated but since the primary issue is programming language related, I am posting it here.
Inputs:
Y = big 2D numpy array, shape (300000, 30)
X = 1D array, shape (30,)
Desired Output:
B = 1D array, shape (300000,), each element of which is the regression coefficient (slope) from regressing the corresponding row of Y (of length 30) against X
So B[0] = scipy.stats.linregress(X, Y[0])[0]
I tried this first:
B = scipy.stats.linregress(X,Y)[0]
hoping that it would broadcast X according to the shape of Y. Next I broadcast X myself to match the shape of Y. But on both occasions I got this error:
File "C:\...\scipy\stats\stats.py", line 3011, in linregress
ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat
File "C:\...\numpy\lib\function_base.py", line 1766, in cov
return (dot(X, X.T.conj()) / fact).squeeze()
MemoryError
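A rough sketch of why this blows up, assuming np.cov stacks the two same-shaped inputs row-wise and treats each of the resulting rows as a variable (the default rowvar behaviour), so it tries to allocate a 600000 x 600000 covariance matrix:
n_vars = 300000 + 300000        # X broadcast to Y's shape, then stacked with Y
bytes_needed = n_vars**2 * 8    # float64 covariance matrix
print(bytes_needed / 1e12)      # ~2.9 TB, hence the MemoryError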
I used a manual approach to calculate beta, and on Sascha's suggestion below I also used scipy.linalg.lstsq, as follows:
B = lstsq(Y.T, X)[0]                  # first estimate of beta
Y1 = Y - Y.mean(1)[:, None]
X1 = X - X.mean()
B1 = np.dot(Y1, X1) / np.dot(X1, X1)  # second estimate of beta
The two estimates of beta are very different however:
>>> B1
Out[10]: array([0.135623, 0.028919, -0.106278, ..., -0.467340, -0.549543, -0.498500])
>>> B
Out[11]: array([0.000014, -0.000073, -0.000058, ..., 0.000002, -0.000000, 0.000001])
Scipy's linregress will output slope+intercept which defines the regression-line.
If you want to access the coefficients naturally, scipy's lstsq might be more appropriate, which is an equivalent formulation.
Of course you need to feed it data with the correct dimensions (your data is not in the right shape yet; it needs preprocessing: swap the dimensions).
Code
import numpy as np
from scipy.linalg import lstsq
Y = np.random.random((300000,30))
X = np.random.random(30)
x, res, rank, s = lstsq(Y.T, X) # Y transposed!
print(x)
print(x.shape)
Output
[ 1.73122781e-05 2.70274135e-05 9.80840639e-06 ..., -1.84597771e-05
5.25035470e-07 2.41275026e-05]
(300000,)
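For comparison, a minimal sketch of the fully vectorized per-row simple regression (what scipy.stats.linregress would compute row by row); this is essentially the manual B1 estimate from the question, extended with the intercept, and it avoids building any huge intermediate matrix:
import numpy as np
from scipy.stats import linregress

Y = np.random.random((300000, 30))
X = np.random.random(30)

X1 = X - X.mean()
Y1 = Y - Y.mean(axis=1)[:, None]

slope = np.dot(Y1, X1) / np.dot(X1, X1)       # shape (300000,)
intercept = Y.mean(axis=1) - slope * X.mean()

# spot-check the first row against scipy.stats.linregress
print(np.allclose(slope[0], linregress(X, Y[0])[0]))  # True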

Error with sympy.lambdify for piecewise functions and numpy module

In sympy 0.7.6, I had no trouble with the following code for both the modules='sympy' and the modules='numpy' options. Now with sympy 1.0, the evaluation with modules='numpy' raises a ZeroDivisionError:
import sympy
x, y = sympy.symbols(['x', 'y'])
expr = sympy.Piecewise((1/x, y < -1), (x, y <= 1), (1/x, True))
f_sympy = sympy.lambdify([x, y], expr, modules='sympy')
f_numpy = sympy.lambdify([x, y], expr, modules='numpy')
print f_sympy(0, 1) # performs well
print f_numpy(0, 1) # issue: ZeroDivisionError
It seems that, with modules='numpy', all the pieces of the piecewise function are evaluated before the conditions are checked.
My questions are:
Is this behavior normal?
If so, why, and how can I define a piecewise expression and evaluate it as fast as with the numpy module, without the sympy.lambdify procedure?
EDIT:
Found that in my case the solution is theano:
import sympy
x, y = sympy.symbols(['x', 'y'])
f = sympy.Piecewise((1/x, y < -1), (x, y <= 1), (1/x, True))
from sympy.printing.theanocode import theano_function
f_theano = theano_function([x, y], [f])
print f_theano(0, 1) # OK, returns 0
I deleted my other answer (in case you already saw it). There is a much simpler solution.
The ZeroDivisionError comes because the lambdified expression produces, roughly, lambda x, y: select([less(y, -1),less_equal(y, 1),True], [1/x,x,1/x], default=nan). The problem is that passing in x = 0 results in 1/0 being evaluated by Python, which raises the error.
But NumPy is just fine with dividing by zero. It will issue a warning, but otherwise works fine (it gives inf), and in this example there is no problem, because the inf is not actually used.
So the solution is to wrap the inputs to the lambdified function in NumPy arrays; that is, instead of
f_numpy(0, 1)
use
f_numpy(array(0), array(1))
There is a SymPy issue discussing this if you are interested.
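Putting it together, a minimal sketch of the array-wrapping fix (Python 3 syntax; the np.errstate context that silences the divide-by-zero warning is an optional extra, not part of the original answer):
import numpy as np
import sympy

x, y = sympy.symbols(['x', 'y'])
expr = sympy.Piecewise((1/x, y < -1), (x, y <= 1), (1/x, True))
f_numpy = sympy.lambdify([x, y], expr, modules='numpy')

# plain Python ints raise ZeroDivisionError; 0-d arrays do not
with np.errstate(divide='ignore'):
    print(f_numpy(np.array(0), np.array(1)))  # prints 0.0 instead of raising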

inverse of FFT not the same as original function

I don't understand why ifft(fft(myFunction)) is not the same as my function. It seems to have the same shape but is a factor of 2 out (ignoring the constant y-offset). All the documentation I can see says there is some normalisation that fft doesn't do, but that ifft should take care of it. Here's some example code below; you can see where I've bodged in the factor of 2 to give me the right answer. Thanks for any help; it's driving me nuts.
import numpy as np
import scipy.fftpack as fftp
import matplotlib.pyplot as plt
def fourier_series(x, y, wn, n=None):
    # get FFT
    myfft = fftp.fft(y, n)
    # kill higher freqs above wavenumber wn
    myfft[wn:] = 0
    # make new series
    y2 = fftp.ifft(myfft).real
    # find constant y offset
    myfft[1:] = 0
    c = fftp.ifft(myfft)[0]
    # remove c, apply factor of 2 and re-apply c
    y2 = (y2 - c)*2 + c

    plt.figure(num=None)
    plt.plot(x, y, x, y2)
    plt.show()

if __name__ == '__main__':
    x = np.array([float(i) for i in range(0, 360)])
    y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
    fourier_series(x, y, 3, 360)
You're removing half the spectrum when you do myfft[wn:] = 0. The negative frequencies are those in the top half of the array and are required.
You have a second fudge to get your results, which is taking the real part to find y2: y2 = fftp.ifft(myfft).real (fftp.ifft(myfft) has a non-negligible imaginary part because of the asymmetry in the spectrum).
Fix it with myfft[wn:-wn] = 0 instead of myfft[wn:] = 0, and remove the fudges. So the fixed code looks something like:
import numpy as np
import scipy.fftpack as fftp
import matplotlib.pyplot as plt
def fourier_series(x, y, wn, n=None):
    # get FFT
    myfft = fftp.fft(y, n)
    # kill higher freqs above wavenumber wn
    myfft[wn:-wn] = 0
    # make new series
    y2 = fftp.ifft(myfft)

    plt.figure(num=None)
    plt.plot(x, y, x, y2)
    plt.show()

if __name__ == '__main__':
    x = np.array([float(i) for i in range(0, 360)])
    y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
    fourier_series(x, y, 3, 360)
It's really worth paying attention to the interim arrays you create when doing signal processing. Invariably, there are clues as to what is going wrong that should point you to the problem. In this case, taking the real part masked the problem and made your task more difficult.
Just to add another quick point: sometimes taking the real part of the resulting array is exactly the correct thing to do. It's often the case that the signal output has an imaginary part that is just down to numerical errors in the input to the inverse FFT. Typically this manifests itself as very small imaginary values, so taking the real part gives essentially the same array.
You are killing the negative frequencies between 0 and -wn.
I think what you mean to do is to set myfft to 0 for all frequencies outside [-wn, wn].
Change the following line:
myfft[wn:] = 0
to:
myfft[wn:-wn] = 0
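A small standalone sketch that demonstrates the point with the question's own signal: zeroing only the upper half leaves an asymmetric spectrum and a genuinely complex result, while the symmetric slice keeps the imaginary part down at numerical-noise level:
import numpy as np
import scipy.fftpack as fftp

x = np.array([float(i) for i in range(0, 360)])
y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
wn = 3

bad = fftp.fft(y, 360)
bad[wn:] = 0                # also wipes out the negative frequencies
good = fftp.fft(y, 360)
good[wn:-wn] = 0            # keeps the conjugate-symmetric negative frequencies

print(np.abs(fftp.ifft(bad).imag).max())   # large: result is genuinely complex
print(np.abs(fftp.ifft(good).imag).max())  # ~1e-16: numerical noise only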