group boxplot histogramming - matplotlib

I would like to group my data and draw a boxplot for each group. There are many questions and answers about that; my problem is that I want to group by a continuous variable, so I need to bin (histogram) my data first.
Here is what I have done. My data:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
x = np.random.chisquare(5, size=100000)
y = np.random.normal(size=100000) / (0.05 * x + 0.1) + 2 * x
f, ax = plt.subplots()
ax.plot(x, y, '.', alpha=0.05)
plt.show()
I want to study the behaviour of y (location, width, ...) as a function of x. I am not interested in the distribution of x, so I will normalize it out.
f, ax = plt.subplots()
xbins = np.linspace(0, 25, 50)
ybins = np.linspace(-20, 50, 50)
H, xedges, yedges = np.histogram2d(y, x, bins=(ybins, xbins))
norm = np.sum(H, axis = 0)
H /= norm
ax.pcolor(xbins, ybins, np.nan_to_num(H), vmax=.4)
plt.show()
I can plot the 2D histogram, but I want boxplots:
binning = np.concatenate(([0], np.sort(np.random.random(20) * 25), [25]))
idx = np.digitize(x, binning)
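# np.digitize returns 1..len(binning)-1 for values inside the binning; 0 and len(binning) mean out of range
data_to_plot = [y[idx == i] for i in range(1, len(binning))]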
f, ax = plt.subplots()
midpoints = 0.5 * (binning[1:] + binning[:-1])
widths = 0.9 * (binning[1:] - binning[:-1])
from matplotlib.ticker import MultipleLocator, FormatStrFormatter
majorLocator = MultipleLocator(2)
ax.boxplot(data_to_plot, positions = midpoints, widths=widths)
ax.set_xlim(0, 25)
ax.xaxis.set_major_locator(majorLocator)
ax.set_xlabel('x')
ax.set_ylabel('median(y)')
plt.show()
Is there an automatic way to do this, something like ax.magic(x, y, binning)? Is there a better way? (See https://root.cern.ch/root/html/TProfile.html for an example, which plots the mean and the error of the mean as error bars.)
In addition, I want to minimize the memory footprint (my real data set is much larger than 100000 points). I am worried about data_to_plot: is it a copy?
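On the copy question: y[idx == i] uses boolean-mask indexing, which always returns a copy in NumPy. A possible sketch of a TProfile-like summary follows; it is not a built-in matplotlib call. It accumulates the per-bin count, sum and sum of squares with np.digitize and np.bincount in chunks, so temporaries stay bounded by the chunk size. The function name binned_profile, its chunk parameter and its return layout are assumptions made for this illustration.

```python
import numpy as np

def binned_profile(x, y, edges, chunk=1_000_000):
    """Per-bin mean of y and standard error of the mean, binned by x.

    Hypothetical helper for illustration; not part of NumPy or matplotlib.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    edges = np.asarray(edges, dtype=float)
    nbins = len(edges) - 1
    count = np.zeros(nbins)
    sum_y = np.zeros(nbins)
    sum_y2 = np.zeros(nbins)
    for start in range(0, len(x), chunk):
        xs = x[start:start + chunk]  # basic slices are views, not copies
        ys = y[start:start + chunk]
        idx = np.digitize(xs, edges)
        # 1..nbins are valid bins; 0 and nbins + 1 are outside the binning
        inside = (idx >= 1) & (idx <= nbins)
        idx = idx[inside] - 1
        ys = ys[inside]
        count += np.bincount(idx, minlength=nbins)
        sum_y += np.bincount(idx, weights=ys, minlength=nbins)
        sum_y2 += np.bincount(idx, weights=ys * ys, minlength=nbins)
    # Empty bins give 0/0, which is reported as nan
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = sum_y / count
        var = sum_y2 / count - mean ** 2  # population variance per bin
        sem = np.sqrt(np.clip(var, 0, None) / count)
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, mean, sem
```

The result can be drawn with ax.errorbar(centers, mean, yerr=sem, fmt='o'). The sum-of-squares formula is simple but can lose precision when the spread is tiny compared to the mean.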

Related

Surface Plot of a function B(x,y,z)

I have to plot a surface plot which has axes x,y,z and a colormap set by a function of x,y,z [B(x,y,z)].
I have the coordinate arrays:
x=np.arange(-100,100,1)
y=np.arange(-100,100,1)
z=np.arange(-100,100,1)
Moreover, B(x,y,z), which should set the colormap, is a 3D array whose elements B(x,y,z)[i] are the values over the (x,y) coordinates at the i-th z.
I have tried something like:
Z,X,Y=np.meshgrid(z,x,y) # Z is the first one since B(x,y,z)[i] are the (x,y) coordinates at z.
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
img = ax.scatter(Z, X, Y, c=B(x,y,z), cmap='hot')
fig.colorbar(img)
plt.show()
However, it unsurprisingly plots dots, which is not what I want. Rather, I need a surface plot.
The figure I have obtained:
The kind of figure I want:
where the colors are determined by B(x,y,z) for my case.
You have to:
use plot_surface to create a surface plot.
use your function B(x, y, z) to compute the color parameter, a number assigned to each face of the surface.
normalize the color parameter to the range [0, 1]. We use matplotlib's Normalize to achieve that.
create the colors by applying the colormap to the normalized color parameter.
finally, create the plot.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.colors import Normalize
t = np.linspace(0, 2*np.pi)
p = np.linspace(0, 2*np.pi)
t, p = np.meshgrid(t, p)
r1, r2 = 1, 3
x = (r2 + r1 * np.cos(t)) * np.cos(p)
y = (r2 + r1 * np.cos(t)) * np.sin(p)
z = r1 * np.sin(t)
color_param = np.sin(x / 2) * np.cos(y) + z
cmap = cm.jet
norm = Normalize(vmin=color_param.min(), vmax=color_param.max())
norm_color_param = norm(color_param)
colors = cmap(norm_color_param)
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(x, y, z, facecolors=colors)
ax.set_zlim(-4, 4)
plt.show()
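The question also called fig.colorbar. With facecolors, the surface carries no value-to-color mapping of its own, so a colorbar needs a separate mappable. A minimal sketch, assuming the same torus and color parameter as in the code above, and reusing the norm and colormap through matplotlib's ScalarMappable:

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.colors import Normalize

# Same torus as above
t, p = np.meshgrid(np.linspace(0, 2 * np.pi), np.linspace(0, 2 * np.pi))
r1, r2 = 1, 3
x = (r2 + r1 * np.cos(t)) * np.cos(p)
y = (r2 + r1 * np.cos(t)) * np.sin(p)
z = r1 * np.sin(t)
color_param = np.sin(x / 2) * np.cos(y) + z

cmap = plt.get_cmap("jet")
norm = Normalize(vmin=color_param.min(), vmax=color_param.max())
colors = cmap(norm(color_param))  # RGBA array, one color per grid point

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(x, y, z, facecolors=colors)
ax.set_zlim(-4, 4)

# The colorbar is built from the same norm and cmap, not from the surface
mappable = cm.ScalarMappable(norm=norm, cmap=cmap)
mappable.set_array(color_param)
cbar = fig.colorbar(mappable, ax=ax)
plt.show()
```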

Interpolating values on a cylindrical surface

I have an issue interpolating my values "c" on a cylindrical surface.
The problem is probably that I don't understand how to specify the surface to grid onto with the griddata function.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata as gd
#Creating data in polar coordinates
phi,d = np.linspace(0, 2* np.pi, 20), np.linspace(0,20,20)
PHI,D = np.meshgrid(phi,d)
R = 2
#Transforming in X Y Z coordinates
X = R * np.cos(PHI)
Y = R * np.sin(PHI)
Z = R * D
T=np.linspace(0,10,400)
c=np.sin(T)*np.cos(T/2) #Value c I would like to interpolate
fig1 = plt.figure()
ax = fig1.add_subplot(1,1,1, projection='3d')
xi=np.array(np.meshgrid(X,Y,Z))
img = ax.scatter(X, Y, Z, c=c, cmap='hot') #To plot data scatter before interpolation
fig1.colorbar(img)
plt.show()
X1,Y1,Z1 =np.meshgrid(X ,Y ,Z) #To define surface for interpolation
int = gd((X,Y,Z), c, (X1,Y1,Z1), method='linear')
fig2 = plt.figure() #trying to plot the answer
ax1 = fig2.add_subplot(1,1,1, projection='3d')
ax1.scatter(int)
img = ax1.scatter(X, Y, Z, c=c, cmap='hot')
It gives the error: different number of values and points.
I don't know how to specify the (X1,Y1,Z1) surface in the griddata function.
Thanks a lot for any tips ...
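One possible approach, offered as a sketch rather than a definitive fix: the error most likely arises because griddata wants one value per point, with the points given as an array of shape (n_points, n_dims), while the code passes 20x20 grids together with a flat c. Since every point lies on the cylinder, the interpolation can be done on the unrolled surface in (phi, z) and mapped back to X, Y, Z afterwards. The 60x60 target grid and the names phi_i, z_i, c_i are assumptions chosen for illustration.

```python
import numpy as np
from scipy.interpolate import griddata

R = 2.0

# Sample points on the cylinder, described by angle phi and height z
phi, d = np.linspace(0, 2 * np.pi, 20), np.linspace(0, 20, 20)
PHI, D = np.meshgrid(phi, d)
Z = R * D

# One value per sample point (400 values for the 20x20 grid)
T = np.linspace(0, 10, PHI.size)
c = np.sin(T) * np.cos(T / 2)

# griddata expects points of shape (n_points, n_dims): flatten the grids
points = np.column_stack([PHI.ravel(), Z.ravel()])

# Finer target grid on the same unrolled (phi, z) surface
phi_i, z_i = np.meshgrid(np.linspace(0, 2 * np.pi, 60),
                         np.linspace(0, R * 20, 60))
c_i = griddata(points, c, (phi_i, z_i), method="linear")

# Map the target grid back to Cartesian coordinates for 3D plotting
X_i = R * np.cos(phi_i)
Y_i = R * np.sin(phi_i)
```

X_i, Y_i and z_i then describe the surface, and c_i holds the interpolated value at each grid node (nan wherever a node falls outside the convex hull of the samples).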

Matplotlib 3d barplot failing to draw just one face

import numpy as np
import matplotlib.pyplot as plt
x, y = np.array([[x, y] for x in range(5) for y in range(x+1)]).T
z = 1/ (5*x + 5)
fig = plt.figure()
ax = fig.add_subplot(projection = '3d')
ax.bar3d(x, y, np.zeros_like(z), dx = 1, dy = 1, dz = z)
yields
How do I get the face at (1,0) to display properly?
There is currently no good solution to this. Fortunately though, it happens only for some viewing angles. So you can choose an angle where it plots fine, e.g.
ax.view_init(azim=-60, elev=25)

pyplot: loglog contour with labels fix angle

This is an extension of a related question.
I intend to make a contour plot, with labeled contours, then change the axes scales to 'log'.
This works fine except that the rotation of the contour labels is not adjusted. Can this be fixed?
loglog = False
import matplotlib.pyplot as plt
import numpy as np
x = (np.linspace(0, 10))
y = (np.linspace(0, 10))
X, Y = np.meshgrid(x, y)
C = plt.contour(X, Y, np.sqrt(X) * Y)
plt.clabel(C, inline=1, fontsize=10)
plt.xlim(1, 10)
plt.ylim(1, 10)
if loglog: plt.xscale('log')
if loglog: plt.yscale('log')
plt.show()
The first plot is obtained with loglog=False, the second with loglog=True:
So the answer is actually obvious. Changing the axes scale types in advance helps, of course.
Edit:
I think it makes sense to use logspace instead of linspace here.
import matplotlib.pyplot as plt
import numpy as np
x = np.logspace(0, 1, 100, base=10)
y = np.logspace(0, 1, 100, base=10)
X, Y = np.meshgrid(x, y)
plt.xlim(1, 10)
plt.ylim(1, 10)
plt.xscale('log')
plt.yscale('log')
C = plt.contour(X, Y, np.sqrt(X) * Y)
plt.clabel(C, inline=1, fontsize=10)
plt.show()

matplotlib polar 2d histogram

I am trying to plot some histogrammed data on a polar axis, but it won't work properly. An example is below. I use the custom projection from "How to make the angles in a matplotlib polar plot go clockwise with 0° at the top?"; it works for a scatter plot, so I think my problem is with the histogram function. This has been driving me nuts all day. Does anyone know what I am doing wrong?
import random
import numpy as np
import matplotlib.pyplot as plt
baz = np.zeros((20))
freq = np.zeros((20))
pwr = np.zeros((20))
for x in range(20):
    baz[x] = random.randint(20,25)*10
    freq[x] = random.randint(1,10)*10
    pwr[x] = random.randint(-10,-1)*10
baz = baz*np.pi/180.
abins = np.linspace(0,2*np.pi,360) # 0 to 360 in steps of 360/N.
sbins = np.linspace(1, 100)
H, xedges, yedges = np.histogram2d(baz, freq, bins=(abins,sbins), weights=pwr)
plt.figure(figsize=(14,14))
plt.subplot(1, 1, 1, projection='northpolar')
#plt.scatter(baz, freq)
plt.pcolormesh(H)
plt.show()
Your code works if you explicitly pass an mgrid (with the same characteristics as your abins and sbins) to the pcolormesh command.
Below is an example inspired by your code:
import matplotlib.pyplot as plt
import numpy as np
#Generate the data
size = 200
baz = 10*np.random.randint(20, 25, size)*np.pi/180.
freq = 10*np.random.randint(1, 10, size)
pwr = 10*np.random.randint(-10, -1, size)
abins = np.linspace(0, 2*np.pi, 360) # 0 to 360 in steps of 360/N.
sbins = np.linspace(1, 100, 50)
H, xedges, yedges = np.histogram2d(baz, freq, bins=(abins,sbins), weights=pwr)
#Grid to plot your data on using pcolormesh
theta, r = np.mgrid[0:2*np.pi:360j, 1:100:50j]
fig, ax = plt.subplots(figsize=(14,14), subplot_kw=dict(projection='northpolar'))
ax.pcolormesh(theta, r, H)
ax.set_yticklabels([]) #remove yticklabels
plt.show()
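If the custom 'northpolar' projection is not available, a similar layout can probably be obtained with the built-in polar projection. The sketch below works under that assumption: it uses set_theta_zero_location and set_theta_direction to put 0° at the top and run clockwise, and it builds the pcolormesh grid from the bin edges returned by np.histogram2d, so the grid matches H by construction. The edge counts (361 and 51) and the seeded generator are illustrative choices, not taken from the code above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Generate example data (seeded so the run is reproducible)
rng = np.random.default_rng(0)
size = 200
baz = np.deg2rad(10 * rng.integers(20, 25, size))  # 200..240 degrees
freq = 10 * rng.integers(1, 10, size)
pwr = 10 * rng.integers(-10, -1, size)

abins = np.linspace(0, 2 * np.pi, 361)  # 361 edges give 360 one-degree bins
sbins = np.linspace(1, 100, 51)         # 51 edges give 50 radial bins
H, aedges, redges = np.histogram2d(baz, freq, bins=(abins, sbins), weights=pwr)

# Edge grid: one row and one column more than H, as flat shading expects
theta, r = np.meshgrid(aedges, redges, indexing="ij")

fig, ax = plt.subplots(figsize=(7, 7), subplot_kw=dict(projection="polar"))
ax.set_theta_zero_location("N")  # 0 degrees at the top
ax.set_theta_direction(-1)       # angles increase clockwise
mesh = ax.pcolormesh(theta, r, H)
fig.colorbar(mesh, ax=ax)
plt.show()
```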