How to set 'y > 0' formula in set_xlim of matplotlib? - pandas

I want to set the x range of a plot according to the y values, for example only where y > 0, but I'm not sure how to do this. Could you let me know how to set it?
df = pd.read_csv('file.csv')
x = np.array(df['A'])
y = np.array(df['B'])
z = np.array(df['C'])
x_for_ax1 = np.ma.masked_where((y < 0) | (y > 100), x)
fig, (ax2, ax1) = plt.subplots(ncols=1, nrows=2)
# ax1 and ax2 use the same x range.
ax1.set_ylim([-10, 40])
ax2.set_ylim([-5, 5])
ax1.set_xlim([x_for_ax1.min(), x_for_ax1.max()])
ax2.set_xlim([x_for_ax1.min(), x_for_ax1.max()])

If you want to restrict the x-limits to the region where y lies within a given range, you can mask x with a NumPy masked array and take the minimum and maximum of what remains.
In the example below, the two subplots on the left share x-limits covering the region where either y or z is in range. On the right, each subplot gets only the x-range where its own variable (y or z) is in range.
For demonstration purposes, the example creates a data frame from some dummy data.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
a = np.linspace(-1, 4, 500)
b = np.sin(a) * 100
c = np.cos(a) * 150
df = pd.DataFrame({'A': a, 'B': b, 'C': c})
x = np.array(df['A'])
y = np.array(df['B'])
z = np.array(df['C'])
fig, ((ax1, ax3),(ax2, ax4)) = plt.subplots(ncols=2, nrows=2)
ax1.set_xlabel('x')
ax2.set_xlabel('x')
ax3.set_xlabel('x')
ax4.set_xlabel('x')
ax1.set_ylabel('y')
ax3.set_ylabel('y')
ax2.set_ylabel('z')
ax4.set_ylabel('z')
ymin = 1
ymax = 100
zmin = 1
zmax = 150
x_for_ax1 = np.ma.masked_where(((y < ymin) | (y > ymax)) & ((z < zmin) | (z > zmax)), x)
x_for_ax3 = np.ma.masked_where((y < ymin) | (y > ymax), x)
x_for_ax4 = np.ma.masked_where((z < zmin) | (z > zmax), x)
ax1.plot(x, y)
ax3.plot(x, y)
ax1.set_ylim([ymin, ymax])
ax3.set_ylim([ymin, ymax])
ax2.plot(x, z)
ax4.plot(x, z)
ax2.set_ylim([zmin, zmax])
ax4.set_ylim([zmin, zmax])
ax1.set_xlim([x_for_ax1.min(), x_for_ax1.max()])
ax2.set_xlim([x_for_ax1.min(), x_for_ax1.max()])
ax1.set_title('x limited to y and z range')
ax2.set_title('x limited to y and z range')
ax3.set_xlim([x_for_ax3.min(), x_for_ax3.max()])
ax3.set_title('x limited to y range')
ax4.set_xlim([x_for_ax4.min(), x_for_ax4.max()])
ax4.set_title('x limited to z range')
plt.tight_layout(w_pad=1)
plt.show()
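
The masking step can also be wrapped in a small helper. This is only a sketch and not part of the answer above: `x_range_where` is a hypothetical name, and it assumes `x` and the boolean condition have the same shape. It returns the smallest and largest x for which the condition holds, which can then be handed to `set_xlim`.

```python
import numpy as np

def x_range_where(x, keep):
    """Return (xmin, xmax) over the entries of x where `keep` is True."""
    # masked_where hides the entries where its condition is True,
    # so the condition is inverted to hide everything we do NOT want.
    masked = np.ma.masked_where(~keep, x)
    return masked.min(), masked.max()

# Dummy data: one positive lobe of a sine wave between 0 and pi.
x = np.linspace(-1, 4, 501)
y = np.sin(x) * 100

lo, hi = x_range_where(x, y > 0)
# ax.set_xlim(lo, hi) would then restrict an axis to the region where y > 0.
```

For the condition from the question, `y > 0` is passed directly; a compound condition such as `(y > 0) & (y < 100)` works the same way.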

Related

Centre the peak at x=0

Right now the rectangle signal is centred on x = 4; how can I centre it on x = 0?
def rect(n, T):
    a = np.zeros(int((n-T)/2,))
    b = np.ones((T,))
    c = np.zeros(int((n-T)/2,))
    a1 = np.append(a, b)
    a2 = np.append(a1, c)
    return a2
x = rect(11, 6)
plt.step(x, 'r')
plt.show()
This is what I have written so far. I'd appreciate any ideas.
One way to center the rectangle at x=0 is to pass explicit x values to plt.step. You can build them with numpy.arange, centering the values around 0 using the length of the array returned by the rect function:
# Changed to y because it will be our y values in plt.step
y = rect(11, 6)
# Add 0.5 so it's centered
x = np.arange(-len(y)/2 + 0.5, len(y)/2 + 0.5)
Then plot it with plt.step, setting where to 'mid' (more info in the plt.step docs):
plt.step(x, y, where='mid', color='r')
Hope this helps. Here is the full code:
import numpy as np
import matplotlib.pyplot as plt
def rect(n, T):
    a = np.zeros(int((n-T)/2,))
    b = np.ones((T,))
    c = np.zeros(int((n-T)/2,))
    a1 = np.append(a, b)
    a2 = np.append(a1, c)
    return a2
y = rect(11, 6)
# Add 0.5 so it's centered
x = np.arange(-len(y)/2 + 0.5, len(y)/2 + 0.5)
plt.step(x, y, where='mid', color='r')
plt.show()
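
The centred x values can also be built by shifting an integer index, which gives the same positions as the arange call above. This is a sketch under the assumption that samples are one unit apart; `centered_positions` is a hypothetical helper name, not something from the answer.

```python
import numpy as np

def centered_positions(n_samples):
    """x positions, one unit apart, symmetric about 0 for n_samples points."""
    # The indices 0..n-1 have their midpoint at (n-1)/2; subtracting it centres them.
    return np.arange(n_samples) - (n_samples - 1) / 2

x = centered_positions(10)
# These are the same values as np.arange(-10/2 + 0.5, 10/2 + 0.5).
```

Starting from an integer arange keeps the number of points tied to the signal length, whatever that length is.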

Mask values inside given path (triangle, square etc) for a contourf plot

I am trying to mask specific regions (triangles, squares) in a contourf plot. I can build the mask from the Z values, but I am finding it difficult to get it to work based on x and y values. For the MWE below, I want to mask the region enclosed by given X,Y points (a triangle or square). Let's say, for example, that I want to mask the values inside the triangle formed by the points (0,0), (2,0), (0,2). Basically, I want to be able to provide an enclosed path and mask everything inside it. I have tried the approach here, but I have to provide the logic for individual X and Y values, which becomes cumbersome for a complicated path.
import numpy as np
import matplotlib.pyplot as plt
origin = 'lower'
delta = 0.025
x = y = np.arange(-3.0, 3.01, delta)
X, Y = np.meshgrid(x, y)
Z1 = np.exp(-X**2 - Y**2)
Z2 = np.exp(-(X - 1)**2 - (Y - 1)**2)
Z = (Z1 - Z2) * 2
fig1, ax2 = plt.subplots(constrained_layout=True)
CS = ax2.contourf(X, Y, Z, 10, cmap=plt.cm.viridis, origin=origin,extend='both')
ax2.set_title('Random Plot')
ax2.set_xlabel('X Axis')
ax2.set_ylabel('Y Axis')
cbar = fig1.colorbar(CS)
A convex shape such as a triangle can be defined by the equations of the lines through its vertices. In this case the equations are quite simple: X >= 0 is the zone to the right of the line through (0,0) and (0,2). Similarly, Y >= 0 and X + Y <= 2 are the two other zones. The triangle is the intersection of these 3 zones.
Setting the corresponding Z values to NaN will create the empty triangle in the contour plot.
import numpy as np
import matplotlib.pyplot as plt
delta = 0.025
x = y = np.arange(-3.0, 3.01, delta)
X, Y = np.meshgrid(x, y)
Z1 = np.exp(-X ** 2 - Y ** 2)
Z2 = np.exp(-(X - 1) ** 2 - (Y - 1) ** 2)
Z = (Z1 - Z2) * 2
Z[(X >= 0) & (Y >= 0) & (X + Y <= 2)] = np.nan
fig1, ax2 = plt.subplots()
CS = ax2.contourf(X, Y, Z, 10, cmap=plt.cm.viridis, origin='lower', extend='both')
ax2.set_title('Random Plot missing a triangle')
ax2.set_xlabel('X Axis')
ax2.set_ylabel('Y Axis')
cbar = fig1.colorbar(CS)
plt.show()
PS: The equation of a line through two points x1,y1 and x2,y2 is
(X - x1) * (y2 - y1) - (Y - y1) * (x2 - x1) == 0
So, a more general code could look like:
def line_eq(X, Y, p1, p2):
    x1, y1 = p1
    x2, y2 = p2
    return (X - x1) * (y2 - y1) - (Y - y1) * (x2 - x1) >= 0
p = [(0, 0), (0, 2), (2, 0)]  # clockwise ordering
Z[line_eq(X, Y, p[0], p[1]) & line_eq(X, Y, p[1], p[2]) & line_eq(X, Y, p[2], p[0])] = np.nan
Note that when the vertices are ordered counterclockwise, the equation should be <= 0 to grab the interior convex shape.
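
The dependence on vertex order can be removed by checking the orientation first. The sketch below is an assumption-laden variant, not the answer's code: `signed_area` and `convex_mask` are hypothetical names, and the orientation test uses the shoelace formula (negative area means clockwise). It only applies to convex polygons.

```python
import numpy as np

def signed_area(vertices):
    """Shoelace formula: positive for counterclockwise, negative for clockwise."""
    v = np.asarray(vertices, dtype=float)
    x, y = v[:, 0], v[:, 1]
    return 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)

def convex_mask(X, Y, vertices):
    """Mask of grid points inside a convex polygon, for either vertex order."""
    # Clockwise order keeps the '>= 0' side; counterclockwise flips the sign.
    sign = 1.0 if signed_area(vertices) < 0 else -1.0
    mask = np.ones(np.shape(X), dtype=bool)
    n = len(vertices)
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        side = (X - x1) * (y2 - y1) - (Y - y1) * (x2 - x1)
        mask &= sign * side >= 0
    return mask

grid = np.arange(-3.0, 3.01, 0.5)
X, Y = np.meshgrid(grid, grid)
clockwise = [(0, 0), (0, 2), (2, 0)]
mask_cw = convex_mask(X, Y, clockwise)
mask_ccw = convex_mask(X, Y, clockwise[::-1])  # same triangle, reversed order
```

Both calls select the same grid points, so with a float `Z` array, `Z[mask_cw] = np.nan` blanks the triangle whichever way the vertices were listed.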
Concave shapes can be created by taking the union (logical or) of several convex shapes:
def line_eq(X, Y, p1, p2):
    x1, y1 = p1
    x2, y2 = p2
    return (X - x1) * (y2 - y1) - (Y - y1) * (x2 - x1) >= 0
def convex_eq(X, Y, p):
    mask = line_eq(X, Y, p[-1], p[0])
    for p1, p2 in zip(p[:-1], p[1:]):
        mask &= line_eq(X, Y, p1, p2)
    return mask
def multiple_convex_eq(X, Y, c):
    mask = convex_eq(X, Y, c[0])
    for ci in c:
        mask |= convex_eq(X, Y, ci)
    return mask
p = [(0, 2.5), (1.5, 1), (1, -2), (-1, -2), (-1.5, 1)]  # pentagon, clockwise ordering
five_triangles = [[(0, 0), p1, p2] for p1, p2 in zip(p, (p + p)[2:])]
Z[multiple_convex_eq(X, Y, five_triangles)] = np.nan
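
Since the question asks for a way to pass in an arbitrary enclosed path, a different technique from the half-plane equations is worth sketching: matplotlib's own `matplotlib.path.Path` class has a `contains_points` method that tests points against a polygon, convex or not. The helper name `path_mask` and the L-shaped outline are illustrative assumptions, and points lying exactly on the outline may be classified either way.

```python
import numpy as np
from matplotlib.path import Path

def path_mask(X, Y, vertices):
    """Boolean mask, same shape as X, True for grid points inside the polygon."""
    # Repeat the first vertex so the outline is explicitly closed.
    polygon = Path(list(vertices) + [vertices[0]])
    points = np.column_stack([X.ravel(), Y.ravel()])
    return polygon.contains_points(points).reshape(X.shape)

# An L-shaped (concave) outline; the vertices are illustrative only.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
X, Y = np.meshgrid([0.5, 1.5, 3.0], [0.5, 1.5, 3.0])
inside = path_mask(X, Y, l_shape)
# With a float Z array of the same shape, Z[inside] = np.nan would blank
# that region before calling contourf.
```

This avoids splitting a concave shape into convex pieces by hand, at the cost of testing every grid point against the path.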

Binning data and plotting

I have a dataframe of essentially random numbers, (except for one column), some of which are NaNs. MWE:
import numpy as np
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import pandas as pd
randomNumberGenerator = np.random.RandomState(1000)
z = 5 * randomNumberGenerator.rand(101)
A = 4 * z - 3+ randomNumberGenerator.randn(101)
B = 4 * z - 2+ randomNumberGenerator.randn(101)
C = 4 * z - 1+ randomNumberGenerator.randn(101)
D = 4 * z - 4+ randomNumberGenerator.randn(101)
A[50] = np.nan
A[:3] = np.nan
B[12:20] = np.nan
sources= pd.DataFrame({'z': z})
sources['A'] = A
sources['B'] = B
sources['C'] = C
sources['D'] = D
#sources= sources.dropna()
x = sources.z
y1 = sources.A
y2 = sources.B
y3 = sources.C
y4 = sources.D
for i in [y1, y2, y3, y4]:
    count = np.count_nonzero(~np.logical_or(np.isnan(x), np.isnan(i)))
    label = 'Points plotted: %d' % count
    plt.scatter(x, i, label=label)
plt.legend()
I need to bin the data according to x and plot different columns in each bin, in 3 side-by-side subplots:
x_1 <= 1 plot A-B | 1 < x_2 < 3 plot B+C | 3 < x_3 plot C-D
I've tried to bin the data with
x1 = sources[sources['z']<1] # z < 1
x2 = sources[sources['z']<3]
x2 = x2[x2['z']>=1] # 1<= z < 3
x3 = sources[sources['z']<max(z)]
x3 = x3[x3['z']>=3] # 3 <= z <= max(z)
x1 = x1['z']
x2 = x2['z']
x3 = x3['z']
but there's got to be a better way to go about it. What's the best way to produce something like this?
Binning in pandas is done with pd.cut, so a solution is:
sources= pd.DataFrame({'z': z})
sources['A'] = A
sources['B'] = B
sources['C'] = C
sources['D'] = D
#sources= sources.dropna()
bins = pd.cut(sources['z'], [-np.inf, 1, 3, max(z)], labels=[1,2,3])
m1 = bins == 1
m2 = bins == 2
m3 = bins == 3
x11 = sources.loc[m1, 'A']
x12 = sources.loc[m1, 'B']
x21 = sources.loc[m2, 'B']
x22 = sources.loc[m2, 'C']
x31 = sources.loc[m3, 'C']
x32 = sources.loc[m3, 'D']
y11 = sources.loc[m1, 'A']
y12 = sources.loc[m1, 'B']
y21 = sources.loc[m2, 'B']
y22 = sources.loc[m2, 'C']
y31 = sources.loc[m3, 'C']
y32 = sources.loc[m3, 'D']
tups = [(x11, x12, y11, y12), (x21, x22,y21, y22),(x31, x32, y31, y32)]
fig, ax = plt.subplots(1,3)
ax = ax.flatten()
for k, (i1, i2, j1, j2) in enumerate(tups):
    count1 = np.count_nonzero(~np.logical_or(np.isnan(i1), np.isnan(j1)))
    count2 = np.count_nonzero(~np.logical_or(np.isnan(i2), np.isnan(j2)))
    label1 = 'Points plotted: %d' % count1
    label2 = 'Points plotted: %d' % count2
    ax[k].scatter(i1, j1, label=label1)
    ax[k].scatter(i2, j2, label=label2)
    ax[k].legend()
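
If the goal is to plot each pair of columns against z inside its bin, as the question describes, the selection can be driven by a small mapping from bin label to columns. This is a sketch under assumptions: the mapping `columns_per_bin`, the `bin` column, and the open-ended upper edge `np.inf` (instead of `max(z)`) are choices made here, not part of the answer above.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(1000)
z = 5 * rng.rand(101)
sources = pd.DataFrame({
    'z': z,
    'A': 4 * z - 3 + rng.randn(101),
    'B': 4 * z - 2 + rng.randn(101),
    'C': 4 * z - 1 + rng.randn(101),
    'D': 4 * z - 4 + rng.randn(101),
})

# pd.cut uses right-closed intervals by default: (-inf, 1], (1, 3], (3, inf].
sources['bin'] = pd.cut(sources['z'], [-np.inf, 1, 3, np.inf], labels=[1, 2, 3])

# Which two columns to show in each bin (taken from the question's layout).
columns_per_bin = {1: ('A', 'B'), 2: ('B', 'C'), 3: ('C', 'D')}

selected = {}
for label, (first, second) in columns_per_bin.items():
    part = sources[sources['bin'] == label]
    selected[label] = part[['z', first, second]]
    # On a matplotlib axis this would become, for example:
    # ax.scatter(part['z'], part[first]); ax.scatter(part['z'], part[second])
```

Each entry of `selected` then holds the z values and the two columns for one subplot.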

plot a line in 3D plot in julia

I'm trying to plot a line segment between the points [1,1] and [0,0] on the surface z = x^2 + y^2.
I've already plotted f with:
using PyPlot
using Distributions
function f(x)
    return (x[1]^2 + x[2]^2)
    #return sin(x[1]) + cos(x[2])
end
n = 100
x = linspace(-1, 1, n)
y = linspace(-1,1,n)
xgrid = repmat(x',n,1)
ygrid = repmat(y,1,n)
z = zeros(n,n)
for i in 1:n
    for j in 1:n
        z[i:i,j:j] = f([x[i],y[j]])
    end
end
plot_wireframe(xgrid,ygrid,z)
I already know R (ggplot2) and C, but I'm new to Python and Julia libraries like matplotlib.
Well, I just had to add one line:
using PyPlot
using Distributions
function f(x)
    return (x[1]^2 + x[2]^2)
    #return sin(x[1]) + cos(x[2])
end
n = 100
x = linspace(-1, 1, n)
y = linspace(-1,1,n)
xgrid = repmat(x',n,1)
ygrid = repmat(y,1,n)
z = zeros(n,n)
for i in 1:n
    for j in 1:n
        z[i:i,j:j] = f([x[i],y[j]])
    end
end
plot_wireframe(xgrid,ygrid,z)
## new line
plot([0.0, 1.0, -1.0], [0.0, 1.0, 1.0], [0.0 , 2.0, 2.0], color="red")
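
A straight 3D line between two points on this surface only touches it at the endpoints. If the segment should follow the surface instead, its (x, y) points can be sampled and each one lifted with f. The sketch below shows that computation in Python with NumPy rather than Julia; `segment_on_surface` and the sample count are assumptions made for illustration.

```python
import numpy as np

def f(x, y):
    """The surface from the question: z = x^2 + y^2."""
    return x**2 + y**2

def segment_on_surface(p0, p1, num=50):
    """Sample the segment p0 -> p1 in the xy-plane and lift it onto z = f(x, y)."""
    t = np.linspace(0.0, 1.0, num)
    xs = p0[0] + t * (p1[0] - p0[0])
    ys = p0[1] + t * (p1[1] - p0[1])
    return xs, ys, f(xs, ys)

xs, ys, zs = segment_on_surface((0.0, 0.0), (1.0, 1.0))
# xs, ys, zs can be passed to a 3D line plot call in the same way as
# the three coordinate lists in the plot call above.
```

Along this particular segment x equals y, so the lifted curve is z = 2x^2, rising from 0 at the origin to 2 at (1, 1).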

Printing the equation of the best fit line

I have created the best fit lines for the dataset using the following code:
fig, ax = plt.subplots()
for dd, KK in DATASET.groupby('Z'):
    fit = polyfit(KK['x'], KK['y'], 3)  # fit each group's own data
    fit_fn = poly1d(fit)
    ax.plot(KK['x'], KK['y'], 'o', KK['x'], fit_fn(KK['x']), 'k', linewidth=4)
ax.set_xlabel('x')
ax.set_ylabel('y')
The graph displays the best fit line for each group of Z. I want to print the equation of the best fit line on top of the line. Please suggest what I can do here.
You need to write a function that converts an array of polynomial coefficients to a LaTeX string. Here is an example:
import pylab as pl
import numpy as np
x = np.random.randn(100)
y = 1 + 2 * x + 3 * x * x + np.random.randn(100) * 2
poly = pl.polyfit(x, y, 2)
def poly2latex(poly, variable="x", width=2):
    t = ["{0:0.{width}f}"]
    t.append(t[-1] + " {variable}")
    t.append(t[-1] + "^{1}")
    def f():
        for i, v in enumerate(reversed(poly)):
            idx = i if i < 2 else 2
            yield t[idx].format(v, i, variable=variable, width=width)
    return "${}$".format("+".join(f()))
pl.plot(x, y, "o", alpha=0.4)
x2 = np.linspace(-2, 2, 100)
y2 = np.polyval(poly, x2)
pl.plot(x2, y2, lw=2, color="r")
pl.text(x2[5], y2[5], poly2latex(poly), fontsize=16)
Here's a one-liner.
If fit is the poly1d object, just pass a label argument like the one below when plotting the fitted line:
label='y=${}$'.format(''.join(['{}x^{}'.format(('{:.2f}'.format(j) if j<0 else '+{:.2f}'.format(j)),(len(fit.coef)-i-1)) for i,j in enumerate(fit.coef)]))
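
Both approaches build the string term by term. A slightly longer variant that treats the sign of each coefficient separately, so negative terms do not come out as "+-", might look like the sketch below. `poly_to_latex` is a hypothetical name, and it assumes the coefficients are ordered from the highest degree down, as np.polyfit returns them.

```python
import numpy as np

def poly_to_latex(coeffs, variable="x", precision=2):
    """Format polynomial coefficients (highest degree first) as a LaTeX string."""
    degree = len(coeffs) - 1
    terms = []
    for i, c in enumerate(coeffs):
        power = degree - i
        sign = "-" if c < 0 else "+"
        magnitude = "{:.{p}f}".format(abs(c), p=precision)
        if power == 0:
            body = magnitude
        elif power == 1:
            body = "{} {}".format(magnitude, variable)
        else:
            body = "{} {}^{{{}}}".format(magnitude, variable, power)
        terms.append((sign, body))
    # The leading term only shows its sign when it is negative.
    first_sign, first_body = terms[0]
    text = ("-" if first_sign == "-" else "") + first_body
    for sign, body in terms[1:]:
        text += " {} {}".format(sign, body)
    return "$y = {}$".format(text)

equation = poly_to_latex([3, -2, 1])
# equation is "$y = 3.00 x^{2} - 2.00 x + 1.00$"
```

The result can be used as the `label` of the fitted line or passed to a text call, in the same way as the strings built above.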