Subplots with counter like legends - pandas

I have written plot_dataframe() to create two subplots (a line chart and a histogram-style bar chart) for a dataframe that is passed as an argument.
Then I call this function from plot_kernels() with multiple dataframes.
def plot_dataframe(df, cnt):
    row = df.iloc[0].astype(int)  # First row in the dataframe
    plt.subplot(2, 1, 1)
    row.plot(legend=cnt)  # Line chart
    plt.subplot(2, 1, 2)
    df2 = row.value_counts()
    df2.reindex().plot(kind='bar', legend=cnt)  # Histogram

def plot_kernels(my_dict2):
    plt.figure(figsize=(20, 15))
    cnt = 1
    for key in my_dict2:
        df = my_dict2[key]
        plot_dataframe(df, cnt)
        cnt = cnt + 1
    plt.show()
The dictionary looks like
{'K1::foo(bar::z(x,u))': Value Value
0 10 2
1 5 2
2 10 2, 'K3::foo(bar::y(z,u))': Value Value
0 6 12
1 7 13
2 8 14}
Based on the values in the first row, [10, 2] is drawn as the blue line and [6, 12] as the orange line. The histograms follow the same pattern. As you can see in the figure, the legend entries in both subplots are shown as 0. I expect to see 1 and 2. How can I fix that?

Change legend to label, then force the legend after you plot everything:
def plot_dataframe(df, cnt, axes):
    row = df.iloc[0].astype(int)  # First row in the dataframe
    row.plot(label=cnt, ax=axes[0])  # Line chart -- use label, not legend
    df2 = row.value_counts()
    df2.plot(kind='bar', ax=axes[1], label=cnt)  # Histogram

def plot_kernels(d):
    # I'd create the axes first and pass to the plot function
    fig, axes = plt.subplots(2, 1, figsize=(20, 15))
    cnt = 1
    for key in d:
        df = d[key]
        plot_dataframe(df, cnt, axes=axes)
        cnt = cnt + 1
    # render the legend
    for ax in axes:
        ax.legend()
    plt.show()
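For context on why the original legends read 0: in pandas, legend is only an on/off flag, and the text shown comes from the Series name, which for df.iloc[0] is the row label 0. The sketch below is a minimal, self-contained version of the label-then-legend pattern. The frames dictionary and its column names a and b are made up for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for the two kernels in the question
frames = {
    "K1": pd.DataFrame({"a": [10, 5, 10], "b": [2, 2, 2]}),
    "K3": pd.DataFrame({"a": [6, 7, 8], "b": [12, 13, 14]}),
}

# Create both axes once, then draw every dataframe onto them
fig, axes = plt.subplots(2, 1)
for cnt, df in enumerate(frames.values(), start=1):
    row = df.iloc[0].astype(int)  # first row; its Series name is 0
    row.plot(ax=axes[0], label=str(cnt))  # label overrides the Series name
    row.value_counts().plot(kind="bar", ax=axes[1], label=str(cnt))

# Build the legends only after everything has been plotted
for ax in axes:
    ax.legend()

# Collect the legend entries of each subplot for inspection
legend_texts = [[t.get_text() for t in ax.get_legend().get_texts()] for ax in axes]
```

Using enumerate(..., start=1) replaces the manual cnt counter, but the counter from the answer works just as well.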

Related

Matplotlib subplot using nested for loop to plot timeseries (trajectory)

I want to plot 5 time series using a nested for loop. My code is below, along with the resulting plots. The first loop creates the subplots for each time series and the second loop plots each state against time. When I put plt.show() outside the first loop, only the fifth series is plotted, and when I put plt.show() outside the inner (second) loop, only the first series is plotted.
How can I plot all five series using the nested loop?
How can I give the same y variables the same bounds (shared y-axis) in the loop?
fig = plt.figure(figsize=(38, 55))
for i in range(len(Traj_List)):  # first loop for creating subplots for each of timeseries data (Trajectory of some states like joint angle and joint speed and muscle activations)
    # States.
    stateNames = list(Traj_List[i].getStateNames())  # Traj_List is a list of timeseries data
    numStates = len(stateNames)
    dim = np.sqrt(numStates)
    if dim == np.ceil(dim):
        numRows = int(dim)
        numCols = int(dim)
    else:
        numCols = int(np.min([numStates, 4]))
        numRows = int(np.floor(numStates / 4))
        if not numStates % 4 == 0:
            numRows += 1
    # color = iter(plt.rainbow(np.linspace(0, 1, 5)))
    color = ['r', 'b', 'g', 'y', 'm']
    lines = ["-", "--", "-.", ":", "-."]
    linewidth = [3, 2.5, 3.5, 2, 3]
    for j in np.arange(numStates):  # the second loop plots each time series data (states) against time.
        ax = fig.add_subplot(numRows, numCols, int(j + 1))
        ax.plot(Traj_List[i].getTimeMat(),
                Traj_List[i].getStateMat(stateNames[j]), linestyle=lines[i], color=color[i],
                linewidth=linewidth[i], label=Label_List[i])
        stateName = stateNames[j]
        ax.set_title(stateName)
        ax.set_xlabel('time (s)')
        # ax.set_xlim(0, 1)
        ax.legend(loc='best')
        if 'value' in stateName:
            ax.set_ylabel('position (rad)')
        elif 'speed' in stateName:
            ax.set_ylabel('speed (rad/s)')
        elif 'activation' in stateName:
            ax.set_ylabel('activation (-)')
            ax.set_ylim(0, 1)
    # plt.show()
fig.tight_layout()
plt.show()
plt.close()
Try to make the figure only once, and save the axes. In pseudocode, because I can't run your code anyway:
for i in range(n_series):
    M, N = size(data)
    if i == 0:
        # only make the axes once!
        fig, axs = plt.subplots(M, N)
    for j in range(M*N):
        axs.flat[j].plot(yourdata)
I'm sure my ranges are not correct for your data, but that is easy to sort out. The point is: don't keep recreating the axes with fig.add_subplot. Just create them once, and then plot on them.
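A runnable sketch of the same idea, with made-up data in place of Traj_List: the 2x2 grid is created once with plt.subplots, and every series is then drawn onto the existing axes. The array shapes, the sine data and the series labels are assumptions for illustration only. Passing sharey=True is one possible way to give all panels common y limits, though it may be too coarse if different states need different ranges.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 3 series, each holding 4 state variables sampled over time
t = np.linspace(0, 1, 50)
n_series, n_states = 3, 4
series = [
    np.array([np.sin((k + 1) * t + s) for k in range(n_states)])
    for s in range(n_series)
]

# Create the grid of axes once, before looping over the series
fig, axs = plt.subplots(2, 2, sharex=True, sharey=True)

for i, states in enumerate(series):       # outer loop: one pass per series
    for j, ax in enumerate(axs.flat):     # inner loop: one state per panel
        ax.plot(t, states[j], label=f"series {i}")

# Add the legends after all series are on the axes
for ax in axs.flat:
    ax.legend(loc="best")

# Each panel should now hold one line per series
lines_per_axes = [len(ax.get_lines()) for ax in axs.flat]
```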

How to plot Venn diagram in python for two sets when one set is a subset of another?

I use the following code to plot Venn diagrams. The problem is when one set is a subset of the other the code doesn't work (see the figure). How can I change the following code to make it work when one set is a subset of another? In this case, I expect the red circle to be inside the green circle (the color then probably should be the same as the color of the overlapped area instead of red).
sets = Counter()
sets['01'] = 10
sets['11'] = 5
sets['10'] = 5
setLabels = ['Set 1', 'set 2']
plt.figure()
ax = plt.gca()
v = venn2(subsets = sets, set_labels = setLabels, ax = ax)
h, l = [],[]
for i in sets:
    # remove label by setting them to empty string:
    v.get_label_by_id(i).set_text("")
    # append patch to handles list
    h.append(v.get_patch_by_id(i))
    # append count to labels list
    l.append(sets[i])
#create legend from handles and labels
ax.legend(handles=h, labels=l, title="Numbers")
plt.title("venn_test")
plt.savefig("test_venn.png")
pdb.set_trace()
You can define sets['10'] = 0, to make the red part (set 1 without set 2) empty. To prevent that empty set from showing up in the legend, slice the handles and labels in the legend call accordingly: ax.legend(handles=h[0:2], labels=l[0:2], title="Numbers")
So change the code to this:
import pdb
from collections import Counter
import matplotlib.pyplot as plt
from matplotlib_venn import venn2

sets = Counter()
sets['01'] = 10
sets['11'] = 5
sets['10'] = 0  # set 1 without set 2 is empty
setLabels = ['Set 1', 'set 2']
plt.figure()
ax = plt.gca()
v = venn2(subsets = sets, set_labels = setLabels, ax = ax)
h, l = [],[]
for i in sets:
    # remove label by setting them to empty string:
    v.get_label_by_id(i).set_text("")
    # append patch to handles list
    h.append(v.get_patch_by_id(i))
    # append count to labels list
    l.append(sets[i])
#create legend from handles and labels, without the empty part
ax.legend(handles=h[0:2], labels=l[0:2], title="Numbers")
plt.title("venn_test")
plt.savefig("test_venn.png")
pdb.set_trace()

matplotlib subplot of 1:2:1

I'm trying to plot an A4 PDF with the following layout:
1 chart spanning 2 columns
2 charts, each spanning 1 column
1 chart spanning 2 columns
I have the following code:
fig = plt.figure(figsize=(8.27,11.69))
ax = fig.add_subplot(311)
ax = fig.add_subplot(323)
ax = fig.add_subplot(324)
ax = fig.add_subplot(315)
but I'm getting the following error:
ValueError: num must be 1 <= num <= 3, not 5
What am I missing?
This is the correct syntax. In a 3x1 grid the positions run from 1 to 3, so the last chart is 313, not 315:
fig = plt.figure()
ax = fig.add_subplot(311)
ax = fig.add_subplot(323)
ax = fig.add_subplot(324)
ax = fig.add_subplot(313)
However, for this kind of layout, you will gain in readability if you use GridSpec (https://matplotlib.org/3.1.1/tutorials/intermediate/gridspec.html). The following code yields the exact same output, but is (at least to me) easier to understand:
fig = plt.figure()
gs = fig.add_gridspec(3, 2)
fig.add_subplot(gs[0, :])
fig.add_subplot(gs[1, 0])
fig.add_subplot(gs[1, 1])
fig.add_subplot(gs[2, :])
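To make the numbering rule concrete, here is a small sketch, assuming a headless Agg backend. It shows that a three-digit code is read as rows, columns, position, so 315 asks for position 5 in a grid that only has 3, and that the GridSpec version produces the 1:2:1 layout with four axes. The variable names top, left, right and bottom are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8.27, 11.69))  # A4 portrait, in inches

# 315 means: 3 rows, 1 column, position 5. A 3x1 grid has positions 1..3 only.
try:
    fig.add_subplot(315)
    raised = False
except ValueError:
    raised = True

# Same layout expressed with GridSpec: 3 rows, 2 columns
gs = fig.add_gridspec(3, 2)
top = fig.add_subplot(gs[0, :])     # row 0, spanning both columns
left = fig.add_subplot(gs[1, 0])    # row 1, first column
right = fig.add_subplot(gs[1, 1])   # row 1, second column
bottom = fig.add_subplot(gs[2, :])  # row 2, spanning both columns

n_axes = len(fig.axes)
top_is_wider = top.get_position().width > left.get_position().width
```

With the mixed codes 311, 323, 324, 313, each call describes its own grid (3x1 or 3x2), which is why the position number must be valid for the grid named in that same call.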

How to plot 4-D data embedded in a dataframe in Julia using a subplots approach?

I have a Julia DataFrame where the first 4 columns are dimensions and the 5th one contains the actual data.
I would like to plot it using a subplots approach where the two main plot axis concern the first two dimensions and each subplot then is a contour plot over the remaining two dimensions.
I am almost there with the following code:
using DataFrames,Plots
# plotlyjs() # doesn't work with plotlyjs backend
pyplot()
X = [1,2,3,4]
Y = [0.1,0.15,0.2]
I = [2,4,6,8,10,12,14]
J = [10,20,30,40,50,60]
df = DataFrame(X=Int64[], Y=Float64[], I=Float64[], J=Float64[], V=Float64[] )
[push!(df,[x,y,i,j,(5*x+20*y+2)*(0.2*i^2+0.5*j^2+3*i*j+2*i^2*j+1)]) for x in X, y in Y, i in I, j in J]
minvalue = minimum(df[:V])
maxvalue = maximum(df[:V])
function toDict(df, dimCols, valueCol)
    toReturn = Dict()
    for r in eachrow(df)
        keyValues = []
        [push!(keyValues,r[d]) for d in dimCols]
        toReturn[(keyValues...)] = r[valueCol]
    end
    return toReturn
end
dict = toDict(df, [:X,:Y,:I,:J], :V )
M = [dict[(x,y,i,j)] for j in J, i in I, y in Y, x in X ]
yL = length(Y)
xL = length(X)
plot(contour(M[:,:,3,1], ylabel="y = $(string(Y[3]))", zlims=(minvalue,maxvalue)), contour(M[:,:,3,2]), contour(M[:,:,3,3]), contour(M[:,:,3,4]),
contour(M[:,:,2,1], ylabel="y = $(string(Y[2]))", zlims=(minvalue,maxvalue)), contour(M[:,:,2,2]), contour(M[:,:,2,3]), contour(M[:,:,2,4]),
contour(M[:,:,1,1], ylabel="y = $(string(Y[1]))", xlabel="x = $(string(X[1]))"), contour(M[:,:,1,2], xlabel="x = $(string(X[2]))"), contour(M[:,:,1,3], xlabel="x = $(string(X[3]))"), contour(M[:,:,3,4], xlabel="x = $(string(X[4]))"),
layout=(yL,xL) )
However, I still have the following concerns:
How do I automate the creation of each subplot in the plot call? Do I need to write a macro?
I would like each subplot to have the same limits on the z axis, but zlims seems not to work. Is zlims not yet supported?
How do I hide the legend for the z axis on each subplot and plot it separately instead (ideally on the right side of the overall plot)?
EDIT:
For the first point I don't need a macro: I can create the subplots in a for loop, add them to an array, and pass the array to the plot() call using the splat operator (...):
plots = []
for y in length(Y):-1:1
    for x in 1:length(X)
        xlabel = y == 1 ? "x = $(string(X[x]))" : ""
        ylabel = x == 1 ? "y = $(string(Y[y]))" : ""
        println("$y - $x")
        # named p so that it does not shadow the plot() function
        p = contour(I,J,M[:,:,y,x], xlabel=xlabel, ylabel=ylabel, zlims=(minvalue,maxvalue))
        push!(plots,p)
    end
end
plot(plots..., layout=(yL,xL))

matplotlib scatter plot using axes object in loop

I am having trouble using Matplotlib to plot multiple series in a loop (Matplotlib 1.0.0, Python 2.6.5, ArcGIS 10.0). Forum research pointed me to application of an Axes object, in order to plot multiple series on the same plot. I see how this works well for data generated outside of a loop (sample scripts), but when I insert the same syntax and add the second series into my loop that pulls data from database, I get the following error:
": unsupported operand type(s) for -: 'NoneType' and 'NoneType' Failed to execute (ChartAge8)."
Below is my code - any suggestions or comments are much appreciated!
import arcpy
import os
import matplotlib
import matplotlib.pyplot as plt
#Variables
FC = arcpy.GetParameterAsText(0) #feature class
P1_fld = arcpy.GetParameterAsText(1) #score field to chart
P2_fld = arcpy.GetParameterAsText(2) #score field to chart
plt.subplots_adjust(hspace=0.4)
nsubp = int(arcpy.GetCount_management(FC).getOutput(0)) #pulls n subplots from FC
last_val = object()
#Sub-plot loop
cur = arcpy.SearchCursor(FC, "", "", P1_fld)
i = 0
x1 = 1 # category 1 locator along x-axis
x2 = 2 # category 2 locator along x-axis
fig = plt.figure()
for row in cur:
    y1 = row.getValue(P1_fld)
    y2 = row.getValue(P2_fld)
    i += 1
    ax1 = fig.add_subplot(nsubp, 1, i)
    ax1.scatter(x1, y1, s=10, c='b', marker="s")
    ax1.scatter(x2, y2, s=10, c='r', marker="o")
del row, cur
#Save plot to pdf, open
figPDf = r"path.pdf"
plt.savefig(figPDf)
os.startfile("path.pdf")
If you want to plot several things on the same figure, create the figure object outside the loop and then plot to that same object on every iteration, something like this:
fig = plt.figure()
for row in cur:
    y1 = row.getValue(P1_fld)
    y2 = row.getValue(P2_fld)
    i += 1
    ax1 = fig.add_subplot(nsubp, 1, i)
    ax1.scatter(x1, y1, s=10, c='b', marker="s")
    ax1.scatter(x2, y2, s=10, c='r', marker="o")
del row, cur
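Since the arcpy cursor cannot be reproduced here, the following is only a sketch of the same pattern with a hypothetical list of (score 1, score 2) tuples standing in for the database rows. The figure is created once, each iteration adds one subplot to it, and both categories are scattered on that subplot. The rows values and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical rows standing in for the arcpy cursor: (score 1, score 2)
rows = [(3.0, 4.5), (2.0, 1.5), (5.0, 2.5)]
x1, x2 = 1, 2  # category locators along the x-axis

# One figure, created before the loop
fig = plt.figure()
fig.subplots_adjust(hspace=0.4)

for i, (y1, y2) in enumerate(rows, start=1):
    ax = fig.add_subplot(len(rows), 1, i)  # one subplot per row
    ax.scatter(x1, y1, s=10, c="b", marker="s")
    ax.scatter(x2, y2, s=10, c="r", marker="o")

n_axes = len(fig.axes)
collections_per_axes = [len(ax.collections) for ax in fig.axes]
```

Calling fig.subplots_adjust after the figure exists ties the spacing to this figure rather than to whichever figure pyplot considers current.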