bar plot - annotate the bars with some values - pandas

Immediately after creating a bar plot using pandas dataframe.plot function, I am trying to annotate the bars with some values that I have in a list. I have put the annotate command in a for loop. But, as soon as I run this piece of code, my ipython notebook stops working and it crashes.
When I remove the annotation part, the bar plot works fine. What could be the reason for this?
req_index = df.index[~df.index.isin(['99'])]
ax = df.ix[req_index,'count'].plot(kind="barh", figsize=(24,20), \
    color = (0.10588235294117647, 0.6196078431372549, 0.4666666666666667))
y_values = ax.get_yticks().astype('int')
for i,indx in enumerate(req_index):
    label_text = str(round(df.ix[indx,'percentage'], 4))
    print label_text
    x = df.ix[indx,'count']
    y = y_values[i]
    ax.annotate(label_text, xy = (x + 70000,y-3), size = 20)
    #break
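For reference, here is a minimal sketch of one way to label horizontal bars. It does not diagnose the crash described above. It only avoids hard-coded data-unit offsets such as x + 70000 by reading each label position from the bar patch itself and nudging the text by a few points. The DataFrame contents are hypothetical; only the column names count and percentage come from the question, and .loc-style access is assumed instead of the deprecated .ix.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook
import pandas as pd

# Hypothetical data; only the column names match the question
df = pd.DataFrame(
    {"count": [120, 80, 45], "percentage": [0.4898, 0.3265, 0.1837]},
    index=["a", "b", "c"],
)
ax = df["count"].plot(kind="barh")

labels = []
for patch, pct in zip(ax.patches, df["percentage"]):
    text = str(round(pct, 4))
    # Anchor the label at the end of the bar, in data coordinates
    x = patch.get_width()
    y = patch.get_y() + patch.get_height() / 2
    # Shift the text 5 points to the right instead of adding a data-unit offset
    labels.append(ax.annotate(text, xy=(x, y), xytext=(5, 0),
                              textcoords="offset points", va="center"))
```

Because the offset is given in points rather than data units, the labels stay next to their bars whatever the scale of the count column.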

Related

Removing matplotlib plot figure from tkinter

I am building a control app, and I use a matplotlib.figure graph to represent some data. I made a function that does the plotting, but the problem is removing the plot. I can't remove the plot from the window; it just stays there no matter what. I've tried using tk_widget.place_forget(), tk_widget.destroy(), and figure_subplot.remove(), but the plot stays there.
def plot_box_1(sx, sy, px, py):
    fig = Figure(figsize=(sx/100, sy/100), dpi=100)
    pl1 = fig.add_subplot(111)
    pl1.plot(y)
    pltwidget = FigureCanvasTkAgg(fig, master=window)
    pltwidget.draw()
    tkplt = pltwidget.get_tk_widget()
    tkplt.place(x=px, y=py)
    return tkplt, pl1
def b5_updater():
    global tab, asdf, current_after
    p1 = page[5]["p1"]
    p2 = page[5]["p2"]
    box_pos = page[5]["box_pos"]
    if tab == 5:
        if asdf == 100:
            asdf = 0
        p1['value'] = asdf
        p2['value'] = 100 - asdf
        y.append(asdf)
        if len(y) >= plot_lim:
            y.pop(0)
        asdf += 1
        page[5]["plt_wid"], page[5]["pl1"] = plot_box_1(250, 180, box_pos[2][0] + 25, box_pos[2][1] + 30)
        current_after = window.after(20, b5_updater)
    else:
        window.after_cancel(current_after) if current_after is not None else None
        p1.destroy()
        p2.destroy()
        page[5]["plt_wid"].place_forget()
        page[5]["pl1"].remove()
        page[5]["plt_wid"].destroy()
        print("Closed B5 Successfully")
I tried .remove, .destroy, and .place_forget on multiple items, including the Canvas in tkinter, but the graph still stayed there.
If you want to completely remove the tkinter widget the following works:
pltwidget.get_tk_widget().destroy()
If you just want to clear the figure, use:
fig.clear()
Both options work for me. You could test them in isolation by trying them just before the return statement of plot_box_1().
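A related observation, offered as a guess rather than a confirmed diagnosis: in the question's code, plot_box_1() builds a new figure and widget on every 20 ms update, so only the most recent widget is ever stored in page[5]["plt_wid"] and destroyed. The sketch below shows the alternative of creating the figure once and reusing it with fig.clear(). FigureCanvasAgg is used here only as a headless stand-in for FigureCanvasTkAgg so the example runs without a Tk window, and redraw is a hypothetical helper name.

```python
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

# Create the figure and canvas once. In the real app this would be
# FigureCanvasTkAgg(fig, master=window); Agg is a headless stand-in.
fig = Figure(figsize=(2.5, 1.8), dpi=100)
canvas = FigureCanvasAgg(fig)

def redraw(values):
    """Clear the existing figure and plot the new values on it."""
    fig.clear()                  # drops all axes and artists from the figure
    ax = fig.add_subplot(111)
    ax.plot(values)
    canvas.draw()
    return ax

redraw([0, 1, 2])
redraw([3, 4, 5, 6])
axes_after_redraw = len(fig.axes)   # a single axes, however often we redraw

fig.clear()
axes_after_clear = len(fig.axes)    # nothing left on the figure
```

With this layout there is only one widget to destroy when the tab is closed.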

How to use hover events in mpl_connect in matplotlib

I'm working on line plotting a metric for a course module as well as each of its questions within a Jupyter Notebook using %matplotlib notebook. That part is no problem. A module has typically 20-35 questions, so it results in a lot of lines on a chart. Therefore, I am plotting the metric for each question in a low alpha and I want to change the alpha and display the question name when I hover over the line, then reverse those when no longer hovering over the line.
The thing is, I've tried every test version of interactivity from the matplotlib documentation on event handling, as well as those in this question. It seems like the mpl_connect event is never firing, whether I use click or hover.
Here's a test version with a reduced dataset using the solution to the question linked above. Am I missing something necessary to get events to fire?
def update_annot(ind):
    x,y = line.get_data()
    annot.xy = (x[ind["ind"][0]], y[ind["ind"][0]])
    text = "{}, {}".format(" ".join(list(map(str,ind["ind"]))),
                           " ".join([names[n] for n in ind["ind"]]))
    annot.set_text(text)
    annot.get_bbox_patch().set_alpha(0.4)
def hover(event):
    vis = annot.get_visible()
    if event.inaxes == ax:
        cont, ind = line.contains(event)
        if cont:
            update_annot(ind)
            annot.set_visible(True)
            fig.canvas.draw_idle()
        else:
            if vis:
                annot.set_visible(False)
                fig.canvas.draw_idle()
module = 'bd2bc472-ee0d-466f-8557-788cc6de3018'
module_metrics[module] = {
'q_count': 31,
'sequence_pks': [0.5274546300604932,0.5262044653349001,0.5360993905297703,0.5292329279700655,0.5268691588785047,0.5319099014547161,0.5305164319248826,0.5268235294117647,0.573648805381582,0.5647933116581514,0.5669839795681448,0.5646591970121382,0.5663157894736842,0.5646976090014064,0.5659005628517824,0.5693634879925391,0.5728268468888371,0.5668834184858337,0.5687237026647967,0.5795640965549567,0.5877684407096172,0.585690904839841,0.5766899766899767,0.5971341320178529,0.6059972105997211,0.6055516678329834,0.6209865053513262,0.6203121360354065,0.6153666510976179,0.6236909471724459,0.6387654898293196],
'q_pks': {
'0da04f02-4aad-4ac8-91a5-214862b5c0d0': [0.6686046511627907,0.6282051282051282,0.76,0.6746987951807228,0.7092198581560284,0.71875,0.6585365853658537,0.7070063694267515,0.7171052631578947,0.7346938775510204,0.7737226277372263,0.7380952380952381,0.6774193548387096,0.7142857142857143,0.7,0.6962962962962963,0.723404255319149,0.6737588652482269,0.7232704402515723,0.7142857142857143,0.7164179104477612,0.7317073170731707,0.6333333333333333,0.75,0.7217391304347827,0.7017543859649122,0.7333333333333333,0.7641509433962265,0.6869565217391305,0.75,0.794392523364486],
'10bd29aa-3a26-49e6-bc2c-50fd503d7ab5': [0.64375,0.6014492753623188,0.5968992248062015,0.5059523809523809,0.5637583892617449,0.5389221556886228,0.5576923076923077,0.51875,0.4931506849315068,0.5579710144927537,0.577922077922078,0.5467625899280576,0.5362318840579711,0.6095890410958904,0.5793103448275863,0.5159235668789809,0.6196319018404908,0.6143790849673203,0.5035971223021583,0.5897435897435898,0.5857142857142857,0.5851851851851851,0.6164383561643836,0.6054421768707483,0.5714285714285714,0.627906976744186,0.5826771653543307,0.6504065040650406,0.5864661654135338,0.6333333333333333,0.6851851851851852]
}}
suptitle_size = 24
title_size = 18
tick_size = 12
axis_label_size = 15
legend_size = 14
fig, ax = plt.subplots(figsize=(15,8))
fig.suptitle('PK by Sequence Order', fontsize=suptitle_size)
module_name = 'Test'
q_count = module_metrics[module]['q_count']
y_ticks = [0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]
x_ticks = np.array([x for x in range(0,q_count)])
x_labels = x_ticks + 1
# Plot it
ax.set_title(module_name, fontsize=title_size)
ax.set_xticks(x_ticks)
ax.set_yticks(y_ticks)
ax.set_xticklabels(x_labels, fontsize=tick_size)
ax.set_yticklabels(y_ticks, fontsize=tick_size)
ax.set_xlabel('Sequence', fontsize=axis_label_size)
ax.set_xlim(-0.5,q_count-0.5)
ax.set_ylim(0,1)
ax.grid(which='major',axis='y')
# Output module PK by sequence
ax.plot(module_metrics[module]['sequence_pks'])
# Output PK by sequence for each question
for qid in module_metrics[module]['q_pks']:
    ax.plot(module_metrics[module]['q_pks'][qid], alpha=0.15, label=qid)
annot = ax.annotate("", xy=(0,0), xytext=(-20,20),textcoords="offset points", bbox=dict(boxstyle="round", fc="w"), arrowprops=dict(arrowstyle="->"))
annot.set_visible(False)
mpl_id = fig.canvas.mpl_connect('motion_notify_event', hover)
Since there are dozens of modules, I created an ipywidgets dropdown to select the module, which then runs a function to output the chart. Nonetheless, whether running it hardcoded as here or from within the function, mpl_connect never seems to fire.
Here's what this one looks like when run
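One thing visible in the snippet itself: hover() and update_annot() refer to line and names, which are never defined in the code shown, so the handler may be raising an error rather than never firing. That is only a guess. The sketch below is a hedged, headless illustration of the intended behaviour (raise the alpha of the hovered line and show its label), not a fix for the notebook backend. The q_pks values are made up, and the mouse event is constructed by hand purely so the example can run without a GUI.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in a notebook you would keep %matplotlib notebook
import matplotlib.pyplot as plt
from matplotlib.backend_bases import MouseEvent

fig, ax = plt.subplots()
# Hypothetical stand-in for module_metrics[module]['q_pks']
q_pks = {"q1": [0.2, 0.4, 0.6], "q2": [0.8, 0.7, 0.5]}
lines = [ax.plot(vals, alpha=0.15, label=qid)[0] for qid, vals in q_pks.items()]
annot = ax.annotate("", xy=(0, 0), xytext=(-20, 20), textcoords="offset points")
annot.set_visible(False)

def hover(event):
    """Highlight the line under the cursor and show its label."""
    inside = event.inaxes == ax
    hit = None
    for line in lines:
        contains = inside and line.contains(event)[0]
        line.set_alpha(1.0 if contains else 0.15)
        if contains:
            hit = line
    annot.set_visible(hit is not None)
    if hit is not None:
        annot.xy = (event.xdata, event.ydata)
        annot.set_text(hit.get_label())
    fig.canvas.draw_idle()

fig.canvas.mpl_connect("motion_notify_event", hover)
fig.canvas.draw()  # make sure the axes limits and transforms are final

# Simulate the cursor resting on the middle point of "q1"
px, py = ax.transData.transform((1, 0.4))
hover(MouseEvent("motion_notify_event", fig.canvas, px, py))
```

Looping over every line in the handler, instead of testing one global line, is what lets each question's curve respond independently.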

Issue when trying to plot geom_tile using ggplotly

I would like to plot a ggplot2 image using ggplotly.
What I am trying to do is to first plot rectangles with a grey fill and no aesthetic mapping, and then, in a second step, plot tiles whose colors are based on aesthetics. My code works when I use ggplot but crashes when I try to use ggplotly to make the graph interactive.
Here is some sample code:
library(ggplot2)
library(data.table)
library(plotly)
library(dplyr)
x = rep(c("1", "2", "3"), 3)
y = rep(c("K", "B","A"), each=3)
z = sample(c(NA,"A","L"), 9,replace = TRUE)
df <- data.table(x,y,z)
p<-ggplot(df)+
  geom_tile(aes(x=x,y=y),width=0.9,height=0.9,fill="grey")
p<-p+geom_tile(data=filter(df,z=="A"),aes(x=x,y=y,fill=z),width=0.9,height=0.9)
p
But when I type this
ggplotly(p)
I get the following error
Error in `[.data.frame`(g, , c("fill_plotlyDomain", "fill")) :
  undefined columns selected
The versions I use are
> packageVersion("plotly")
[1] ‘4.7.1’
> packageVersion("ggplot2")
[1] ‘2.2.1.9000’
##########Edited example for Arthur
p<-ggplot(df)+
  geom_tile(aes(x=x,y=y,fill="G"),width=0.9,height=0.9)
p<- p+geom_tile(data=filter(df,z=="A"),aes(x=x,y=y,fill=z),width=0.9,height=0.9)
p<-p+ scale_fill_manual(
  guide = guide_legend(title = "test",
                       override.aes = list(
                         fill =c("red","white") )
  ),
  values = c("red","grey"),
  labels=c("A",""))
p
This works, but ggplotly(p) adds the grey bar labeled G to the legend.
The output of the ggplotly function is a list with the plotly class. It gets printed as Plotly graph but you can still work with it as a list. Moreover, the documentation indicates that modifying the list makes it possible to clear all or part of the legend. One only has to understand how the data is structured.
p<-ggplot(df)+
  geom_tile(aes(x=x,y=y,fill=z),width=0.9,height=0.9)+
  scale_fill_manual(values = c(L='grey', A='red'), na.value='grey')
p2 <- ggplotly(p)
str(p2)
The global legend setting is in p2$x$layout$showlegend; setting it to FALSE removes the legend entirely.
The group-specific legend setting appears in each of the 9 p2$x$data elements, each with its own showlegend attribute. Only 3 of them are set to TRUE, corresponding to the 3 keys in the legend. The following loop thus clears all the undesired labels:
for(i in seq_along(p2$x$data)){
  if(p2$x$data[[i]]$legendgroup!='A'){
    p2$x$data[[i]]$showlegend <- FALSE
  }
}
Voilà!
This works here:
p <- ggplot(df)+
  geom_tile(aes(x=x,y=y,fill=z),width=0.9,height=0.9)+
  scale_fill_manual(values = c(L='grey', A='red'), na.value='grey')
ggplotly(p)
I guess your problem comes from using 2 different data sources, df and filter(df,z=="A"), that have columns with the same names.
[Note this is not an Answer Yet]
(Putting for reference, as it is beyond the limits for comments.)
The problem is rather complicated.
I just finished debugging the plotly code. The error seems to occur here.
I have opened an issue on GitHub.
Here is minimal code that reproduces the problem.
library(ggplot2)
set.seed(1503)
df <- data.frame(x = rep(1:3, 3),
                 y = rep(1:3, 3),
                 z = sample(c("A","B"), 9,replace = TRUE),
                 stringsAsFactors = F)
p1 <- ggplot(df)+
  geom_tile(aes(x=x,y=y, fill="grey"), color = "black")
p2 <- ggplot(df)+
  geom_tile(aes(x=x,y=y),fill="grey", color = "black")
class(plotly::ggplotly(p1))
#> [1] "plotly" "htmlwidget"
class(plotly::ggplotly(p2))
#> Error in `[.data.frame`(g, , c("fill_plotlyDomain", "fill")): undefined columns selected

Retrieve yerr value from bar object in matplotlib

How can I retrieve a yerr value from an ax.bar object?
A bar chart is created with a single call; each parameter of ax.bar() is a collection, including the yerr value.
bar_list = ax.bar(x_value_list, y_value_list, color=color_list,
                  tick_label=columns, yerr=confid_95_list, align='center')
Later on, I want to be able to retrieve both the y value as well as the yerr value of each individual bar in the chart.
I iterate through the bar_list collection and I can retrieve the y value, but I don't know how to retrieve the yerr value.
Getting the y value looks like this:
for bar in bar_list:
    y_val = bar.get_height()
How can I get the yerr? Is there something like a bar.get_yerr() method? (It isn't bar.get_yerr())
I would like to be able to:
for bar in bar_list:
    y_err = bar.get_yerr()
Note that in the above example confid_95_list is already the list of errors. So there is no need to obtain them from the plot.
To answer the question: In the line for bar in bar_list, bar is a Rectangle and thus has no errorbar associated to it.
However bar_list is a bar container with an attribute errorbar, which contains the return of the errorbar creation. You may then get the individual segments of the line collection. Each line goes from yminus = y - y_error to yplus = y + y_error; the line collection only stores the points yminus, yplus. As an example:
import numpy as np
import matplotlib.pyplot as plt
means = (20, 35)
std = (2, 4)
ind = np.arange(len(means))
p = plt.bar(ind, means, width=0.35, color='#d62728', yerr=std)
lc = [i for i in p.errorbar.get_children() if i is not None][0]
for yerr in lc.get_segments():
    print (yerr[:,1]) # print start and end point
    print (yerr[1,1]- yerr[:,1].mean()) # print error
will print
[ 18. 22.]
2.0
[ 31. 39.]
4.0
So this works well for symmetric errorbars. For asymmetric errorbars, you would additionally need to take the point itself into account.
means = (20, 35)
std = [(2,4),(5,3)]
ind = np.arange(len(means))
p = plt.bar(ind, means, width=0.35, color='#d62728', yerr=std)
lc = [i for i in p.errorbar.get_children() if i is not None][0]
for point, yerr in zip(p, lc.get_segments()):
    print (yerr[:,1]) # print start and end point
    print (yerr[:,1]- point.get_height()) # print error
will print
[ 18. 25.]
[-2. 5.]
[ 31. 38.]
[-4. 3.]
In the end, this seems unnecessarily complicated, because you only retrieve the values that you initially put in (means and std), and you could simply use those values for whatever you want to do.
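A minimal sketch of that simpler route, assuming the errors are symmetric and still available as a list: iterate over the bars and the original error list together. The data values are hypothetical; y_value_list and confid_95_list are the names used in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical data standing in for the question's lists
y_value_list = [20, 35]
confid_95_list = [2, 4]

fig, ax = plt.subplots()
bar_list = ax.bar(range(len(y_value_list)), y_value_list, yerr=confid_95_list)

# Pair each Rectangle with the error that was passed in for it
pairs = [(bar.get_height(), err) for bar, err in zip(bar_list, confid_95_list)]
```

This relies on the bars being returned in the same order as the input lists, which is how the container is iterated in the answer's own examples.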

ggplot plotly API mess width stack bar graph

I am using the plotly library to get an interactive HTML graph from a plot I already generate with ggplot2, but plotly doesn't render the stacked bar graph properly.
Here is my ggplot code:
if(file.exists(filename)) {
  data = read.table(filename,sep=",",header=T)
} else {
  g <- paste0("=== [E] Error : Couldn't Found File : ",filename)
  print (g)
}
ReadChData <- data[data$Channel %in% c("R"),]
#head(ReadChData,10)
# calculate midpoints of bars (simplified using comment by #DWin)
Data <- ddply(ReadChData, .(qos_level),
transform, pos = cumsum(AvgBandwidth) - (0.5 *AvgBandwidth)
)
# library(dplyr) ## If using dplyr...
# Data <- group_by(Data,Year) %>%
# mutate(pos = cumsum(Frequency) - (0.5 * Frequency))
# plot bars and add text
g <- ggplot(Data, aes(x = qos_level, y = AvgBandwidth)) +
  scale_x_continuous(breaks = x_axis_break) +
  geom_bar(aes(fill = MasterID), stat="identity", width=0.2) +
  scale_colour_gradientn(colours = rainbow(7)) +
  geom_text(aes(label = AvgBandwidth, y = pos), size = 3) +
  theme_set(theme_bw()) +
  ylab("Bandwidth (GB/s)") +
  xlab("QoS Level") +
  ggtitle("Qos Compting Stream")
png(paste0(opt$out,"/",GraphName,".png"),width=6*ppi, height=6*ppi, res=ppi)
print (g)
library(plotly)
p <- ggplotly(g)
# the libdir argument will be used to point to a common lib
htmlwidgets::saveWidget(as.widget(p), selfcontained=FALSE, paste0(opt$out,"/qos_competing_stream.html"))
And here is the HTML output from the plotly library:
http://pasteboard.co/2fHQfJwFu.jpg
Please help.
This is perhaps quite late to answer, but for anyone who might have the same issue in the future:
geom_bar's width parameter is not recognized by the ggplotly function.
Workaround:
A workaround (not a very good one) is to use the parameters colour="white", size = 1. This adds a white outline around the bars, which looks like white space between them.
You could try the following:
stat_summary(aes(fill = MasterID), geom="bar", colour="white", size = 1, fun.y = "sum", position = "stack")
Better solution:
Use the bargap parameter of the layout function. The code should be:
ggplotly(type='bar', ...) %>% layout(bargap = 3, autosize=T)
P.S. The code in the question is not executable; it throws an error because of the missing filename.