How to plot a rectangle behind a function over time - matplotlib

I'm plotting a timeseries in pandas using matplotlib and I'm trying to color the plot so that it looks like this.
I have the times for the A-F points. I've tried to get the position of them in the plot using
gcf().canvas.mpl_connect('button_press_event', debug_print_onclick_event)
and ended up with x positions being around 22'395'850 (not even close to unixtime :S)
My code basically looks like this:
plot = data.plot(legend=False) #where data is the timeseries (pandas.DataFrame).
plot.add_patch(
plt.Rectangle(
(0,22395760),
60,
45,
facecolor='green',
edgecolor='green'
)
)
plt.draw()
plt.show()
But nothing of the patch shows up.
I also tried using times directly; the code ran, but no patch was rendered.
plt.Rectangle(
(0,datetime_D),
60,
4*pandas.datetools.Minutes(15),
facecolor='green',
edgecolor='green'
)
What is the underlying type? How should I position things in time in matplotlib? Any ugly hack that works is appreciated.

You seem to have swapped x and y in the first argument of Rectangle((x,y), ...). It should be:
Rectangle((22395760, 0), ...)
Instead of using a patch, plot.axvspan() seems a better match for what you want to do.
plt.gca().axvspan(date,date+2*pandas.datetools.Minute(15),facecolor='green',edgecolor='green',alpha=0.3)
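For readers on a recent pandas, where pandas.datetools is no longer available, here is a minimal sketch of the same axvspan idea using pd.Timestamp and pd.Timedelta. The data, the timestamps and the variable names are made up for illustration. The series is drawn with ax.plot rather than data.plot() so that the x axis uses matplotlib's own date units; pandas may pick its own x-axis units for regularly sampled series, which could be what produced the large click positions in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: one day of values sampled every 15 minutes.
index = pd.date_range("2013-01-01", periods=96, freq="15min")
data = pd.DataFrame({"value": np.sin(np.linspace(0, 6, 96))}, index=index)

fig, ax = plt.subplots()
ax.plot(data.index, data["value"])

# Shade a 30-minute window behind the curve, starting at a chosen timestamp.
start = pd.Timestamp("2013-01-01 06:00")
end = start + pd.Timedelta(minutes=30)
span = ax.axvspan(start, end, facecolor="green", edgecolor="green", alpha=0.3)

fig.canvas.draw()  # force a render; use plt.show() in an interactive session
```

axvspan covers the full height of the axes, so only the two x positions need to be given in time units.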

Matplotlib/Seaborn: Boxplot collapses on x axis

I am creating a series of boxplots in order to compare different cancer types with each other (based on 5 categories). For plotting I use seaborn/matplotlib. It works fine for most of the cancer types (see image right); however, in some the x axis collapses slightly (see image left) or strongly (see image middle).
https://i.imgur.com/dxLR4B4.png
Looking into how seaborn plots a box/violin plot (https://github.com/mwaskom/seaborn/blob/36964d7ffba3683de2117d25f224f8ebef015298/seaborn/categorical.py, line 961):
violin_data = remove_na(group_data[hue_mask])
I realized that this happens when there are too many NaNs.
Is there any way to prevent this collapsing in code only?
I do not want to modify my dataframe (e.g. replace the NaNs with zero).
Below you find my code:
boxp_df=pd.read_csv(pf_in,sep="\t",skip_blank_lines=False)
fig, ax = plt.subplots(figsize=(10, 10))
sns.violinplot(data=boxp_df, ax=ax)
plt.xticks(rotation=-45)
plt.ylabel("label")
plt.tight_layout()
plt.savefig(pf_out)
The output is a differently sized plot for each cancer type
(depending on whether any category is completely NaN).
I am expecting each plot to have the same width.
Update
Trying to use the order parameter as suggested leads to the following output:
https://i.imgur.com/uSm13Qw.png
Maybe this toy example helps?
|Cat1|Cat2|Cat3|Cat4|Cat5
|3.93| |0.52| |6.01
|3.34| |0.89| |2.89
|3.39| |1.96| |4.63
|1.59| |3.66| |3.75
|2.73| |0.39| |2.87
|0.08| |1.25| |-0.27
Update
Apparently, the problem is not the data but the length of the title:
https://github.com/matplotlib/matplotlib/issues/4413
Therefore I would close the question.
@Diziet should I delete it, or might my issue help others?
Sorry for not including the line below in the code example:
ax.set_title("VERY LONG TITLE", fontsize=20)
It's hard to be sure without data to test it with, but I think you can pass the names of your categories/cancers to the order= parameter. This forces seaborn to use/display those, even if they are empty.
for instance:
tips = sns.load_dataset("tips")
ax = sns.violinplot(x="day", y="total_bill", data=tips, order=['Thur','Fri','Sat','Freedom Day','Sun','Durin\'s Day'])
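Regarding the long-title cause mentioned in the update: one possible workaround, sketched here under assumptions, is to wrap the title onto several lines before calling tight_layout, so the title never becomes wider than the axes. The title text and the wrap width of 40 characters are arbitrary illustrative choices, and whether this avoids the collapse may depend on the matplotlib version.

```python
import textwrap

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical over-long title.
long_title = "VERY LONG TITLE " * 8

# Break the title into lines of at most 40 characters.
wrapped = "\n".join(textwrap.wrap(long_title, width=40))

fig, ax = plt.subplots(figsize=(10, 10))
ax.set_title(wrapped, fontsize=20)
fig.tight_layout()
```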

colorbars for grid of line (not contour) plots in matplotlib

I'm having trouble giving colorbars to a grid of line plots in Matplotlib.
I have a grid of plots, which each shows 64 lines. The lines depict the penalty value vs time when optimizing the same system under 64 different values of a certain hyperparameter h.
Since there are so many lines, instead of using a standard legend, I'd like to use a colorbar, and color the lines by the value of h. In other words, I'd like something that looks like this:
The above was done by adding a new axis to hold the colorbar, by calling figure.add_axes([0.95, 0.2, 0.02, 0.6]), passing in the axis position explicitly as parameters to that method. The colorbar was then created as in the example code here, by instantiating a ColorbarBase(). That's fine for single plots, but I'd like to make a grid of plots like the one above.
To do this, I tried doubling the number of subplots, and using every other subplot axis for the colorbar. Unfortunately, this led to the colorbars having the same size/shape as the plots:
Is there a way to shrink just the colorbar subplots in a grid of subplots like the 1x2 grid above?
Ideally, it'd be great if the colorbar just shared the same axis as the line plot it describes. I saw that the colorbar.colorbar() function has an ax parameter:
ax
parent axes object from which space for a new colorbar axes will be stolen.
That sounds great, except that colorbar.colorbar() requires you to pass in an imshow image or a ContourSet, but my plot is neither an image nor a contour plot. Can I achieve the same (axis-sharing) effect using ColorbarBase?
It turns out you can have different-shaped subplots, so long as all the plots in a given row have the same height, and all the plots in a given column have the same width.
You can do this using gridspec.GridSpec, as described in this answer.
So I set the columns with line plots to be 20x wider than the columns with color bars. The code looks like:
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import cm, colorbar, gridspec

# num_rows, num_columns, x_vec_lists, y_vec_lists and color_hyperparam_vecs
# are defined elsewhere, with one entry per line plot.
grid_spec = gridspec.GridSpec(num_rows,
                              num_columns * 2,
                              width_ratios=[20, 1] * num_columns)
colormap_type = cm.cool
for (x_vec_list,
     y_vec_list,
     color_hyperparam_vec,
     plot_index) in zip(x_vec_lists,
                        y_vec_lists,
                        color_hyperparam_vecs,
                        range(len(x_vec_lists))):
    # Even grid cells hold the line plots, odd cells hold the colorbars.
    line_axis = plt.subplot(grid_spec[plot_index * 2])
    colorbar_axis = plt.subplot(grid_spec[plot_index * 2 + 1])
    colormap_normalizer = mpl.colors.Normalize(vmin=color_hyperparam_vec.min(),
                                               vmax=color_hyperparam_vec.max())
    scalar_to_color_map = mpl.cm.ScalarMappable(norm=colormap_normalizer,
                                                cmap=colormap_type)
    colorbar.ColorbarBase(colorbar_axis,
                          cmap=colormap_type,
                          norm=colormap_normalizer)
    # Draw each line in the color that corresponds to its hyperparameter.
    for (line_index,
         x_vec,
         y_vec) in zip(range(len(x_vec_list)),
                       x_vec_list,
                       y_vec_list):
        hyperparam = color_hyperparam_vec[line_index]
        line_color = scalar_to_color_map.to_rgba(hyperparam)
        line_axis.plot(x_vec, y_vec, color=line_color, alpha=0.5)
For num_rows=1 and num_columns=1, this looks like:
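On the axis-sharing part of the question: in reasonably recent matplotlib releases, Figure.colorbar accepts a bare ScalarMappable together with ax=..., so the colorbar steals space from the line plot it describes and no separate grid column is needed. The sketch below assumes such a release and uses made-up curves; h_values, t and the exponential decay are placeholders for the real penalty data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm, colors

# Hypothetical data: 64 penalty curves, one per hyperparameter value h.
h_values = np.linspace(0.1, 1.0, 64)
t = np.linspace(0, 10, 50)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
norm = colors.Normalize(vmin=h_values.min(), vmax=h_values.max())
mappable = cm.ScalarMappable(norm=norm, cmap="cool")

for ax in axes:
    for h in h_values:
        # Color each line by its hyperparameter value.
        ax.plot(t, np.exp(-h * t), color=mappable.to_rgba(h), alpha=0.5)
    # Space for the colorbar is taken from this axes only.
    fig.colorbar(mappable, ax=ax, label="h")
```

Each call to fig.colorbar adds one narrow axes next to its parent, so a 1x2 grid ends up with four axes in total.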

matplotlib/pyplot: print only ticks once in scatter plot?

I am looking for a way to clean-up the ticks in my pyplot scatter plot.
To create a scatter plot from a Pandas dataset column with strings as elements, I followed the example in [2] and got a nice scatter plot:
The input is 10k data points where the x axis has only ~200 unique 'names', which were mapped to scalars for plotting. Obviously, plotting all 10k ticks on the x axis is a bit cluttered. So I am looking for a way to print each unique tick only once, and not once for each data point.
My code looks like:
fig2 = plt.figure()
WNsUniques, WNs = numpy.unique(taskDataFrame['modificationhost'], return_inverse=True)
scatterWNs = fig2.add_subplot(111)
scatterWNs.scatter(WNs, taskDataFrame['cpuconsumptiontime'])
scatterWNs.set(xticks=range(len(WNsUniques)), xticklabels=WNsUniques)
plt.xticks(rotation='vertical')
plt.savefig("%s_WNs-CPUTime_scatter.%s" % (dfName,"pdf"))
Actually, I was hoping that setting the plot's x ticks to the unique names would be sufficient, but apparently not? It is probably something easy, but how do I reduce the ticks for my subplot to the unique ones (should they not already be unique, as returned by numpy.unique)?
Maybe someone has an idea for me?
Cheers and thanks,
Thomas
You can use the set_xticks method to accomplish this. Note that 200 axis ticks with labels are still quite a lot to force on a small plot like this, and this is what you might already be seeing with the above code. Without complete code to play with, I can't say for sure.
Additionally, what is the size of WNsUniques? That can easily be used to check if your call to unique is doing what you think.
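A minimal sketch of both suggestions, using a made-up stand-in for taskDataFrame (the frame, the host names and the step of 5 are assumptions for illustration): it checks the size of the unique array, then labels only every fifth unique name so the axis stays readable.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for taskDataFrame: 1000 rows, 20 distinct host names.
hosts = ["node%02d" % (i % 20) for i in range(1000)]
rng = np.random.default_rng(0)
frame = pd.DataFrame({"modificationhost": hosts,
                      "cpuconsumptiontime": rng.random(1000)})

# unique_names holds each name once; codes maps every row to its name index.
unique_names, codes = np.unique(frame["modificationhost"], return_inverse=True)
print(len(unique_names), "unique names for", len(frame), "rows")

fig, ax = plt.subplots()
ax.scatter(codes, frame["cpuconsumptiontime"])

# Label only every 5th unique name instead of all of them.
step = 5
positions = np.arange(0, len(unique_names), step)
ax.set_xticks(positions)
ax.set_xticklabels(unique_names[positions], rotation="vertical")
```

With ~200 names, a larger step or a wider figure would probably be needed before the labels stop overlapping.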

how to shift x axis labels on line plot?

I'm using pandas to work with a data set and am trying to use a simple line plot with error bars to show the end results. It's all working great except that the plot looks funny.
By default, it will put my 2 data groups at the far left and right of the plot, which obscures the error bars to the point that they're not useful (the error bars in this case are key to interpretation, so I want them plainly visible).
Now, I can fix that problem by setting xlim to open up some space on either end of the x axis so that the error bars are plainly visible, but then there is an offset between where the x labels are and where the actual x data is.
Here is a simplified example that shows the problem:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df6 = pd.DataFrame( [-0.07,0.08] , index = ['A','B'])
df6.plot(kind='line', linewidth=2, yerr = [ [0.1,0.1],[0.1,0.1 ] ], elinewidth=2,ecolor='green')
plt.xlim(-0.2,1.2) # Make some room at ends to see error bars
plt.show()
I tried to include a plot (image) showing the problem, but I cannot post images yet, having just joined up and not having enough points.
What I want to know is: How do I shift these labels over one tick to the right?
Thanks in advance.
Well, it turns out I found a solution, which I will just post here in case anyone else has this same issue in the future.
Basically, it all seems to work better in the case of a line plot if you just specify both the labels and the ticks in the same place at the same time. At least that was helpful for me. It sort of forces you to keep the length of those two lists the same, which seems to make the assignment between ticks and labels more well behaved (simple 1:1 in this case).
So I could fix my problem by including something like this:
plt.xticks([0, 1], ['A','B'] )
right after the xlim statement in the code from the original question. Now the A and B align perfectly with the place where the data is plotted, not offset from it.
The above solution works, but is less good-looking since the x grid is now very coarse (this is purely an aesthetic consideration). I could fix that by using a different xticks statement with one label per tick, like:
plt.xticks([-0.2, 0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2], ['','A','','','','','B',''])
This gives me nice looking grid and the data where I need it, but of course is very contrived-looking here. In the actual program I'd find a way to make that less clunky.
Hope that is of some help to fellow seekers....
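One possibly less clunky variant of the same idea, sketched here with plain matplotlib rather than DataFrame.plot (an assumption made to keep the example self-contained): put labelled major ticks exactly at the data positions and use unlabelled minor ticks for the finer grid.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

labels = ["A", "B"]
values = [-0.07, 0.08]
errors = [0.1, 0.1]
positions = list(range(len(labels)))  # A sits at x=0, B at x=1

fig, ax = plt.subplots()
ax.errorbar(positions, values, yerr=errors, linewidth=2,
            elinewidth=2, ecolor="green")
ax.set_xlim(-0.2, 1.2)       # room at both ends to see the error bars
ax.set_xticks(positions)     # major ticks exactly where the data is
ax.set_xticklabels(labels)   # one label per major tick

# Unlabelled minor ticks give a finer grid without dummy labels.
ax.set_xticks([0.2, 0.4, 0.6, 0.8], minor=True)
ax.grid(which="both")
```

Keeping ticks and labels in lists of equal length preserves the simple 1:1 assignment described above.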

matplotlib: working with range in x-axis

I'm trying to do a basic line graph here, but I can't seem to figure out how to adjust my x axis.
And here is the error I get when I try adjusting my range.
from pylab import *
plot ( range(0,11),[9,4,5,2,3,5,7,12,2,3],'.-',label='sample1' )
plot ( range(0,11),[12,5,33,2,4,5,3,3,22,10],'o-',label='sample2' )
xlabel('x axis')
ylabel('y axis')
title('my sample graphs')
legend(('sample1','sample2'))
savefig("sampleg.png",dpi=(640/8))
show()
File "C:\Python26\lib\site-packages\matplotlib\axes.py", line 228, in _xy_from_xy
raise ValueError("x and y must have same first dimension")
ValueError: x and y must have same first dimension
I want my range to be a list of strings: ["12/1/2007","12/1/2008", "12/1/2009","12/1/2010"]
Any suggestions?
Honestly, I found the code online and was trying to rewrite it to properly understand it. I think I'm going to start from scratch so that I know what I'm doing but I need help on where to start.
I posted another question which explains what I want to do here:
Using PyLab to create a 2D graph from two separate lists
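range(0,11) should be range(0,10). range(0,11) yields 11 x values (0 through 10), but each y list has only 10 entries, hence the ValueError.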
In addition to Steve's observation: If your points are always some y-value at the same consecutive integer x's, matplotlib makes the range even implicit.
plot([9,4,5,2,3,5,7,12,2,3],'.-',label='sample1')
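Putting both answers together, a short sketch that derives x from the length of the data so the two can never disagree. The second half shows one way the date strings from the question could be used as tick labels; since the question does not say how its four dates map onto the ten points, the yearly values there are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

sample1 = [9, 4, 5, 2, 3, 5, 7, 12, 2, 3]
sample2 = [12, 5, 33, 2, 4, 5, 3, 3, 22, 10]
x = list(range(len(sample1)))  # 10 x values for 10 y values

fig, ax = plt.subplots()
ax.plot(x, sample1, ".-", label="sample1")
ax.plot(x, sample2, "o-", label="sample2")
ax.set_xlabel("x axis")
ax.set_ylabel("y axis")
ax.legend()

# Strings as x labels: plot against integer positions, then relabel the ticks.
dates = ["12/1/2007", "12/1/2008", "12/1/2009", "12/1/2010"]
yearly = [9, 4, 5, 2]  # hypothetical values, one per date
date_positions = list(range(len(dates)))

fig2, ax2 = plt.subplots()
ax2.plot(date_positions, yearly, ".-")
ax2.set_xticks(date_positions)
ax2.set_xticklabels(dates)
```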