Plot axvline from Point to Point in Matplotlib Python 3.6 - matplotlib

I am reading data from a simulation out of an Excel file. From this data I generated two DataFrames containing 200 values. Now I want to plot all the values from DataFrame one in blue and all the values from DataFrame two in purple. For that I have the following code:
df = pd.read_excel("###CENSORED####.xlsx", sheetname="Data")
unpatched = df["Unpatched"][:-800]
patched = df["Patched"][:-800]
x = range(0,len(unpatched))
fig = plt.figure(figsize=(10, 5))
plt.scatter(x, unpatched, zorder=10, )
plt.scatter(x, patched, c="purple",zorder=19,)
This results in the following graph:
But now I want to draw some lines that visualize the difference between the blue and purple dots. I thought about an orange line going from the blue dot at simulation run x to the purple dot at simulation run x. Since I'm pretty new to matplotlib, I tried to "cheat" with the following code:
scale_factor = 300
for a in x:
    plt.axvline(a, patched[a]/scale_factor, unpatched[a]/scale_factor, c="orange")
But this resulted in an inaccuracy, as seen below:
So is there a smarter way to do this? I've realized that the axvline documentation says that ymin and ymax can only be scalars. Can I somehow turn my given values into fitting scalars?
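One possible approach, sketched here rather than taken from the original thread: axvline interprets ymin and ymax as fractions of the axes height (0 is the bottom, 1 is the top), which is why dividing by a scale factor only approximates the dot positions. Axes.vlines instead takes ymin and ymax in data coordinates and accepts arrays. The random data below is a stand-in assumption for the Excel columns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumption); the question reads these columns from Excel.
rng = np.random.default_rng(0)
unpatched = rng.uniform(100, 300, size=200)
patched = unpatched - rng.uniform(0, 50, size=200)
x = np.arange(len(unpatched))

fig, ax = plt.subplots(figsize=(10, 5))
# vlines takes ymin/ymax in data coordinates, one pair per x position.
lines = ax.vlines(x, patched, unpatched, colors="orange", zorder=5)
ax.scatter(x, unpatched, zorder=10)
ax.scatter(x, patched, c="purple", zorder=19)
```

Because the lines are drawn in data coordinates, they stay attached to the dots when the axes are rescaled or zoomed.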

Related

How to display all the labels present on x and y axis in matplotlib [duplicate]

I'm playing around with the abalone dataset from UCI's machine learning repository. I want to display a correlation heatmap using matplotlib and imshow.
The first time I tried it, it worked fine. All the numeric variables plotted and labeled, seen here:
fig = plt.figure(figsize=(15,8))
ax1 = fig.add_subplot(111)
plt.imshow(df.corr(), cmap='hot', interpolation='nearest')
plt.colorbar()
labels = df.columns.tolist()
ax1.set_xticklabels(labels,rotation=90, fontsize=10)
ax1.set_yticklabels(labels,fontsize=10)
plt.show()
[Image: successful heatmap]
Later, I used get_dummies() on my categorical variable, like so:
df = pd.get_dummies(df, columns = ['sex'])
[Image: resulting correlation matrix]
So, if I reuse the code from before to generate a nice heatmap, it should be fine, right? Wrong!
[Image: What dumpster fire is this?]
So my question is, where did my labels go, and how do I get them back?!
Thanks!
To get your labels back, you can force matplotlib to use enough xticks so that all your labels can be shown. This can be done by adding
ax1.set_xticks(np.arange(len(labels)))
ax1.set_yticks(np.arange(len(labels)))
before your statements ax1.set_xticklabels(labels,rotation=90, fontsize=10) and ax1.set_yticklabels(labels,fontsize=10).
This results in the following plot:
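Put together, the fix could look like the sketch below. The DataFrame is made-up random data standing in for the abalone table after get_dummies(), so the column names are assumptions. The point it illustrates is that the default locator picks fewer ticks than there are columns, so the tick positions have to be fixed before the labels are assigned.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for the abalone data: 11 numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 11)),
                  columns=[f"col_{i}" for i in range(11)])

fig = plt.figure(figsize=(15, 8))
ax1 = fig.add_subplot(111)
im = ax1.imshow(df.corr(), cmap="hot", interpolation="nearest")
fig.colorbar(im)
labels = df.columns.tolist()
# One tick per row/column, set before the labels are attached.
ax1.set_xticks(np.arange(len(labels)))
ax1.set_yticks(np.arange(len(labels)))
ax1.set_xticklabels(labels, rotation=90, fontsize=10)
ax1.set_yticklabels(labels, fontsize=10)
```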

Matplotlib/Seaborn: Boxplot collapses on x axis

I am creating a series of boxplots to compare different cancer types with each other (based on 5 categories). For plotting I use seaborn/matplotlib. It works fine for most of the cancer types (see image, right); however, for some the x axis collapses slightly (see image, left) or strongly (see image, middle):
https://i.imgur.com/dxLR4B4.png
Looking at how seaborn draws a box/violin plot in https://github.com/mwaskom/seaborn/blob/36964d7ffba3683de2117d25f224f8ebef015298/seaborn/categorical.py (line 961):
violin_data = remove_na(group_data[hue_mask])
I realized that this happens when there are too many NaNs.
Is there any way to prevent this collapsing through code alone?
I do not want to modify my dataframe (e.g. replace the NaNs with zero).
Below is my code:
boxp_df=pd.read_csv(pf_in,sep="\t",skip_blank_lines=False)
fig, ax = plt.subplots(figsize=(10, 10))
sns.violinplot(data=boxp_df, ax=ax)
plt.xticks(rotation=-45)
plt.ylabel("label")
plt.tight_layout()
plt.savefig(pf_out)
The output is a plot whose size differs per cancer type
(depending on whether any category is completely NaN).
I expect every plot to have the same width.
Update
Trying to use the order parameter as suggested leads to the following output:
https://i.imgur.com/uSm13Qw.png
Maybe this toy example helps?
|Cat1|Cat2|Cat3|Cat4|Cat5
|3.93| |0.52| |6.01
|3.34| |0.89| |2.89
|3.39| |1.96| |4.63
|1.59| |3.66| |3.75
|2.73| |0.39| |2.87
|0.08| |1.25| |-0.27
Update
Apparently, the problem is not the data but the length of the title:
https://github.com/matplotlib/matplotlib/issues/4413
Therefore I would close the question.
@Diziet, should I delete it, or might my issue help others?
Sorry for not including the line below in the code example:
ax.set_title("VERY LONG TITLE", fontsize=20)
It's hard to be sure without data to test it with, but I think you can pass the names of your categories/cancers to the order= parameter. This forces seaborn to use/display those, even if they are empty.
For instance:
tips = sns.load_dataset("tips")
ax = sns.violinplot(x="day", y="total_bill", data=tips, order=['Thur','Fri','Sat','Freedom Day','Sun','Durin\'s Day'])

How do I use colourmaps with variable alpha in a Seaborn kdeplot without seeing the contour lines?

Python version: 3.6.4 (Anaconda on Windows)
Seaborn: 0.8.1
Matplotlib: 2.1.2
I'm trying to create a 2D Kernel Density plot using Seaborn but I want each step in the colourmap to have a different alpha value. I had a look at this question to create a matplotlib colourmap with alpha values: Add alpha to an existing matplotlib colormap.
I have a problem in that the lines between contours are visible. The result I get is here:
I thought that I had found the answer when I found this question: Hide contour linestroke on pyplot.contourf to get only fills. I tried the method outlined in the answer (using set_edgecolor("face")) but it did not work in this case. That question also seemed to be related to vector graphics formats, and I am just writing out a PNG.
Here is my script:
import numpy as np
import seaborn as sns
import matplotlib.colors as cols
import matplotlib.pyplot as plt
def alpha_cmap(cmap):
    my_cmap = cmap(np.arange(cmap.N))
    # Set a square root alpha.
    x = np.linspace(0, 1, cmap.N)
    my_cmap[:,-1] = x ** (0.5)
    my_cmap = cols.ListedColormap(my_cmap)
    return my_cmap
xs = np.random.uniform(size=100)
ys = np.random.uniform(size=100)
kplot = sns.kdeplot(data=xs, data2=ys,
                    cmap=alpha_cmap(plt.cm.viridis),
                    shade=True,
                    shade_lowest=False,
                    n_levels=30)
plt.savefig("example_plot.png")
Guided by some comments on this question I have tried some other methods that have been successful when this problem has come up. Based on this question (Matplotlib Contourf Plots Unwanted Outlines when Alpha < 1) I have tried altering the plot call to:
sns.kdeplot(data=xs, data2=ys,
            cmap=alpha_cmap(plt.cm.viridis),
            shade=True,
            shade_lowest=False,
            n_levels=30,
            antialiased=True)
With antialiased=True the lines between contours are replaced by a narrow white line:
I have also tried an approach similar to this question: Pyplot pcolormesh confused when alpha not 1. This approach is based on looping over the PathCollections in kplot.collections and tuning the parameters of the edges so that they become invisible. I have tried adding this code and tweaking the linewidth:
for thing in kplot.collections:
    thing.set_edgecolor("face")
    thing.set_linewidth(0.01)
fig.canvas.draw()
This results in a mix of white and dark lines:
I believe that I will not be able to tune the line width to make the lines disappear because of the variable width of the contour bands.
Using both methods (antialiasing + linewidth) makes this version, which looks cool but isn't quite what I want:
I also found this question - Changing Transparency of/Remove Contour Lines in Matplotlib
This one suggests overplotting a second plot with a different number of contour levels on the same axis, like:
kplot = sns.kdeplot(data=xs, data2=ys,
                    ax=ax,
                    cmap=alpha_cmap(plt.cm.viridis),
                    shade=True,
                    shade_lowest=False,
                    n_levels=30,
                    antialiased=True)
kplot = sns.kdeplot(data=xs, data2=ys,
                    ax=ax,
                    cmap=alpha_cmap(plt.cm.viridis),
                    shade=True,
                    shade_lowest=False,
                    n_levels=35,
                    antialiased=True)
This results in:
This is better, and almost works. The problem here is I need variable (and non-linear) alpha throughout the colourmap. The variable banding and lines seem to be a result of the combinations of alpha when contours are plotted over each other. I also still see some clear/white lines in the result.
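One workaround that sidesteps contour edges altogether, offered only as a sketch and not as something from the thread: evaluate the density on a grid and draw it with imshow, which has no polygon boundaries to outline. This swaps seaborn's filled contours for a smooth image, so the banded look is lost. The hand-rolled Gaussian kernel and the bandwidth value of 0.1 are assumptions made to keep the example dependency-free; alpha_cmap mirrors the helper from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.colors as cols
import matplotlib.pyplot as plt
import numpy as np

def alpha_cmap(cmap):
    """Same idea as the question's helper: square-root alpha ramp."""
    colors = cmap(np.arange(cmap.N))
    colors[:, -1] = np.linspace(0, 1, cmap.N) ** 0.5
    return cols.ListedColormap(colors)

rng = np.random.default_rng(0)
xs = rng.uniform(size=100)
ys = rng.uniform(size=100)

# Unnormalized Gaussian KDE on a 200x200 grid; bandwidth is an arbitrary guess.
bandwidth = 0.1
gx, gy = np.meshgrid(np.linspace(0, 1, 200), np.linspace(0, 1, 200))
sq_dist = (gx[..., None] - xs) ** 2 + (gy[..., None] - ys) ** 2
density = np.exp(-sq_dist / (2 * bandwidth ** 2)).sum(axis=-1)

fig, ax = plt.subplots()
# imshow applies the per-colour alpha from the colormap pixel by pixel.
image = ax.imshow(density, extent=(0, 1, 0, 1), origin="lower",
                  cmap=alpha_cmap(plt.cm.viridis), interpolation="bilinear")
```

The density is left unnormalized because imshow rescales the values to the colormap range anyway.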

Matplotlib: imshow with second y axis

I'm trying to plot a two-dimensional array in matplotlib using imshow(), and overlay it with a scatterplot on a second y axis.
oneDim = np.array([0.5,1,2.5,3.7])
twoDim = np.random.rand(8,4)
plt.figure()
ax1 = plt.gca()
ax1.imshow(twoDim, cmap='Purples', interpolation='nearest')
ax1.set_xticks(np.arange(0,twoDim.shape[1],1))
ax1.set_yticks(np.arange(0,twoDim.shape[0],1))
ax1.set_yticklabels(np.arange(0,twoDim.shape[0],1))
ax1.grid()
#This is the line that causes problems
ax2 = ax1.twinx()
#That's not really part of the problem (it seems)
oneDimX = oneDim.shape[0]
oneDimY = 4
ax2.plot(np.arange(0,oneDimX,1),oneDim)
ax2.set_yticks(np.arange(0,oneDimY+1,1))
ax2.set_yticklabels(np.arange(0,oneDimY+1,1))
If I only run everything up to the last line, I get my array fully visualised:
However, if I add a second y axis (ax2=ax1.twinx()) as preparation for the scatterplot, it changes to this incomplete rendering:
What's the problem? I've left a few lines in the code above describing the addition of the scatterplot, although it doesn't seem to be part of the issue.
Following the GitHub discussion that Thomas Kuehn pointed to, the issue was fixed a few days ago. In the absence of a readily available build, here's a workaround using the aspect='auto' property. To get nice regular boxes, I adjusted the figure's x/y size using the array dimensions. Autoscaling is switched off on both axes to remove some additional white border.
oneDim = np.array([0.5,1,2.5,3.7])
twoDim = np.random.rand(8,4)
plt.figure(figsize=(twoDim.shape[1]/2,twoDim.shape[0]/2))
ax1 = plt.gca()
ax1.imshow(twoDim, cmap='Purples', interpolation='nearest', aspect='auto')
ax1.set_xticks(np.arange(0,twoDim.shape[1],1))
ax1.set_yticks(np.arange(0,twoDim.shape[0],1))
ax1.set_yticklabels(np.arange(0,twoDim.shape[0],1))
ax1.grid()
ax2 = ax1.twinx()
#Required to remove some white border
ax1.autoscale(False)
ax2.autoscale(False)
Result:

Change colour of curve according to its y-value in matplotlib [duplicate]

I'm trying to replicate the style of the attached figure using matplotlib's facilities.
Basically, I want to change the colour of the curve according to its y-value using matplotlib.
The plot you've shown doesn't have the color set by the vertical axis of the plot (which is what I would consider the y-value). Instead, it just has 8 different plots overlain, each with a different color, without stating what the color means.
Here's an example of something that looks like your plot:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
# some fake data:
x = np.linspace(0, 2*np.pi, 1000)
fs = np.arange(1, 5.)
ys = np.sin(x*fs[:, None])
for y, f in zip(ys, fs):
    plt.plot(x, y, lw=3, c=cm.hot(f/5))
If you actually want the color of one line to change with respect to its value, you have to kind of hack it, because any given Line2D object can only have one color, as far as I know. One way to do this is to make a scatter plot, where each dot can have any color.
x = np.linspace(0, 2*np.pi, 1000)
y = np.sin(2*x)
plt.scatter(x,y, c=cm.hot(np.abs(y)), edgecolor='none')
Notes:
The color vector should range between 0 and 1, so if y.max() > 1, then normalize by it: c=cm.hot(y/y.max()) and make sure it's all positive.
I used edgecolor='none' because by default the scatter markers have a black outline, which makes it look less like a uniform line.
If your data is spaced too far, you'll have to interpolate the data if you don't want gaps between markers.
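An alternative to the scatter hack, sketched here as a suggestion rather than as part of the answer above, is to split the curve into short segments and draw them with matplotlib's LineCollection, which can colour each segment from a colormap. The choice to colour by the absolute y-value at each segment's midpoint is an assumption made to match the scatter example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

x = np.linspace(0, 2 * np.pi, 1000)
y = np.sin(2 * x)

# One short segment per pair of consecutive points: shape (N-1, 2, 2).
points = np.column_stack([x, y]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

lc = LineCollection(segments, cmap="hot", norm=plt.Normalize(0, 1))
# Colour each segment by the absolute value of its midpoint y.
lc.set_array(np.abs(0.5 * (y[:-1] + y[1:])))
lc.set_linewidth(3)

fig, ax = plt.subplots()
ax.add_collection(lc)
# add_collection does not rescale the view, so set the limits by hand.
ax.set_xlim(x.min(), x.max())
ax.set_ylim(-1.1, 1.1)
```

Since the segments are joined end to end, there are no gaps between markers even when the data is sparsely sampled.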