Increase the length of hline marker in matplotlib

I want to plot data using a constant, not too small, horizontal line for each value.
It seems the way to do it is with
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 10, 2)
y = [2,3,4,1,7]
plt.scatter(x, y, marker="_", label='Height')
plt.legend()
plt.show()
but the horizontal lines are too small. Can they be customized to some greater length, at least a length similar to the width of the bars in a bar plot? Thanks.

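Do you mean that you want to increase the marker size? The s argument of scatter sets the marker area in points squared, so a larger value gives a longer line:
plt.scatter(x, y, marker="_", s=400)
For even longer lines, try s=1000.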
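If the lines should span a fixed width in data units, as bars do, marker size is awkward because s is measured in points rather than in x-axis units. A minimal sketch of one alternative, assuming a half-width of 0.4 (which matches the default bar width of 0.8; the name half_width is only illustrative), draws each value with hlines instead:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop this line to get a window
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 2)
y = np.array([2, 3, 4, 1, 7])
half_width = 0.4  # assumed value: 2 * 0.4 equals the default bar width of 0.8

fig, ax = plt.subplots()
# One horizontal segment per value, from x - half_width to x + half_width.
segments = ax.hlines(y, x - half_width, x + half_width, colors="tab:blue", linewidth=2)
ax.set_xticks(x)
# Call fig.savefig("hlines.png") or plt.show() to look at the result.
```

Because the segment ends are given in data coordinates, the lines keep their width relative to the x axis when the figure is resized or zoomed.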

Creating a grid of polar histograms (python)

I want to create a figure like the one in the following picture.
It should contain 25 polar histograms, and I want to add them to the plot one by one.
It needs to be in Python.
I have already figured out that I need matplotlib, but I can't work out the rest.
Thanks a lot!
You can create a grid of polar axes via projection='polar'.
hist creates a histogram, also when working with polar axes. Note that the x is in radians with a range of 2π. It works best when you give the bins explicitly as a linspace from 0 to 2π (or from -π to π, depending on the data). The third parameter of linspace should be one more than the number of bars that you'd want for the full circle.
About the exact parameters of axs[i][j].hist(x, bins=np.linspace(0, 2 * np.pi, np.random.randint(7, 30), endpoint=True), color='dodgerblue', ec='black'):
axs[i][j]: draw on the jth subplot of the ith row
.hist: create a histogram
x: the values that are put into bins
bins=: the bins (either a fixed number of bins between the lowest and highest x, or an explicit sequence of boundaries; the default is 10 bins)
np.random.randint(7, 30): a random whole number between 7 and 29
np.linspace(0, 2 * np.pi, n, endpoint=True): place n evenly spaced boundaries from 0 to 2π, which gives n-1 bins; endpoint=True puts boundaries at 0, at 2π and at n-2 positions in between; with endpoint=False there is a boundary at 0 and at n-1 positions in between, but none at 2π
color='dodgerblue': the color of the histogram bars will be blueish
ec='black': the edge color of the bars will be black
import numpy as np
import matplotlib.pyplot as plt

fig, axs = plt.subplots(5, 5, figsize=(8, 8),
                        subplot_kw=dict(projection='polar'))
for i in range(5):
    for j in range(5):
        x = np.random.uniform(0, 2 * np.pi, 50)
        axs[i][j].hist(x, bins=np.linspace(0, 2 * np.pi, np.random.randint(7, 30)), color='dodgerblue', ec='black')
plt.tight_layout()
plt.show()
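The rule that linspace needs one more point than the number of bars is easy to get off by one. As a quick check, here is a minimal sketch (n_bars and the sample angles are made up for illustration) that builds the edges for a chosen number of bars and bins some angles with np.histogram, which hist uses internally:

```python
import numpy as np

n_bars = 12  # hypothetical number of bars for the full circle
edges = np.linspace(0, 2 * np.pi, n_bars + 1)  # one more edge than bars

rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 50)  # synthetic angles in radians
counts, _ = np.histogram(angles, bins=edges)

# With endpoint=False the boundary at 2*pi is missing, so angles
# beyond the last edge would fall outside every bin.
open_edges = np.linspace(0, 2 * np.pi, n_bars + 1, endpoint=False)

print(len(edges), len(counts), counts.sum())
```

This prints 13 12 50: thirteen edges, twelve bars, and every one of the 50 angles counted.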

fig.tight_layout() but plots still overlap

Imagine I have some dataset for wines and I find the top 5 wine producing countries:
# Find top 5 wine producing countries.
top_countries = wines_df.groupby('country').size().reset_index(name='n').sort_values('n', ascending=False)[:5]['country'].tolist()
Now that I have the values, I attempt to plot the results in 10 plots, 5 rows 2 columns.
fig = plt.figure(figsize=(16, 15))
fig.tight_layout()
i = 0
for c in top_countries:
    c_df = wines_df[wines_df.country == c]
    i +=1
    ax1 = fig.add_subplot(5,2,i)
    i +=1
    ax2 = fig.add_subplot(5,2,i)
    sns.kdeplot(c_df['points'], ax=ax1)
    ax1.set_title("POINTS OF ALL WINES IN %s, n=%d" % (c.upper(), c_df.shape[0]), fontsize=16)
    sns.boxplot(c_df['price'], ax=ax2)
    ax2.set_title("PRICE OF ALL WINES IN %s, n=%d" % (c.upper(), c_df.shape[0]), fontsize=16)
plt.show()
Even with this result, I still have my subplots overlapping.
Am I doing something wrong? Using python3.6 with matplotlib==2.2.2
As Thomas Kühn said, you have to move tight_layout() after doing the plots, like in:
fig = plt.figure(figsize=(16, 15))
i = 0
for c in top_countries:
    c_df = wines_df[wines_df.country == c]
    i +=1
    ax1 = fig.add_subplot(5,2,i)
    i +=1
    ax2 = fig.add_subplot(5,2,i)
    sns.kdeplot(c_df['points'], ax=ax1)
    ax1.set_title("POINTS OF ALL WINES IN %s, n=%d" % (c.upper(), c_df.shape[0]), fontsize=16)
    sns.boxplot(c_df['price'], ax=ax2)
    ax2.set_title("PRICE OF ALL WINES IN %s, n=%d" % (c.upper(), c_df.shape[0]), fontsize=16)
fig.tight_layout()
plt.show()
If the plots still overlap (this can happen in rare cases), you can specify the padding with:
fig.tight_layout(pad=0., w_pad=0.3, h_pad=1.0)
Here pad is the padding around the figure edge, w_pad is the horizontal padding between subplots and h_pad is the vertical padding between subplots, all as fractions of the font size. Just try some values until your plot looks good. (pad=0., w_pad=.3, h_pad=.3) is a good start if you want your plots as tight as possible.
Another possibility is to specify constrained_layout=True in the figure:
fig = plt.figure(figsize=(16, 15), constrained_layout=True)
Now you can delete the line fig.tight_layout().
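Both options can be tried without the wine dataset. The following is only a sketch on synthetic data: the group names, the random values and the helper draw_grid are made up for illustration, and plain matplotlib hist and boxplot stand in for the seaborn calls.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop this line to get a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
groups = ["A", "B", "C", "D", "E"]  # hypothetical stand-ins for the five countries


def draw_grid(axs):
    """Fill a 5x2 grid of axes with synthetic data; each row gets two titled plots."""
    for (ax1, ax2), name in zip(axs, groups):
        values = rng.normal(loc=88, scale=3, size=200)  # made-up scores
        ax1.hist(values, bins=20)
        ax1.set_title("POINTS OF ALL WINES IN %s, n=%d" % (name, values.size), fontsize=16)
        ax2.boxplot(values)
        ax2.set_title("PRICE OF ALL WINES IN %s, n=%d" % (name, values.size), fontsize=16)


# Option 1: call tight_layout only after everything has been drawn.
fig1, axs1 = plt.subplots(5, 2, figsize=(16, 15))
draw_grid(axs1)
before = axs1[0, 0].get_position().bounds
fig1.tight_layout(pad=0.5, w_pad=0.3, h_pad=1.0)
after = axs1[0, 0].get_position().bounds  # the axes have been repositioned

# Option 2: let constrained layout manage the spacing; no tight_layout call.
fig2, axs2 = plt.subplots(5, 2, figsize=(16, 15), constrained_layout=True)
draw_grid(axs2)
```

Comparing before and after shows that tight_layout really moves the axes, which is why it has no effect when called on an empty figure.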
edit:
One more thing I stumbled upon:
It seems like you are specifying your figsize so that it fits on a standard DIN A4 paper in centimeters (typical textwidth: 16cm). But figsize in matplotlib is in inches. So probably replacing the figsize with figsize=(16/2.54, 15/2.54) might be better.
I know that it is absolutely confusing that matplotlib internally uses inches as units, considering that it is mostly the scientific community and data engineers working with matplotlib (and these usually use SI units). As ImportanceOfBeingErnest pointed out, there are several discussions going on about how to implement other units than inches.
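The conversion itself is a one-liner. A minimal sketch, assuming a small helper named cm_to_inch (the helper is not part of matplotlib):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop this line to get a window
import matplotlib.pyplot as plt

CM_PER_INCH = 2.54


def cm_to_inch(*lengths_cm):
    """Convert lengths in centimeters to inches, e.g. for use as figsize."""
    return tuple(length / CM_PER_INCH for length in lengths_cm)


fig = plt.figure(figsize=cm_to_inch(16, 15))  # 16 cm wide, 15 cm high
width_in, height_in = fig.get_size_inches()
```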

Hiding tick values on the y axis that are negative

I am trying to hide any value on the y axis that is less than 0. I saw that to hide labels on the y axis I have to use something like this:
make_invisible = True
ax4.set_yticks(minor_ticks)
if (make_invisible):
    yticks=ax4.yaxis.get_major_ticks()
    yticks[0].label1.set_visible(False)
How can I tweak this so that if the ytick label is negative it will be hidden?
You can use the set_xticks() method to simply set those ticks that you want on the x axis.
import matplotlib.pyplot as plt
plt.figure(figsize=(7,3))
plt.plot([-2,-1,0,1,2],[4,6,2,7,1])
ticks = [tick for tick in plt.gca().get_xticks() if tick >=0]
plt.gca().set_xticks(ticks)
plt.show()
Replacing every x with a y gives the corresponding behaviour on the y axis.
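For the y axis, which is what the question asks about, the same idea might look like the sketch below. It uses the example data with x and y swapped, and it stores and restores the y limits as a precaution in case setting the ticks changes the view:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop this line to get a window
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(7, 3))
ax.plot([4, 6, 2, 7, 1], [-2, -1, 0, 1, 2])  # same data as above, x and y swapped

ylim = ax.get_ylim()  # remember the current view limits
# Keep only the non-negative tick positions on the y axis.
ax.set_yticks([tick for tick in ax.get_yticks() if tick >= 0])
ax.set_ylim(ylim)  # restore the view so the negative data stays visible
```

The negative part of the curve is still drawn; only the ticks and their labels below zero are gone.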

How to change colorbar's color (in some particular value interval)?

In matplotlib, I would like to change colorbar's color in some particular value interval. For example, I would like to change the seismic colorbar, to let the values between -0.5 and 0.5 turn white, how can I do this?
thank you very much
You basically need to create your own colormap that has the particular features you want. Of course it is possible to make use of existing colormaps when doing so.
Colormaps always range from 0 to 1. This range is then mapped to the data interval. So in order to create whites between -0.5 and 0.5 we need to know the range of the data; let's say the data goes from -1 to 1. We can then decide to have the lower (blues) part of the seismic map go from -1 to -0.5, then have white between -0.5 and +0.5 and finally the upper part of the seismic map (reds) from 0.5 to 1. In the language of a colormap this corresponds to the ranges [0, 0.25], [0.25, 0.75] and [0.75, 1]. We can then create a list with the first and last 25% being the colors of the seismic map and the middle 50% white.
This list can be used to create a colormap, using matplotlib.colors.LinearSegmentedColormap.from_list("colormapname", listofcolors).
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
n=50
x = 0.5
lower = plt.cm.seismic(np.linspace(0, x, n))
white = plt.cm.seismic(np.ones(100)*0.5)
upper = plt.cm.seismic(np.linspace(1-x, 1, n))
colors = np.vstack((lower, white, upper))
tmap = matplotlib.colors.LinearSegmentedColormap.from_list('terrain_map_white', colors)
x = np.linspace(0,10)
X,Y = np.meshgrid(x,x)
z = np.sin(X) * np.cos(Y*0.4)
fig, ax = plt.subplots()
im = ax.imshow(z, cmap=tmap)
plt.colorbar(im)
plt.show()
For more general cases, you may need a color normalization (using matplotlib.colors.Normalize). See e.g. this example, where a certain color in the colormap is always fixed at a data value of 0, independent of the data range.
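To make the mapping from data values to colormap positions explicit, here is a minimal sketch that pins the data range to [-1, 1] with a Normalize instance (an assumption; the code above lets imshow infer the range from the data) and prints the color the custom colormap returns for a few data values. The colormap name seismic_white_band is made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop this line to get a window
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np

n = 50
lower = plt.cm.seismic(np.linspace(0, 0.5, n))   # blues fading to white, first 25% of the list
white = plt.cm.seismic(np.ones(2 * n) * 0.5)     # centre color of seismic (near white), middle 50%
upper = plt.cm.seismic(np.linspace(0.5, 1, n))   # white turning to reds, last 25%
tmap = mcolors.LinearSegmentedColormap.from_list(
    "seismic_white_band", np.vstack((lower, white, upper)))

# Fix the data range so that -0.5 and 0.5 land on 0.25 and 0.75 of the colormap.
norm = mcolors.Normalize(vmin=-1, vmax=1)
for value in (-1.0, -0.5, 0.0, 0.5, 1.0):
    pos = float(norm(value))
    print(value, pos, np.round(tmap(pos), 2))

x = np.linspace(0, 10)
X, Y = np.meshgrid(x, x)
z = np.sin(X) * np.cos(Y * 0.4)
fig, ax = plt.subplots()
im = ax.imshow(z, cmap=tmap, norm=norm)
fig.colorbar(im)
```

If the data range is not symmetric around zero, the 25/50/25 split of the color list has to be adjusted accordingly, or the norm has to be chosen so that the band edges fall where the list changes color.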

How do I add error bars on a histogram?

I've created a histogram to see the number of similar values in a list.
data = np.genfromtxt("Pendel-Messung.dat")
stdm = (np.std(data))/((700)**(1/2))
breite = 700**(1/2)
fig2 = plt.figure()
ax1 = plt.subplot(111)
ax1.set_ylim(0,150)
ax1.hist(data, bins=breite)
ax2 = ax1.twinx()
ax2.set_ylim(0,150/700)
plt.show()
I want to create error bars (the error being stdm) in the middle of each bar of the histogram. I know I can create errorbars using
plt.errorbar("something", data, yerr = stdm)
But how do I make them start in the middle of each bar? I thought of just adding breite/2, but that gives me an error.
Sorry, I'm a beginner! Thank you!
ax.hist returns the frequencies (n), the bin edges and the patches, so we can use those for x and y in the call to errorbar. Also, the bins argument of hist takes either an integer for the number of bins, or a sequence of bin edges. I think you were trying to give a bin width of breite? If so, this should work (you just need to select an appropriate xmax):
n,bin_edges,patches = ax.hist(data,bins=np.arange(0,xmax,breite))
x = bin_edges[:-1]+breite/2.
ax.errorbar(x,n,yerr=stdm,linestyle='None')
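A self-contained variant of the same idea, as a sketch on synthetic data (the normal sample and the square-root rule for the bin count are assumptions, not taken from the question's file). It computes the bar centers as midpoints of adjacent edges, which also works when the bins are given as a count rather than a width:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop this line to get a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.1, size=700)  # synthetic stand-in for the measurement file
stdm = np.std(data) / np.sqrt(data.size)          # standard error of the mean, as in the question
n_bins = int(round(np.sqrt(data.size)))           # assumed: square-root rule, 26 bins for 700 values

fig, ax = plt.subplots()
counts, bin_edges, patches = ax.hist(data, bins=n_bins)

# Midpoint of each bar; valid for equal or unequal bin widths.
centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

# fmt="none" draws only the error bars, without markers or a connecting line.
ax.errorbar(centers, counts, yerr=stdm, fmt="none", ecolor="black", capsize=3)
```

yerr also accepts an array with one value per bin, in case the error differs from bar to bar.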