pyplot x-axis tick mark spacing is not centered with all columns - matplotlib

I'm struggling with what I hope is a misspecification of the pyplot histogram function. As you see in the image, the x-axis tick marks are not centered consistently on the columns as per the align='mid' parameter. If necessary, I will upload the data file to Dropbox.
Thanks for your help!
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FormatStrFormatter
data = DRA_size_males_s
fig, ax = plt.subplots(nrows=1, ncols=1)
ax.hist(data, facecolor='blue', edgecolor='gray', bins=25, rwidth=1.10, align='mid')
bins=[1.4,1.5,1.6,1.7,1.9,2.0,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3.1,3.2,3.5,3.6,3.8]
ax.set_xticks(bins)
ax.set_ylabel('Frequency')
ax.set_xlabel('DRA Sizes(mm)')
ax.set_title('Frequencies of DRA Sizes in Males (mm)')
plt.show()
Here is the data array used to create the histogram:
1.4, 1.4, 1.4, 1.5, 1.5, 1.6, 1.7, 1.7, 1.7, 1.9, 1.9, 1.9, 1.9, 2.0, 2.0, 2.0, 2.1, 2.1, 2.1, 2.1, 2.2, 2.2, 2.3, 2.3, 2.3, 2.4, 2.5, 2.6, 2.7, 2.7, 2.8, 2.8, 2.8, 2.9, 2.9, 3.1, 3.1, 3.2, 3.2, 3.5, 3.6, 3.8

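The align="mid" argument of plt.hist centers each bar between its two bin edges, which is the usual way of plotting a histogram. It does not center the bars on your tick positions: with bins=25, matplotlib splits the data range into 25 equal-width bins, and their edges generally do not coincide with the values you pass to set_xticks.
For the histogram to use predefined bin edges, you need to supply those edges to plt.hist through the bins argument.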
import matplotlib.pyplot as plt
import numpy as np
data = [1.4, 1.4, 1.4, 1.5, 1.5, 1.6, 1.7, 1.7, 1.7, 1.9, 1.9, 1.9, 1.9, 2.0,
2.0, 2.0, 2.1, 2.1, 2.1, 2.1, 2.2, 2.2, 2.3, 2.3, 2.3, 2.4, 2.5, 2.6,
2.7, 2.7, 2.8, 2.8, 2.8, 2.9, 2.9, 3.1, 3.1, 3.2, 3.2, 3.5, 3.6, 3.8]
fig, ax = plt.subplots(nrows=1, ncols=1)
bins=[1.4,1.5,1.6,1.7,1.9,2.0,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3.1,3.2,3.5,3.6,3.8]
ax.hist(data, bins=bins, facecolor='blue', edgecolor='gray', rwidth=1, align='mid')
ax.set_xticks(bins)
ax.set_ylabel('Frequency')
ax.set_xlabel('DRA Sizes(mm)')
ax.set_title('Frequencies of DRA Sizes in Males (mm)')
plt.show()
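If the goal is to have every tick sit in the middle of its bar rather than on a bin edge, one option is to build edges that straddle each possible value. The following is only a sketch, not part of the answer above. It assumes the measurements are recorded to one decimal place (as in the data shown), and the names `edges`, `centers` and `counts` are introduced here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

data = [1.4, 1.4, 1.4, 1.5, 1.5, 1.6, 1.7, 1.7, 1.7, 1.9, 1.9, 1.9, 1.9, 2.0,
        2.0, 2.0, 2.1, 2.1, 2.1, 2.1, 2.2, 2.2, 2.3, 2.3, 2.3, 2.4, 2.5, 2.6,
        2.7, 2.7, 2.8, 2.8, 2.8, 2.9, 2.9, 3.1, 3.1, 3.2, 3.2, 3.5, 3.6, 3.8]

# Assumption: every value is a multiple of 0.1 mm, so edges at x.x5 put each value mid-bin.
# Work in integer tenths so the edges do not accumulate floating-point error.
lo = int(round(min(data) * 10))
hi = int(round(max(data) * 10))
edges = (np.arange(lo, hi + 2) - 0.5) / 10    # 1.35, 1.45, ..., 3.85
centers = np.arange(lo, hi + 1) / 10          # 1.4, 1.5, ..., 3.8

# Same binning as ax.hist below, computed separately so it can be inspected.
counts, _ = np.histogram(data, bins=edges)

fig, ax = plt.subplots()
ax.hist(data, bins=edges, facecolor='blue', edgecolor='gray')
ax.set_xticks(centers)                        # one tick per bar, at the bar's center
ax.tick_params(axis='x', labelrotation=90)    # 25 labels are crowded when horizontal
ax.set_ylabel('Frequency')
ax.set_xlabel('DRA Sizes (mm)')
```

Call plt.show() or fig.savefig(...) afterwards to see the result. Unlike the version with the observed values as edges, this keeps all bars the same width and leaves empty bins where no measurement occurred (for example 1.8).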

For integer-valued data, try shifting the bin edges by half a bin width (0.5 here) so that each value falls in the middle of its bin, as in the following example.
In [100]: x = np.array([1, 2, 3, 4, 0, 3, 1, 7, 4, 5, 8, 8, 9, 7, 7, 3])
In [101]: len(x)
Out[101]: 16
In [102]: bins = np.arange(11) - 0.5  # edges -0.5, 0.5, ..., 9.5, so the value 9 also gets a bin
In [103]: plt.hist(x, facecolor='blue', edgecolor='gray', bins=bins, rwidth=0.9, alpha=0.75)  # rwidth is a fraction of the bin width; values above 1 make the bars overlap
Now each integer value is centered under its bar.
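Before plotting, it can help to confirm that the edges cover the whole data range, since the last edge has to lie above the largest value. A small check with np.histogram, offered as a sketch (the names `edges` and `counts` are illustrative, not from the answer):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 0, 3, 1, 7, 4, 5, 8, 8, 9, 7, 7, 3])

# Edges at k - 0.5 for k = min .. max + 1, so every integer value gets its own bin.
edges = np.arange(x.min(), x.max() + 2) - 0.5   # -0.5, 0.5, ..., 9.5
counts, _ = np.histogram(x, bins=edges)

# Every sample should land in some bin; a smaller sum means the edges stop short.
assert counts.sum() == len(x)
print(counts.tolist())   # [1, 2, 1, 3, 2, 1, 0, 3, 2, 1]
```

The same `edges` array can then be passed as `bins=edges` to plt.hist.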

Related

Increasing the space between the bars in histplot [duplicate]

This question already has an answer here:
How to make Seaborn histogram have skinny bars / bins
I have plotted a simple histogram using Seaborn but the bars are stuck together and I would like to increase the space between them.
import seaborn as sns
sns.histplot([-2.0, -1.0, -2.0, 2.0, 3.0, 0.0, 4.0, -2.0, -2.0, -3.0, -2.0, 4.0])
I tried a couple of suggestions such as adding rwidth argument or using rwidth within hist_kws argument but neither worked for histplot.
If I understand correctly, you want some space between the bars without actually changing the bin sizes. In that case I would play with the style and the linewidth:
sns.set_style("white")
sns.histplot([-2.0, -1.0, -2.0, 2.0, 3.0, 0.0, 4.0, -2.0, -2.0, -3.0, -2.0, 4.0], linewidth=5.0)

How to manually scale a continuous legend in a seaborn scatterplot?

I'm creating a scatterplot with seaborn like this:
plt.figure(figsize=(20,5))
ax = sns.scatterplot(x=x,
y=y,
hue=errors,
s=errors*20,
alpha=0.8,
edgecolors='w')
ax.set(xlabel='X', ylabel='Y')
ax.legend(title="Error (m)", loc='upper right')
My errors contain values between approximately 0.1 and 12.5. However, for my legend seaborn automatically generates labels 0, 5, 10, 15. This makes my algorithm look worse than it is. I would like to change the step size in the legend while maintaining a correct mapping between colors and error magnitudes. For example 0, 4, 8, 12.5. Is this possible?

Exponentially decaying interpolation

I am not sure if I used the correct technical words in the title. What I want is something like the following.
I have the following code
import pandas as pd
import numpy as np
df = pd.DataFrame([[1, None, None, 4, None, None, None, 10]])
df = df.fillna(np.nan)
df = df.transpose().interpolate()
which does a linear interpolation, which gives me something like
1.0 2.0 3.0 4.0 5.5 7.0 8.5 10.0
What I want is an exponentially decaying interpolation. That is, something like below (Not the exact values but you get the idea).
1.0 2.5 3.0 4.0 6.5 8.0 9.2 10.0
That is I want the closer values to change more drastically than the far values. Is there an interpolation method available in pandas that can do it?
You need to transform the data before interpolating and undo the transformation afterwards. Try this:
df = pd.DataFrame([[1, None, None, 4, None, None, None, 10]])
df = df.fillna(np.nan)
df = 10**df
df = df.transpose().interpolate()
df = np.log10(df)
You can play with the base of the power (10 here, with the matching logarithm at the end) to get something that matches what you need.
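The same idea can be written as a small helper with an adjustable base. This is a sketch rather than a built-in pandas method: the function name `warped_interpolate` and its `base` parameter are made up for illustration, and it uses a float Series instead of the one-row DataFrame.

```python
import numpy as np
import pandas as pd

def warped_interpolate(s, base=10.0):
    """Interpolate linearly in base**s space, then map back with a logarithm.

    Hypothetical helper: larger bases make the filled values rise faster
    right after each known point and flatten out toward the next one.
    """
    warped = base ** s                       # NaN stays NaN
    filled = warped.interpolate()            # ordinary linear interpolation
    return np.log(filled) / np.log(base)     # inverse of base ** s

s = pd.Series([1, np.nan, np.nan, 4, np.nan, np.nan, np.nan, 10], dtype=float)

linear = s.interpolate()
curved = warped_interpolate(s, base=2.0)
print(pd.DataFrame({"linear": linear, "curved": curved}))
```

The known points (1, 4 and 10) are reproduced up to floating-point rounding, and every filled value lies above its linear counterpart. Very large bases can overflow for large inputs, so the base needs to be chosen with the data range in mind.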

How to plot irregularly sampled time data in animated graph, using Matplotlib or other?

I have data that is logged in irregular time steps, and I want to show it as a scrolling animation over time. For example, data A may have time points [0.001, 0.004, 0.007, 0.009, ..., 0.97], and data B may have roughly the same, plus or minus 0.02 at each point.
I want to create a scrolling animation of the data being updated over time, but only have it update a line's points/vertices after that vertex's time has passed. I can't think of a good way to have numpy say "for this line, only count data that is up to this timestamp". I think if I can get that I can figure something out from the matplotlib examples, but a full solution would be nice as well.
Thank you!
I think one solution would be to map the data into a consistent format:
A = np.array([1, 1.5, 4.5, 5])
B = np.array([1, 2.5, 3.5, 5])
scroll = np.linspace(1,5,11)
A_idx = np.searchsorted(A, scroll)
B_idx = np.searchsorted(B, scroll)
>>> scroll
array([1. , 1.4, 1.8, 2.2, 2.6, 3. , 3.4, 3.8, 4.2, 4.6, 5. ])
>>> A[A_idx]
array([1. , 1.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 5. , 5. ])
>>> B[B_idx]
array([1. , 2.5, 2.5, 2.5, 3.5, 3.5, 3.5, 5. , 5. , 5. , 5. ])
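You might need to be careful about whether you use forward-looking or backward-looking values (here it is forward-looking: each scroll time maps to the first sample at or after it), but the value will update whenever new data is available. Also note that a scroll time later than the last sample yields an index equal to the array length, which would raise an IndexError when used to index the array.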
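For the backward-looking case the question describes (only show vertices whose time has already passed), np.searchsorted with side="right" gives the number of samples at or before the current time. A minimal sketch, assuming each time array is sorted in ascending order. The helper name `visible_points` and the y-values are invented for illustration.

```python
import numpy as np

def visible_points(t, y, now):
    """Return the samples whose timestamp is <= now (t must be sorted ascending)."""
    n = np.searchsorted(t, now, side="right")   # count of timestamps <= now
    return t[:n], y[:n]

t_a = np.array([1.0, 1.5, 4.5, 5.0])
y_a = np.array([10.0, 11.0, 12.0, 13.0])   # made-up values

for now in np.linspace(1, 5, 5):
    ts, ys = visible_points(t_a, y_a, now)
    # In an animation callback, this is where line.set_data(ts, ys) would go.
    print(now, ts.tolist())
```

Each line can be sliced independently this way, so series with different timestamps stay in step with the same scrolling clock.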

matplotlib axis don't converge properly

I make a graph using matplotlib and save it as a pdf. When I zoom in there is a gap where the x- and y-axis converge. Is there any way to get rid of this?
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 3])
y = np.array([1, 2, 3])
plt.scatter(x, y)
plt.savefig('Scatter_Plot.pdf')
Unfortunately I can not upload pictures here - but here is a link:
http://de.tinypic.com/r/25gckcw/8
Thanks
I've updated matplotlib from 1.3.1 to 1.4.3.
Now everything looks perfect!