Layout problem with multiple heatmaps in one figure with matplotlib

I put multiple heatmaps in one figure with matplotlib, but I cannot get the layout right. Here is my code.
import matplotlib; matplotlib.use('agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.random.rand(6,240,240)
y = np.random.rand(6,240,240)
t = np.random.rand(6,240,240)
plt.subplots_adjust(wspace=0.2, hspace=0.3)
c=1
for i in range(6):
    ax=plt.subplot(6,3,c)
    plt.imshow(x[i])
    ax.set_title("x"+str(i))
    c+=1
    ax=plt.subplot(6,3,c)
    plt.imshow(y[i])
    ax.set_title("y"+str(i))
    c+=1
    ax=plt.subplot(6,3,c)
    plt.imshow(t[i])
    ax.set_title("t"+str(i))
    c+=1
plt.tight_layout()
plt.savefig("test.png")
test.png looks like this.
I want to:
make each heatmap bigger
reduce the margin between the heatmaps in each row.
I tried to adjust this with "subplots_adjust", but it has no effect.
Additional information
Following ImportanceOfBeingErnest's comment, I removed tight_layout(). It generated this.
That makes each heatmap bigger, but the titles now overlap the subplots. I would still like to make each heatmap even bigger and to reduce the margin within each row.
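One possible direction, offered only as a sketch and not as an answer from the thread: imshow keeps the pixels square by default, so in an 18-panel grid on the default figure size the images stay small no matter what spacing is requested, and a later tight_layout() call recomputes the spacing set by subplots_adjust. The version below assumes a taller figure whose shape roughly matches the 6x3 grid and sets the spacing once, after the axes exist. The figsize and spacing numbers, the hidden ticks, and the output file name are illustrative assumptions.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

rows, cols = 6, 3
names = ["x", "y", "t"]
# Stand-in data with the same shape as x, y and t in the question
data = np.random.rand(cols, rows, 240, 240)

# Assumption: a figure roughly as tall as the grid (6 rows x 3 columns of
# square images) leaves less empty space around each square image.
fig, axes = plt.subplots(rows, cols, figsize=(2.2 * cols, 2.4 * rows))

for i in range(rows):
    for j in range(cols):
        ax = axes[i, j]
        ax.imshow(data[j, i])
        ax.set_title(f"{names[j]}{i}", fontsize=9)
        # Optional: tick labels take up room between the panels
        ax.set_xticks([])
        ax.set_yticks([])

# Set the spacing once, and do not call tight_layout() afterwards
fig.subplots_adjust(left=0.02, right=0.98, bottom=0.02, top=0.96,
                    wspace=0.05, hspace=0.25)
fig.savefig("test_grid.png")
```

The hspace value leaves room for the titles; if they still collide with the row above, increasing hspace or the figure height is the first thing to try.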

Related

How to fix lines of axes overlapping imshow plot?

When plotting matrices using matplotlib's imshow function the lines of the axes can overlap the actual plot, see the following minimal example (matshow is just a simple wrapper around imshow):
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(3,3))
ax.matshow(np.random.random((50, 50)), interpolation="none", cmap="Blues")
plt.savefig("example.png", dpi=300)
I would expect every entry of the matrix to be represented by a square, but in the top row it is quite obvious that the axis is hiding a bit of the plot resulting in non-square entries. The same is happening for the last column. Since I want the complete matrix to be seen - every entry with the same importance - is there any way this can be fixed?
To me, this is just a visualisation issue. If I run your code and maximise the window, I do not see the overlapping you are talking about:
Otherwise, remove the spines but without hiding the ticks:
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
EDIT
Reduce the thickness of the borders:
for spine in ax.spines.values():
    spine.set_linewidth(0.3)
The following is the exported image:
With 0.2 the exported image looks like this:
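For completeness, here is a self-contained sketch that combines the question's example with the thinner spines, so both line widths mentioned above can be compared side by side. The output file names are made up for illustration.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np

data = np.random.random((50, 50))

# Export the same matrix once per spine width to compare the results
for width in (0.3, 0.2):
    fig, ax = plt.subplots(figsize=(3, 3))
    ax.matshow(data, interpolation="none", cmap="Blues")
    for spine in ax.spines.values():
        spine.set_linewidth(width)
    fig.savefig(f"example_lw{width}.png", dpi=300)
    plt.close(fig)
```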

matplotlib find_peaks: I need to ignore spurious peaks

I generate a graph like this. It measures the content of a transparent tube, but the edges of the tube show up as peaks too, as you can see at the edges of the plot. Any suggestions on how to avoid these "extra" peaks are welcome.
The image of the tube is this one
This is in fact not a matplotlib-specific question. From my understanding of your question, you would like to keep the red peaks while removing the blue ones. This can be done with scipy.signal.find_peaks, where you can specify a height value to control which peaks the algorithm finds. Here is some minimal code (adapted from the scipy docs):
import matplotlib.pyplot as plt
from scipy.datasets import electrocardiogram  # lived in scipy.misc before SciPy 1.10
from scipy.signal import find_peaks
x = electrocardiogram()[2000:4000]
peaks, _ = find_peaks(x, height=0.5)
fig, ax = plt.subplots(1, 1, figsize=(7.2, 7.2))
ax.plot(x)
ax.plot(peaks, x[peaks], "x")
ax.axhline(y=0.5, ls="--", color="gray")
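To see the effect of the height argument without downloading the ECG data, here is a small sketch on a synthetic signal. The signal is an assumption made up for illustration (two tall bumps standing in for the wanted peaks, two low bumps standing in for the tube edges); it is not the asker's data, and the threshold of 0.5 would need to be chosen from the real measurements.

```python
import numpy as np
from scipy.signal import find_peaks

x = np.linspace(0, 10, 1001)

def bump(center, amplitude, width=0.2):
    """Gaussian bump used as a stand-in for one peak in the measurement."""
    return amplitude * np.exp(-((x - center) ** 2) / (2 * width**2))

# Two tall peaks (wanted) and two low peaks near the ends (unwanted)
signal = bump(3, 1.0) + bump(7, 1.0) + bump(1, 0.3) + bump(9, 0.3)

all_peaks, _ = find_peaks(signal)                  # finds every local maximum
kept_peaks, props = find_peaks(signal, height=0.5) # only maxima at or above 0.5

print("all peaks at x =", x[all_peaks])
print("kept peaks at x =", x[kept_peaks])
print("kept peak heights =", props["peak_heights"])
```

If the unwanted peaks are as tall as the wanted ones, a height threshold alone will not separate them; find_peaks also accepts other criteria such as prominence and distance.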

Geopandas & Matplotlib, how do I plot without an outline around any shape?

When I run the code below in a Jupyter Notebook, I get a map of the world, colored in red. There are fine white-ish lines between the countries. Is there a way to plot the world so that all countries are solid and there's no line in between?
I'm asking because my real-world use case is a fine grid that behaves just like the world map: each grid shape has a fine outline which I do not want to have in the plot. (Update, since this was asked: the grid shapes will not have the same fill color.)
import geopandas as gpd
import geoplot as gplt
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
world['total'] = 1
world.plot(column='total', cmap='Set1')
For the grid example, the grid files are at https://opendata-esri-de.opendata.arcgis.com/datasets/3c1f46241cbb4b669e18b002e4893711_0
A simplified example that shows the problem.
sf = 'Hexagone_125_km/Hexagone_125_km.shp'
shp = gpd.read_file(sf)
shp.crs = {'init': 'epsg:4326'}
shp['sum'] = 1 # for example, fill sum with something
shp.plot(figsize=(20,20), column='sum', cmap='gnuplot', alpha=1, legend=True)
The white lines are due to antialiasing. This usually makes the visual smoother, but leads to white lines in between different shapes. You can turn off antialiasing via
antialiased=False
That has the inevitable drawback of the plot looking pixelated.
An alternative is to give the patches an edge with a certain linewidth. The edges should probably have the same color as the faces, so
edgecolor="face", linewidth=0.4
would be an option. This removes the white lines, but introduces a slight "searing" effect (you'll notice it mainly when looking at islands like Indonesia or Japan). The smaller the features, the more noticeable this will be, so it may be irrelevant for showing a hexbin plot. Still, playing a bit with the linewidth might improve the result further.
Code for reproduction:
import numpy as np; np.random.seed(42)
import geopandas as gpd
import matplotlib.pyplot as plt
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
world['total'] = np.random.randint(0,10, size=len(world))
fig, (ax1, ax2, ax3) = plt.subplots(nrows=3, figsize=(7,10))
world.plot(column='total', cmap='Set1', ax=ax1)
world.plot(column='total', cmap='Set1', ax=ax2, antialiased=False)
world.plot(column='total', cmap='Set1', ax=ax3, edgecolor="face", linewidth=0.4)
ax1.set_title("original")
ax2.set_title("antialiased=False")
ax3.set_title("edgecolor='face', linewidth=0.4")
plt.tight_layout()
plt.savefig("world.png")
plt.show()
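Since the real use case in the question is a grid with differently colored cells, the same edgecolor="face" idea can also be tried without any shapefile. The sketch below is matplotlib-only and uses a made-up 10x10 grid of squares with random values as a stand-in for the hexagon grid; the grid, the values and the output file name are all assumptions for illustration.

```python
import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PolyCollection

n = 10
# One unit square per grid cell, listed as corner coordinates
verts = [[(i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)]
         for i in range(n) for j in range(n)]
values = np.random.default_rng(42).random(len(verts))

fig, ax = plt.subplots(figsize=(5, 5))
# Edges take the color of their own face, so no white seams remain
coll = PolyCollection(verts, cmap="gnuplot", edgecolors="face", linewidths=0.4)
coll.set_array(values)  # one value per polygon, mapped through the colormap
ax.add_collection(coll)
ax.set_xlim(0, n)
ax.set_ylim(0, n)
ax.set_aspect("equal")
fig.colorbar(coll, ax=ax)
fig.savefig("grid_no_outline.png")
```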

Basic axis malfunction in matplotlib

When plotting using matplotlib, I ran into an interesting issue where the y axis is scaled by a very inconvenient quantity. Here's a MWE that demonstrates the problem:
import numpy as np
import matplotlib.pyplot as plt
l = np.linspace(0.5,2,2**10)
a = (0.696*l**2)/(l**2 - 9896.2e-9**2)
plt.plot(l,a)
plt.show()
When I run this, I get a figure that looks like this picture
The y-axis clearly is scaled by a silly quantity even though the y data are all between 1 and 2.
This is similar to the question:
Axis numerical offset in matplotlib
I'm not satisfied with the answer to that question: it makes no sense to me why I need to go through the convoluted process of changing axis settings when the data are between 1 and 2 (EDIT: between 0 and 1). Why does this happen? Why does matplotlib use such a bizarre scaling?
The data in the plot are all between 0.696000000017 and 0.696000000273. For such cases it makes sense to use some kind of offset.
If you don't want that, you can use your own formatter:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
l = np.linspace(0.5,2,2**10)
a = (0.696*l**2)/(l**2 - 9896.2e-9**2)
plt.plot(l,a)
fmt = matplotlib.ticker.StrMethodFormatter("{x:.12f}")
plt.gca().yaxis.set_major_formatter(fmt)
plt.show()
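The statement about the data range can be checked directly by printing the minimum, maximum and span of the array from the question. This is only a quick numerical check, not part of the original answer.

```python
import numpy as np

l = np.linspace(0.5, 2, 2**10)
a = (0.696 * l**2) / (l**2 - 9896.2e-9**2)

# The subtracted term is tiny compared with l**2, so a barely moves away from 0.696
print(f"min  = {a.min():.12f}")
print(f"max  = {a.max():.12f}")
print(f"span = {a.max() - a.min():.3e}")
```

A span this small relative to the values themselves is why the default tick formatter falls back to an offset.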

Cutting up the x-axis to produce multiple graphs with seaborn?

The following code when graphed looks really messy at the moment. The reason is I have too many values for 'fare'. 'Fare' ranges from [0-500] with most of the values within the first 100.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
titanic = sns.load_dataset("titanic")
y =titanic.groupby([titanic.fare//1,'sex']).survived.mean().reset_index()
sns.set(style="whitegrid")
# catplot/height were called factorplot/size before seaborn 0.9
g = sns.catplot(x='fare', y='survived', col='sex', kind='bar', data=y,
                height=4, aspect=2.5, palette="muted")
g.despine(left=True)
g.set_ylabels("Survival Probability")
g.set_xlabels('Fare')
plt.show()
I would like to try slicing up the 'fare' axis of the plots into subsets, but would like to see all the graphs at the same time on one screen. I was wondering if this is possible without having to resort to groupby.
I will have to play around with the values of 'fare' to see what I want each graph to represent, but as a sample let's break up the graph into these 'fare' ranges.
[0-18]
[18-35]
[35-70]
[70-300]
[300-500]
So the total would be 10 graphs on one page, because of the juxtaposition with the opposite sex.
Is it possible with Seaborn? Do I need to do a lot of configuring with matplotlib? Thanks.
Actually I wrote a little blog post about this a while ago. If you are plotting histograms you can use the by keyword:
import matplotlib.pyplot as plt
import seaborn as sns  # seaborn.apionly no longer exists in current seaborn releases
sns.set() #rescue matplotlib's styles from the early '90s
data = sns.load_dataset('titanic')
data.hist(by='class', column = 'fare')
plt.show()
Otherwise if you're just plotting value-counts, you have to roll your own grid:
def categorical_hist(self, column, by, layout=None, legend=None, **params):
    from math import sqrt, ceil
    if layout is None:
        # default to a square grid with room for one subplot per category
        s = ceil(sqrt(self[column].unique().size))
        layout = (s, s)
    return self.groupby(by)[column]\
        .value_counts()\
        .sort_index()\
        .unstack()\
        .plot.bar(subplots=True, layout=layout, legend=legend, **params)
categorical_hist(data, by='class', column='embark_town')
Edit: If you want the survival rate by fare range, you could do something like this:
import pandas as pd
data.groupby(pd.cut(data.fare, 10)).apply(lambda x: x.survived.sum() / len(x))
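The survival-rate-by-fare-range idea can also be applied to the specific ranges listed in the question by passing explicit bin edges to pd.cut. The sketch below uses a small random DataFrame as a stand-in for the titanic dataset so that it runs without a download; the random data, the seed and the column name fare_range are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Random stand-in for the titanic columns used in the question
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "fare": rng.uniform(0, 500, size=200),
    "sex": rng.choice(["male", "female"], size=200),
    "survived": rng.integers(0, 2, size=200),
})

# Bin edges taken from the ranges listed in the question
bins = [0, 18, 35, 70, 300, 500]
df["fare_range"] = pd.cut(df["fare"], bins=bins, include_lowest=True)

# Mean of the 0/1 survived column is the survival rate per (range, sex) cell
rates = (
    df.groupby(["fare_range", "sex"], observed=False)["survived"]
    .mean()
    .unstack("sex")
)
print(rates)
```

The resulting table has one row per fare range and one column per sex, which corresponds to the ten panels described in the question; calling rates.plot.bar(subplots=True) on it would be one way to draw them.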