Geopandas reduce legend size (and remove white space below map) - legend

I would like to know how to change the legend automatically generated by Geopandas. Mostly I would like to reduce its size because it's quite big on the generated image. The legend seems to take all the available space.
An additional question: do you know how to remove the empty space below my map? I've tried with
pad_inches = 0, bbox_inches='tight'
but I still have an empty space below the map.
Thanks for your help.

This works for me:
some_geodataframe.plot(..., legend=True, legend_kwds={'shrink': 0.3})
Other options here: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.colorbar.html
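As far as I can tell, for a continuous column geopandas hands legend_kwds on to Matplotlib's colorbar call, so shrink should behave as it does in plain Matplotlib. Here is a minimal sketch without geopandas; the random image is only a stand-in for a map, and the variable names are made up for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((10, 20))  # stand-in for map data (assumption)

fig, (ax_full, ax_small) = plt.subplots(nrows=2, figsize=(8, 6))

im_full = ax_full.imshow(data, cmap="copper_r")
cb_full = fig.colorbar(im_full, ax=ax_full)  # default colorbar length

im_small = ax_small.imshow(data, cmap="copper_r")
cb_small = fig.colorbar(im_small, ax=ax_small, shrink=0.3)  # 30% of the default length

# Compare the space reserved for each colorbar (figure coordinates).
full_height = cb_full.ax.get_position(original=True).height
small_height = cb_small.ax.get_position(original=True).height
print(full_height, small_height)
```

The second colorbar should come out noticeably shorter than the first; other colorbar keywords such as pad or orientation can be passed the same way.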

To show how to get proper size of a colorbar legend accompanying a map created by geopandas' plot() method I use the built-in 'naturalearth_lowres' dataset.
The working code is as follows.
import matplotlib.pyplot as plt
import geopandas as gpd
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
world = world[(world.name != "Antarctica") & (world.name != "Fr. S. Antarctic Lands")] # exclude 2 no-man lands
# plot as usual, grab the axes 'ax' returned by the plot
colormap = "copper_r" # add _r to reverse the colormap
ax = world.plot(column='pop_est', cmap=colormap,
                figsize=[12,9],
                vmin=min(world.pop_est), vmax=max(world.pop_est))
# map marginal/face deco
ax.set_title('World Population')
ax.grid()
# colorbar will be created by ...
fig = ax.get_figure()
# add colorbar axes to the figure
# here, need trial-and-error to get [l,b,w,h] right
# l:left, b:bottom, w:width, h:height; in normalized unit (0-1)
cbax = fig.add_axes([0.95, 0.3, 0.03, 0.39])
cbax.set_title('Population')
sm = plt.cm.ScalarMappable(cmap=colormap,
                           norm=plt.Normalize(vmin=min(world.pop_est), vmax=max(world.pop_est)))
# at this stage, 'cbax' is just a blank axes, with unneeded labels on x and y axes
# blank out the array of the scalar mappable 'sm'
sm.set_array([])
# draw colorbar into 'cbax'
fig.colorbar(sm, cax=cbax, format="%d")
# don't use: plt.tight_layout()
plt.show()
Read the comments in the code for useful info.
The resulting plot:


How to draw a grid in a bar-plot created with plt.vlines()

I want to create a bar plot in Python. I want this plot to look good, though, and I don't like the look of matplotlib's axes.bar() function. Therefore, I have decided to use plt.vlines(). The challenge here is that my x-data is a list that contains strings and not numerical data. When I plot my graph, the spacing between the two columns (in my example column 2 = 0) is pretty big:
Furthermore, I want a grid. However, I would like to have minor grid lines as well. I know how to get all of this if my data were numerical. But since my x-data contains strings, I don't know how to set x_max. Any suggestions?
Internally, the positions of the labels are numbered 0,1,... So setting the x-limits a bit before 0 and after the last, shows them more centered.
Usually, bars are drawn with their 'feet' on the ground, which can be set via plt.ylim(0, ...). Minor ticks can be positioned for example at multiples of 0.2. Setting the length of the ticks to zero lets the position count for the grid, but suppresses the tick mark.
from matplotlib import pyplot as plt
from matplotlib.ticker import MultipleLocator
import numpy as np
labels = ['Test 1', 'Test 2']
values = [1, 0.7]
fig, ax = plt.subplots()
plt.vlines(labels, 0, values, colors='dodgerblue', alpha=.4, lw=7)
plt.xlim(-0.5, len(labels) - 0.5) # add some padding left and right of the bars
plt.ylim(0, 1.1) # bars usually have their 0 at the bottom
ax.xaxis.set_minor_locator(MultipleLocator(.2))
plt.tick_params(axis='x', which='both', length=0) # ticks not shown, but position serves for gridlines
plt.grid(axis='both', which='both', ls=':') # optionally set the linestyle of the grid
plt.show()

Second Matplotlib figure doesn't save to file

I've drawn a plot that looks something like the following:
It was created using the following code:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
# 1. Plot a figure consisting of 3 separate axes
# ==============================================
plotNames = ['Plot1','Plot2','Plot3']
figure, axisList = plt.subplots(len(plotNames), sharex=True, sharey=True)
tempDF = pd.DataFrame()
tempDF['date'] = pd.date_range('2015-01-01','2015-12-31',freq='D')
tempDF['value'] = np.random.randn(tempDF['date'].size)
tempDF['value2'] = np.random.randn(tempDF['date'].size)
for i in range(len(plotNames)):
    axisList[i].plot_date(tempDF['date'],tempDF['value'],'b-',xdate=True)
# 2. Create a new single axis in the figure. This new axis sits over
# the top of the axes drawn previously. Make all the components of
# the new single axis invisible except for the x and y labels.
big_ax = figure.add_subplot(111)
big_ax.set_facecolor('none')  # set_axis_bgcolor() no longer exists in current Matplotlib
big_ax.set_xlabel('Date',fontweight='bold')
big_ax.set_ylabel('Random normal',fontweight='bold')
big_ax.tick_params(labelcolor='none', top=False, bottom=False, left=False, right=False)
big_ax.spines['right'].set_visible(False)
big_ax.spines['top'].set_visible(False)
big_ax.spines['left'].set_visible(False)
big_ax.spines['bottom'].set_visible(False)
# 3. Plot a separate figure
# =========================
figure2,ax2 = plt.subplots()
ax2.plot_date(tempDF['date'],tempDF['value2'],'-',xdate=True,color='green')
ax2.set_xlabel('Date',fontweight='bold')
ax2.set_ylabel('Random normal',fontweight='bold')
# Save plot
# =========
plt.savefig('tempPlot.png',dpi=300)
Basically, the rationale for plotting the whole picture is as follows:
1. Create the first figure and plot 3 separate axes using a loop.
2. Plot a single axis in the same figure to sit on top of the graphs drawn previously. Label the x and y axes. Make all other aspects of this axis invisible.
3. Create a second figure and plot data on a single axis.
The plot displays just as I want when using jupyter-notebook but when the plot is saved, the file contains only the second figure.
I was under the impression that plots could have multiple figures and that figures could have multiple axes. However, I suspect I have a fundamental misunderstanding of the differences between plots, subplots, figures and axes. Can someone please explain what I'm doing wrong and explain how to get the whole image to save to a single file.
Matplotlib does not have "plots". In that sense,
plots are figures
subplots are axes
During runtime of a script you can have as many figures as you wish. Calling plt.savefig() will save the currently active figure, i.e. the figure you would get by calling plt.gcf().
You can save any other figure either by providing a figure number num:
plt.figure(num)
plt.savefig("output.png")
or by having a reference to the figure object fig1
fig1.savefig("output.png")
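Both routes can be tried in one short script. This is only a sketch: the toy data are invented, and the in-memory buffers stand in for the output file names so that nothing is written to disk.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig1, ax1 = plt.subplots()
ax1.plot([0, 1, 2], [0, 1, 4])

fig2, ax2 = plt.subplots()
ax2.plot([0, 1, 2], [4, 1, 0])

# The most recently created figure is the "current" one.
current_is_fig2 = plt.gcf() is fig2

# Saving through the figure object does not depend on which figure is current.
buf1, buf2 = io.BytesIO(), io.BytesIO()
fig1.savefig(buf1, format="png")
fig2.savefig(buf2, format="png")

# Alternatively, make fig1 current again by its number and use plt.savefig().
plt.figure(fig1.number)
buf1_again = io.BytesIO()
plt.savefig(buf1_again, format="png")
```

In a real script you would pass file names such as "first.png" and "second.png" instead of the buffers.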
In order to save several figures into one file, one could go the way detailed here: Python saving multiple figures into one PDF file.
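The linked answer is not reproduced here, but one common way to do this is Matplotlib's PdfPages, which writes one page per saved figure. A minimal sketch, with invented toy data and a temporary directory as a hypothetical output location:

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

fig1, ax1 = plt.subplots()
ax1.plot([0, 1, 2], [0, 1, 4])

fig2, ax2 = plt.subplots()
ax2.plot([0, 1, 2], [4, 1, 0])

# Hypothetical output location; use any path you like.
pdf_path = os.path.join(tempfile.mkdtemp(), "figures.pdf")

with PdfPages(pdf_path) as pdf:
    for fig in (fig1, fig2):
        pdf.savefig(fig)  # each call appends one page
    page_count = pdf.get_pagecount()
```

This gives a multi-page PDF rather than a single image; for a single image, the single-figure approach described next is the better fit.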
Another option would be not to create several figures, but a single one, using subplots,
fig = plt.figure()
ax = fig.add_subplot(611)
ax2 = fig.add_subplot(612)
ax3 = fig.add_subplot(613)
ax4 = fig.add_subplot(212)
and then plot the respective graphs to those axes using
ax.plot(x,y)
or in the case of a pandas dataframe df
df.plot(x="column1", y="column2", ax=ax)
This second option can of course be generalized to arbitrary axes positions using subplots on grids. This is detailed in the matplotlib user's guide, "Customizing Location of Subplot Using GridSpec".
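As a rough sketch of the grid approach, the layout of three small axes above one large one could be built with a 6-row GridSpec. The figure size and the toy data are arbitrary choices for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 8))
gs = fig.add_gridspec(nrows=6, ncols=1)

# Three small axes in rows 0..2, one large axes spanning rows 3..5.
small_axes = [fig.add_subplot(gs[row, 0]) for row in range(3)]
big_ax = fig.add_subplot(gs[3:, 0])

for ax in small_axes:
    ax.plot([0, 1, 2], [0, 1, 0])
big_ax.plot([0, 1, 2], [2, 1, 2])
```

Because everything lives in one figure, a single fig.savefig(...) call writes the whole layout to one file.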
Furthermore, it is possible to position an axes (a subplot so to speak) at any position in the figure using fig.add_axes([left, bottom, width, height]) (where left, bottom, width, height are in figure coordinates, ranging from 0 to 1).
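A small sketch of free positioning with fig.add_axes; the rectangle values below are arbitrary and only meant to show the [left, bottom, width, height] convention:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig = plt.figure()

# Main axes filling most of the figure.
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
main_ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

# Smaller axes placed inside the upper-left area of the main axes.
inset_ax = fig.add_axes([0.15, 0.6, 0.25, 0.25])
inset_ax.plot([0, 1, 2, 3], [9, 4, 1, 0])

inset_bounds = inset_ax.get_position().bounds  # (left, bottom, width, height)
```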

How to pass different scatter kwargs to facets in lmplot in Seaborn

I'm trying to map a 3rd variable to the scatter point colour in the Seaborn lmplot. So total_bill on x, tip on y and point colour as function of size.
It works when no faceting is enabled but fails when col is used because the colour array size does not match the size of the data plotted in each facet.
This is my code
import matplotlib as mpl
import seaborn as sns
sns.set(color_codes=True)
# load data
data = sns.load_dataset("tips")
# size of data
print(len(data.index))
### we want to plot scatter point colour as function of variable 'size'
# first, sort the data by 'size' so that high 'size' values are plotted
# over the smaller sizes (so they are more visible)
data = data.sort_values(by=['size'], ascending=True)
scatter_kws = dict()
cmap = mpl.colormaps['Blues']  # mpl.cm.get_cmap() is gone in recent Matplotlib releases
# normalise 'size' variable as float range needs to be
# between 0 and 1 to map to a valid colour
scatter_kws['c'] = data['size'] / data['size'].max()
# map normalised values to colours
scatter_kws['c'] = cmap(scatter_kws['c'].values)
# colour array has same size as data
print(len(scatter_kws['c']))
# this works as intended
g = sns.lmplot(data=data, x="total_bill", y="tip", scatter_kws=scatter_kws)
The above works well and produces the following (not allowed to include images yet, so here's the link):
lmplot with point colour as function of size
However, when I add col='sex' to lmplot (try code below), the issue is that the colour array has the size of the original dataset which is larger than the size of data plotted in each facet. So, for example col='male' has 157 data points so first 157 values from the colour array are mapped to the points (and these aren't even the correct ones). See below:
lmplot with point colour as function of size with col=sex
g = sns.lmplot(data=data, x="total_bill", y="tip", col="sex", scatter_kws=scatter_kws)
Ideally, I'd like to pass an array of scatter_kws to the lmplot so that each facet uses the correct colour array (which I'd calculate before passing to lmplot). But that doesn't seem to be an option.
Any other ideas or workarounds that still allow me to use the functionality of Seaborn's lmplot (meaning, without resorting to re-creating lmplot functionality from FacetGrid)?
In principle the lmplot with different cols seems to be just a wrapper for several regplots. So instead of one lmplot we could use two regplots, one for each sex.
We therefore need to separate the original dataframe into male and female; the rest is rather straightforward.
import matplotlib.pyplot as plt
import seaborn as sns
data = sns.load_dataset("tips")
data = data.sort_values(by=['size'], ascending=True)
# make a new dataframe for males and females
male = data[data["sex"] == "Male"]
female = data[data["sex"] == "Female"]
# get normalized colors for all data
colors = data['size'].values / float(data['size'].max())
# get colors for males / females
colors_male = colors[data["sex"].values == "Male"]
colors_female = colors[data["sex"].values == "Female"]
# colors are values in [0,1] range
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(9,4))
#create regplot for males, put it to left axes
#use colors_male to color the points with Blues cmap
sns.regplot(data=male, x="total_bill", y="tip", ax=ax1,
            scatter_kws={"c": colors_male, "cmap": "Blues"})
# same for females
sns.regplot(data=female, x="total_bill", y="tip", ax=ax2,
            scatter_kws={"c": colors_female, "cmap": "Greens"})
ax1.set_title("Males")
ax2.set_title("Females")
for ax in [ax1, ax2]:
    ax.set_xlim([0,60])
    ax.set_ylim([0,12])
plt.tight_layout()
plt.show()
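The same idea extends to any number of facets: slice the colour values together with the data, one group at a time, and share one normalisation across all axes. The sketch below is not the seaborn solution above. It uses plain ax.scatter (so no regression line) and a synthetic stand-in for the tips dataset, assumed to have the same column names:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the tips dataset (assumption: same column names).
rng = np.random.default_rng(0)
n = 60
data = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n),
    "tip": rng.uniform(1, 10, n),
    "size": rng.integers(1, 7, n),
    "sex": np.tile(["Male", "Female"], n // 2),
})

# One normalisation for all facets, so equal sizes get equal colours everywhere.
norm = plt.Normalize(vmin=data["size"].min(), vmax=data["size"].max())

groups = list(data.groupby("sex"))
fig, axes = plt.subplots(ncols=len(groups), figsize=(9, 4), sharex=True, sharey=True)

collections = {}
for ax, (name, subset) in zip(axes, groups):
    # Sort within the facet so larger sizes are drawn on top.
    subset = subset.sort_values("size")
    collections[name] = ax.scatter(subset["total_bill"], subset["tip"],
                                   c=subset["size"], cmap="Blues", norm=norm)
    ax.set_title(name)

# A single colorbar serves both facets because they share the norm.
fig.colorbar(collections[groups[0][0]], ax=axes, label="size")
```

Because each facet receives only its own rows, the colour array always has the same length as the plotted data.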

How to add a legend to matplotlib pie chart?

Using this example http://matplotlib.org/examples/pie_and_polar_charts/pie_demo_features.html
how could I add a legend to this pie chart?
My problem is that I have one big slice (88.4%), the second largest slice is 10.6%, and the other slices are 0.7% and 0.3%. The labels around the pie don't appear (except for the biggest slice), and neither do the percentage values for the smaller slices. So I guess I can add a legend showing the names and the values. But I haven't found out how...
# -*- coding: UTF-8 -*-
import matplotlib.pyplot as plt
# The slices will be ordered and plotted counter-clockwise.
labels = 'Rayos X', 'RMN en solución', 'Microscopía electrónica', 'Otros'
sizes = [88.4, 10.6, 0.7, 0.3]
colors = ['yellowgreen', 'gold', 'lightskyblue', 'lightcoral']
explode = (0.1, 0, 0, 0)
plt.pie(sizes, explode=explode, labels=labels, colors=colors, shadow=True, startangle=90)
plt.legend(title="técnica")
# Set aspect ratio to be equal so that pie is drawn as a circle.
plt.axis('equal')
plt.show()
I checked your code, and the plt.legend() creates a legend, just how you want it to be; maybe set the loc="lower left", so it does not overlap with the relevant pieces of pie.
For me, the strings are displayed properly, besides the non standard chars - which might cause the problem that they are not displayed to you at all. Only the biggest slice and "Otros" do not contain special chars. Maybe also try to resize the figure, as they might be pushed out of the canvas. Please refer to how to write accents with matplotlib and try again with proper strings.
The percentages are not shown because you did not ask for them. In the example you linked, autopct='%1.1f%%' is what plots the percentages, and you omitted it. In this special case I would rather not plot them, because they would overlap just like the labels on the border, as some slices are too small. Consider adding this information to the legend instead.
Putting it all together (besides the special chars - I had some problems activating TeX), try the following code:
# -*- coding: UTF-8 -*-
import matplotlib.pyplot as plt
# The slices will be ordered and plotted counter-clockwise.
labels = [r'Rayos X (88.4 %)', r'RMN en solucion (10.6 %)',
r'Microscopia electronica (0.7 %)', r'Otros (0.3 %)']
sizes = [88.4, 10.6, 0.7, 0.3]
colors = ['yellowgreen', 'gold', 'lightskyblue', 'lightcoral']
patches, texts = plt.pie(sizes, colors=colors, startangle=90)
plt.legend(patches, labels, loc="best")
# Set aspect ratio to be equal so that pie is drawn as a circle.
plt.axis('equal')
plt.tight_layout()
plt.show()
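Instead of typing the percentages into the labels by hand, they can be derived from the sizes. This is a sketch under the assumption that you run Python 3, where the accented characters need no special handling; names and legend_labels are illustrative variable names:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

names = ["Rayos X", "RMN en solución", "Microscopía electrónica", "Otros"]
sizes = [88.4, 10.6, 0.7, 0.3]
colors = ["yellowgreen", "gold", "lightskyblue", "lightcoral"]

# Build "name (percentage)" labels from the data itself.
total = sum(sizes)
legend_labels = [f"{name} ({100 * size / total:.1f} %)"
                 for name, size in zip(names, sizes)]

fig, ax = plt.subplots()
wedges = ax.pie(sizes, colors=colors, startangle=90)[0]  # first item: the wedge patches
legend = ax.legend(wedges, legend_labels, loc="lower left", title="técnica")
ax.axis("equal")  # draw the pie as a circle
fig.tight_layout()
```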
You can position the legend by passing any of the following values as loc:
best
upper right
upper left
lower left
lower right
right
center left
center right
lower center
upper center
center
import matplotlib.pyplot as plt
# stateData is assumed to be a pandas DataFrame loaded elsewhere
state = stateData['State/UnionTerritory']
cases = stateData['ConfirmedIndianNational']
# pull out the slices with more than 100 cases
explode = stateData.ConfirmedIndianNational.apply(lambda x: 0.2 if x > 100 else 0)
plt.title("Covid 19")
plt.pie(cases, explode=explode, autopct='%1.2f%%', shadow=True, radius=3)
plt.legend(state, loc="center")
plt.show()

Reducing the distance between two boxplots

I'm drawing the boxplot shown below using python and matplotlib. Is there any way I can reduce the distance between the two boxplots on the X axis?
This is the code that I'm using to get the figure above:
import os
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rcParams
rcParams['ytick.direction'] = 'out'
rcParams['xtick.direction'] = 'out'
fig = plt.figure()
xlabels = ["CG", "EG"]
ax = fig.add_subplot(111)
ax.boxplot([values_cg, values_eg])
ax.set_xticks(np.arange(len(xlabels))+1)
ax.set_xticklabels(xlabels, rotation=45, ha='right')
fig.subplots_adjust(bottom=0.3)
ylabels = yticks = np.linspace(0, 20, 5)
ax.set_yticks(yticks)
ax.set_yticklabels(ylabels)
ax.tick_params(axis='x', pad=10)
ax.tick_params(axis='y', pad=10)
plt.savefig(os.path.join(output_dir, "output.pdf"))
And this is an example closer to what I'd like to get visually (although I wouldn't mind if the boxplots were even a bit closer to each other):
You can either change the aspect ratio of plot or use the widths kwarg (doc) as such:
ax.boxplot([values_cg, values_eg], widths=1)
to make the boxes wider.
Try changing the aspect ratio using
ax.set_aspect(1.5) # or some other float
The larger the number, the narrower (and taller) the plot should be:
a circle will be stretched such that the height is num times the width. aspect=1 is the same as aspect='equal'.
http://matplotlib.org/api/axes_api.html#matplotlib.axes.Axes.set_aspect
When your code writes:
ax.set_xticks(np.arange(len(xlabels))+1)
you are putting the ticks at 1 and 2, which is also where boxplot() draws the two boxes by default (just as in the second, "wanted" example, where they sit at 1, 2 and 3).
So I think an alternative solution would be to play with the xticks positions and the xlim of the plot.
For example, calling
ax.set_xlim(-0.5, 3.5)
after ax.boxplot(...) widens the x-range around the boxes, so they are drawn closer together.
positions : array-like, optional
Sets the positions of the boxes. The ticks and limits are automatically set to match the positions. Defaults to range(1, N+1) where N is the number of boxes to be drawn.
https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.boxplot.html
This should do the job!
As @Stevie mentioned, you can use the positions kwarg (doc) to manually set the x-coordinates of the boxes:
ax.boxplot([values_cg, values_eg], positions=[1, 1.3])
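A self-contained sketch of the positions approach, with synthetic data standing in for values_cg and values_eg; the widths and x-limits are arbitrary choices to tune by eye:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
values_cg = rng.normal(10, 2, 50)  # synthetic stand-ins for the asker's data
values_eg = rng.normal(12, 3, 50)

fig, ax = plt.subplots()
bp = ax.boxplot([values_cg, values_eg], positions=[1, 1.3], widths=0.2)

# Put the tick labels where the boxes actually are.
ax.set_xticks([1, 1.3])
ax.set_xticklabels(["CG", "EG"], rotation=45, ha="right")
ax.set_xlim(0.5, 1.8)  # leave some room on both sides

# Each median line is centred on the position of its box.
median_centres = [float(np.mean(line.get_xdata())) for line in bp["medians"]]
```

Keeping widths smaller than the gap between the positions stops the boxes from overlapping.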