Matplotlib Legend in For Loop - pandas

I am trying to plot multiple lines with their corresponding legend:
regions = ['Wales', 'Scotland', 'London', 'East of England', 'East Midlands',
           'Yorkshire and The Humber', 'South East', 'South West',
           'West Midlands', 'North West', 'North East']
plt.figure(figsize = (10,8))
plt.title('Number of Vehicles per Region')
plt.xlabel('Year')
plt.ylabel('Number of Vehicles')
plt.legend()
for i in regions:
    region = raw_miles_df.loc[i].sum(axis = 1).reset_index()
    region = region.rename(columns = {'count_date':'Year', 0: 'vehicles'})
    region['Year'] = region['Year'].apply(lambda x: x.year)
    region = region.groupby(['Year']).agg(vehicles = ('vehicles', lambda x: x.mean().round(2)))
    plt.plot(region)
    plt.legend(i)
The method I have is not working:

You need to move plt.legend out of the loop and make it plt.legend(regions). As you can see in the legend, it is treating the string 'North East', which is the last item in regions, as an iterable from which to draw the categories.
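A minimal sketch of that fix, using made-up numbers in place of raw_miles_df (the yearly_means dictionary and its values are assumptions for illustration only). It uses the label= argument of plot as a variant of plt.legend(regions); both should give one legend entry per region as long as the legend is built once, after the loop.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit in an interactive session
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the per-region yearly means (not the asker's data).
regions = ['Wales', 'Scotland', 'London']
years = np.arange(2000, 2006)
rng = np.random.default_rng(0)
yearly_means = {name: rng.uniform(1e4, 5e4, size=years.size) for name in regions}

fig, ax = plt.subplots(figsize=(10, 8))
ax.set_title('Number of Vehicles per Region')
ax.set_xlabel('Year')
ax.set_ylabel('Number of Vehicles')

for name in regions:
    # label= attaches the region name to this particular line.
    ax.plot(years, yearly_means[name], label=name)

# A single call after the loop collects every labelled line.
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Attaching the label at plot time keeps each name next to the data it describes, so the legend stays correct even if the plotting order changes.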
But you can make it easier on yourself by using seaborn:
import seaborn as sns
# aggregate your data outside of the loop
# then call lineplot
aggdata = df.groupby(...)
sns.lineplot(x=x_column, y=y_column, hue=category_column, data=aggdata)

Related

Stacked Bar Chart matplotlib or seaborn

I have the following dataframe:
import pandas as pd
data = {'country': ['US', 'DE', 'IT', 'US', 'DE', 'IT', 'US', 'DE', 'IT'],
'year': [2000,2000,2000,2001,2001,2001,2002,2002,2002],
'share': [0.5, 0.3, 0.2, 0.6,0.1,0.3,0.4,0.2,0.4]}
data = pd.DataFrame(data)
I want to display the data with a stacked bar chart.
X-axis: year,
Y-axis: share,
Color: country
All the three bars for 2000, 2001 and 2002 should have the same height (for each year, the total of the share == 1)
You can use a pivot and plot.bar with stacked=True:
data.pivot(index='year', columns='country', values='share').plot.bar(stacked=True)
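To see what the pivot hands to the plotting call, the sketch below rebuilds the question's dataframe and inspects the wide table. It passes index, columns and values as keyword arguments, which recent pandas versions require; the names wide and row_totals are just illustrative.

```python
import pandas as pd

# Same data as in the question.
data = pd.DataFrame({
    'country': ['US', 'DE', 'IT'] * 3,
    'year': [2000] * 3 + [2001] * 3 + [2002] * 3,
    'share': [0.5, 0.3, 0.2, 0.6, 0.1, 0.3, 0.4, 0.2, 0.4],
})

# One row per year, one column per country: each row becomes one stacked bar.
wide = data.pivot(index='year', columns='country', values='share')

# Each year's shares should add up to 1, so all bars end at the same height.
row_totals = wide.sum(axis=1)
print(wide)
print(row_totals)

# wide.plot.bar(stacked=True)  # draws the chart (needs matplotlib installed)
```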

bi-directional bar chart with annotation in python plotly

I have a pandas dataset; a toy version of it can be created with this:
#creating a toy pandas dataframe
s1 = pd.Series(['dont have a mortgage',-31.8,'have mortgage',15.65])
s2 = pd.Series(['have utility bill arrears',-21.45,'',0])
s3 = pd.Series(['have interest only mortgage',-19.59,'',0])
s4 = pd.Series(['bank with challenger bank',-19.24,'bank with a traditional bank',32.71])
df = pd.DataFrame([list(s1),list(s2),list(s3),list(s4)], columns = ['label1','value1','label2','value2'])
I want to create a bar chart that looks like this version I hacked together in Excel.
I want to be able to supply RGB values to customise the two colours for the left and right bars (currently blue and orange).
I tried different versions using "fig.add_trace(go.Bar", but I am brand new to plotly and can't get anything to work with different coloured bars on one row and an annotation under each bar.
All help greatly appreciated!
thanks
To create a double-sided bar chart, you can create two subplots with shared x- and y-axes. Each subplot is a horizontal bar chart with a specified marker color:
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
# define data set
s1 = pd.Series(['dont have a mortgage',-31.8,'have mortgage',15.65])
s2 = pd.Series(['have utility bill arrears',-21.45,'',0])
s3 = pd.Series(['have interest only mortgage',-19.59,'',0])
s4 = pd.Series(['bank with challenger bank',-19.24,'bank with a traditional bank',32.71])
df = pd.DataFrame([list(s1),list(s2),list(s3),list(s4)], columns = ['label1','value1','label2','value2'])
# create subplots
fig = make_subplots(rows=1, cols=2, specs=[[{}, {}]], shared_xaxes=True,
                    shared_yaxes=True, horizontal_spacing=0)
fig.append_trace(go.Bar(y=df.index, x=df.value1, orientation='h', width=0.4, showlegend=False, marker_color='#4472c4'), 1, 1)
fig.append_trace(go.Bar(y=df.index, x=df.value2, orientation='h', width=0.4, showlegend=False, marker_color='#ed7d31'), 1, 2)
fig.update_yaxes(showticklabels=False) # hide all yticks
The annotations need to be added separately:
annotations = []
for i, row in df.iterrows():
    if row.label1 != '':
        annotations.append({
            'xref': 'x1',
            'yref': 'y1',
            'y': i,
            'x': row.value1,
            'text': row.value1,
            'xanchor': 'right',
            'showarrow': False})
        annotations.append({
            'xref': 'x1',
            'yref': 'y1',
            'y': i-0.3,
            'x': -1,
            'text': row.label1,
            'xanchor': 'right',
            'showarrow': False})
    if row.label2 != '':
        annotations.append({
            'xref': 'x2',
            'yref': 'y2',
            'y': i,
            'x': row.value2,
            'text': row.value2,
            'xanchor': 'left',
            'showarrow': False})
        annotations.append({
            'xref': 'x2',
            'yref': 'y2',
            'y': i-0.3,
            'x': 1,
            'text': row.label2,
            'xanchor': 'left',
            'showarrow': False})
fig.update_layout(annotations=annotations)
fig.show()
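The four near-identical dictionaries in the annotation loop can also be generated by a small helper. The sketch below is one possible refactoring, not part of the original answer: bar_annotations is a hypothetical function name, and it converts the numeric value to a string explicitly for the annotation text. It only builds the list of dictionaries, so it runs without plotly.

```python
import pandas as pd

def bar_annotations(row_index, value, label, axis_number, side):
    """Return the value and label annotations for one bar (hypothetical helper).

    side is 'left' for the negative subplot and 'right' for the positive one.
    """
    if label == '':
        return []  # empty rows get no annotations, as in the original loop
    anchor = 'right' if side == 'left' else 'left'
    label_x = -1 if side == 'left' else 1
    common = {'xref': f'x{axis_number}', 'yref': f'y{axis_number}',
              'xanchor': anchor, 'showarrow': False}
    return [
        {**common, 'x': value, 'y': row_index, 'text': str(value)},       # number at the bar end
        {**common, 'x': label_x, 'y': row_index - 0.3, 'text': label},    # label under the bar
    ]

df = pd.DataFrame(
    [['dont have a mortgage', -31.8, 'have mortgage', 15.65],
     ['have utility bill arrears', -21.45, '', 0],
     ['have interest only mortgage', -19.59, '', 0],
     ['bank with challenger bank', -19.24, 'bank with a traditional bank', 32.71]],
    columns=['label1', 'value1', 'label2', 'value2'])

annotations = []
for i, row in df.iterrows():
    annotations += bar_annotations(i, row.value1, row.label1, 1, 'left')
    annotations += bar_annotations(i, row.value2, row.label2, 2, 'right')

# fig.update_layout(annotations=annotations)  # same call as in the answer above
```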

How to add markers on legend and graph - matplotlib

I have the following code:
from matplotlib import pyplot as plt
import seaborn as sns
fig = plt.figure()
fig.suptitle('Average GPA and Standard Deviation per course combination', fontsize=15)
plt.xlabel('Standard Deviation of Average GPA', fontsize=12)
plt.ylabel('Average GPA', fontsize=12)
colors = ['#E74C3C', '#76448A', '#3498DB', '#17A589', '#F1C40F', '#F39C12', '#CA6F1E', '#B3B6B7', '#34495E',
          '#F5B7B1']
marker = ['.','v','^','1','2','8','p','P','x','X']
g = sns.scatterplot(x=all_stdev, y=all_gpas, hue=final_courses)
g.legend(loc='center left', bbox_to_anchor=(1, 0.5), ncol=1)
plt.show()
all_stdev, all_gpas and final_courses are lists that change every time depending on the user, since they are recommendations based on the user's input. The result I get for a particular student is the following:
I tried adding markers to make the results easier for the user to read, but nothing I tried worked: the markers appeared in the graph all in the same color, while the legend still showed all the colors as above. I need the markers to appear both in the graph and in the legend. Is there a way to do this? I have a list of markers that I would like to use in the code provided.
You need to pass the markers and the colors to scatterplot as parameters (style, markers and palette).
There is still another problem with the markers. Seaborn complains: Filled and line art markers cannot be mixed. So you need to select either filled or line art markers.
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np
fig = plt.figure()
fig.suptitle('Average GPA and Standard Deviation per course combination', fontsize=15)
plt.xlabel('Standard Deviation of Average GPA', fontsize=12)
plt.ylabel('Average GPA', fontsize=12)
colors = ['#E74C3C', '#76448A', '#3498DB', '#17A589', '#F1C40F', '#F39C12', '#CA6F1E', '#B3B6B7', '#34495E', '#F5B7B1']
# marker = ['.', 'v', '^', '1', '2', '8', 'p', 'P', 'x', 'X']
marker = ['o', 'v', '^', '8', '*', 'P', 'D', 'X', 's', 'p']
N = 30
final_courses = np.random.randint(1,11, N) * 10
all_stdev = np.random.uniform(0, 2, N)
all_gpas = np.random.uniform(3, 4, N)
g = sns.scatterplot(x=all_stdev, y=all_gpas, hue=final_courses, style=final_courses, palette=colors, markers=marker)
g.legend(loc='center left', bbox_to_anchor=(1, 0.5), ncol=1)
plt.show()
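To find out which entries of a marker list are filled and which are line art, one option is matplotlib's MarkerStyle.is_filled(). The sketch below applies it to the marker list from the question; split_markers is a hypothetical helper name, and the approach is only a convenience check, not something seaborn requires you to call.

```python
from matplotlib.markers import MarkerStyle

def split_markers(markers):
    """Split marker codes into (filled, line_art) lists, keeping their order."""
    filled = [m for m in markers if MarkerStyle(m).is_filled()]
    line_art = [m for m in markers if not MarkerStyle(m).is_filled()]
    return filled, line_art

# Marker list from the question, which mixes both kinds.
question_markers = ['.', 'v', '^', '1', '2', '8', 'p', 'P', 'x', 'X']
filled, line_art = split_markers(question_markers)
print('filled:  ', filled)
print('line art:', line_art)

# Marker list used in the answer's code.
answer_markers = ['o', 'v', '^', '8', '*', 'P', 'D', 'X', 's', 'p']
answer_filled, answer_line_art = split_markers(answer_markers)
```

If either list comes back empty, the markers are all of one kind and can be passed to scatterplot together.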

How can we give names to each state on the basemap?

Can anyone help me work out how to give names to each state on the map itself? I am able to plot the data points, but I am not able to name the states because I am using a shapefile. If I had been plotting entirely from lats and longs, I would have taken the mean of each state and used plt.text(x, y, statename), but I don't know how to do that with shapefiles.
Below is the Python code, using the matplotlib library:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap as Basemap
from matplotlib.colors import rgb2hex
from matplotlib.patches import Polygon
# Lambert Conformal map of lower 48 states.
fig=plt.figure(figsize=(15,9))
m = Basemap(llcrnrlon=-119,llcrnrlat=22,urcrnrlon=-64,urcrnrlat=49,
            projection='lcc',lat_1=33,lat_2=45,lon_0=-95,resolution='c')
# draw state boundaries.
# data from U.S Census Bureau
# http://www.census.gov/geo/www/cob/st2000.html
shp_info = m.readshapefile('cb_2018_us_state_20m','states',drawbounds=True)
# population density by state from
# http://en.wikipedia.org/wiki/List_of_U.S._states_by_population_density
popdensity = {
'New Jersey': 438.00,
'Rhode Island': 387.35,
'Massachusetts': 312.68,
'Connecticut': 271.40,
'Maryland': 209.23,
'New York': 155.18,
'Delaware': 154.87,
'Florida': 114.43,
'Ohio': 107.05,
'Pennsylvania': 105.80,
'Illinois': 86.27,
'California': 83.85,
'Hawaii': 72.83,
'Virginia': 69.03,
'Michigan': 67.55,
'Indiana': 65.46,
'North Carolina': 63.80,
'Georgia': 54.59,
'Tennessee': 53.29,
'New Hampshire': 53.20,
'South Carolina': 51.45,
'Louisiana': 39.61,
'Kentucky': 39.28,
'Wisconsin': 38.13,
'Washington': 34.20,
'Alabama': 33.84,
'Missouri': 31.36,
'Texas': 30.75,
'West Virginia': 29.00,
'Vermont': 25.41,
'Minnesota': 23.86,
'Mississippi': 23.42,
'Iowa': 20.22,
'Arkansas': 19.82,
'Oklahoma': 19.40,
'Arizona': 17.43,
'Colorado': 16.01,
'Maine': 15.95,
'Oregon': 13.76,
'Kansas': 12.69,
'Utah': 10.50,
'Nebraska': 8.60,
'Nevada': 7.03,
'Idaho': 6.04,
'New Mexico': 5.79,
'South Dakota': 3.84,
'North Dakota': 3.59,
'Montana': 2.39,
'Wyoming': 1.96,
'Alaska': 0.42}
# choose a color for each state based on population density.
colors={}
statenames=[]
cmap = plt.cm.hot # use 'hot' colormap
vmin = 0; vmax = 450 # set range.
for shapedict in m.states_info:
    statename = shapedict['NAME']
    # skip DC and Puerto Rico.
    if statename not in ['District of Columbia','Puerto Rico']:
        pop = popdensity[statename]
        # calling colormap with value between 0 and 1 returns
        # rgba value. Invert color range (hot colors are high
        # population), take sqrt root to spread out colors more.
        colors[statename] = cmap(1.-np.sqrt((pop-vmin)/(vmax-vmin)))[:3]
    statenames.append(statename)
# cycle through state names, color each one.
ax = plt.gca() # get current axes instance
for nshape,seg in enumerate(m.states):
    # skip DC and Puerto Rico.
    if statenames[nshape] not in ['District of Columbia','Puerto Rico']:
        color = rgb2hex(colors[statenames[nshape]])
        poly = Polygon(seg,facecolor=color,edgecolor=color)
        plt.text()
        ax.add_patch(poly)
plt.title('Filling States with Density of Merchants')
x, y = m(-80.2416355,37.7652076 )
plt.plot(x, y, 'ok', markersize=10)
plt.show()
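The "mean of each state" idea from the question can also be applied to shapefile segments. The sketch below is an assumption-laden illustration, not a tested Basemap recipe: label_positions and polygon_area are hypothetical helpers, the input is a synthetic list of polygons, and it assumes that segments[i] belongs to names[i] in the same way that m.states and statenames are paired in the code above. For states made of several polygons it keeps only the largest one, so that islands do not each get a label.

```python
import numpy as np

def polygon_area(points):
    """Unsigned polygon area from the shoelace formula."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def label_positions(segments, names):
    """Map each name to the mean vertex position of its largest polygon."""
    best = {}  # name -> (area, (x, y))
    for seg, name in zip(segments, names):
        pts = np.asarray(seg, dtype=float)
        area = polygon_area(pts)
        if name not in best or area > best[name][0]:
            centre = pts.mean(axis=0)
            best[name] = (area, (float(centre[0]), float(centre[1])))
    return {name: xy for name, (area, xy) in best.items()}

# Synthetic stand-in for m.states / statenames.
segments = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],      # "mainland" of state A
    [(5, 5), (6, 5), (5, 6)],              # small "island" of state A
    [(10, 0), (14, 0), (14, 2), (10, 2)],  # state B
]
names = ['A', 'A', 'B']
positions = label_positions(segments, names)
print(positions)
```

With the Basemap code above, something like label_positions(m.states, statenames) followed by plt.text(x, y, name, ha='center') for each entry might replace the empty plt.text() call, assuming the segment vertices are already in map projection coordinates.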

Different scatterplot markers with the same label

I am having issues similar to "Matplotlib, legend with multiple different markers with one label". I was able to achieve the following thanks to the question "Combine two Pyplot patches for legend".
import pylab
fig = pylab.figure()
figlegend = pylab.figure(figsize=(3,2))
ax = fig.add_subplot(111)
# plain string literals: the ur'' prefix is Python 2 only
point1 = ax.scatter(range(3), range(1,4), 250, marker='$\u2640$', label = 'S', edgecolor = 'green')
point2 = ax.scatter(range(3), range(2,5), 250, marker='$\u2640$', label = 'I', edgecolor = 'red')
point3 = ax.scatter(range(1,4), range(3), 250, marker='$\u2642$', label = 'S', edgecolor = 'green')
point4 = ax.scatter(range(2,5), range(3), 250, marker='$\u2642$', label = 'I', edgecolor = 'red')
figlegend.legend(((point1, point3), (point2, point4)), ('S','I'), loc = 'center', scatterpoints = 1, handlelength = 1)
figlegend.show()
pylab.show()
However, my two (venus and mars) markers overlap in the legend. I tried playing with handlelength, but that doesn't seem to help. Any suggestions or comments would be helpful.
A possible workaround is to create a two column legend with blank labels in the first column:
figlegend.legend((point1, point2, point3, point4), (' ', ' ', 'S', 'I'),
                 loc = 'center', scatterpoints = 1, ncol = 2)
Here's my workaround MWE. I actually plot two extra "plots", point_g and point_r, which have the legend handles we will use. I then cover them up with a white square marker and plot the remaining plots as desired.
import matplotlib.pyplot as plt
plt.rc('text', usetex=True)
plt.rc('text', **{'latex.preamble': '\\usepackage{wasysym}'})
plt.rc('lines', **{'markersize':20})
fig = plt.figure()
point_g, = plt.plot((0,), (0,), ls='none', marker='$\\male\\female$', mec='g')
point_r, = plt.plot((0,), (0,), ls='none', marker='$\\male\\female$', mec='r')
plt.plot((0,), (0,), marker='s', mec='w', mfc='w')
plt.plot(range(3), range(1,4), ls='none', marker='$\\male$', mec='g')
plt.plot(range(3), range(2,5), ls='none', marker='$\\male$', mec='r')
plt.plot(range(1,4), range(3), ls='none', marker='$\\female$', mec='g')
plt.plot(range(2,5), range(3), ls='none', marker='$\\female$', mec='r')
plt.axis([-0.1, 4.1, -0.1, 4.1])
plt.legend((point_g, point_r), ('Green', 'Red'), markerscale=1.6, numpoints=1,
           borderpad=0.8, handlelength=3, labelspacing=1)
plt.show()
Note: You do not need the LaTeX preamble if you use unicode symbols. I couldn't get them working on my system (Linux), so I used the LaTeX symbols. This method should work with all symbols: just remove the plt.rc commands and change \\male and \\female to your unicode characters.
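A different technique from the two workarounds above, offered only as a sketch: matplotlib's HandlerTuple legend handler can draw the handles of a tuple side by side instead of on top of each other when ndivide=None is passed. The example uses 'o' and '^' as stand-ins for the Venus and Mars symbols, and the handlelength value is a guess that may need tuning for large markers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit in an interactive session
import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerTuple

fig, ax = plt.subplots()
# 'o' and '^' stand in for the two symbols; swap in your own markers.
p1 = ax.scatter(range(3), range(1, 4), s=250, marker='o', facecolors='none', edgecolors='green')
p2 = ax.scatter(range(3), range(2, 5), s=250, marker='o', facecolors='none', edgecolors='red')
p3 = ax.scatter(range(1, 4), range(3), s=250, marker='^', facecolors='none', edgecolors='green')
p4 = ax.scatter(range(2, 5), range(3), s=250, marker='^', facecolors='none', edgecolors='red')

# Each tuple becomes one legend entry; ndivide=None gives every handle in the
# tuple its own slice of the handle box, so the markers sit next to each other.
legend = ax.legend(
    [(p1, p3), (p2, p4)], ['S', 'I'],
    handler_map={tuple: HandlerTuple(ndivide=None)},
    scatterpoints=1, handlelength=3,
)
legend_labels = [text.get_text() for text in legend.get_texts()]
```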