How to add a legend to a figure - pandas

I would like to add a legend to my figure. This is the code I used to draw the plot:
fig1, ax1 = plt.subplots()
ax1.legend("Test Legend")
ax1.plot(singleColumn)
This is the plot I get:
[image: fig1 plot]
If I use this code:
print(singleColumn.plot(legend=True))
[image: Perfect Plot]
I get a perfect plot with a legend, but if I plot other figures, all the plots get mixed together, so I would prefer the fig method.
How do I add the name of the data to the fig1 plot?
Best regards
Kai
Edit:
singleColumn looks like this (I displayed the head):
[5 rows x 13 columns]
MESSDATUM
2011-01-10 2.40
2011-02-02 2.17
2011-03-03 2.32
2011-04-06 1.67
2011-05-04 2.56
Name: 2433876, dtype: float64
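A possible fix, offered as a sketch rather than a definitive answer: in the code above, ax1.legend("Test Legend") runs before anything is drawn, and a bare string is treated as a sequence of labels rather than as one label. Giving the line a label when plotting and calling ax1.legend() afterwards should produce the legend on fig1 only. The Series below is a stand-in rebuilt from the head shown above, and single_column is just an illustrative name for singleColumn.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for singleColumn, rebuilt from the head() output in the question
single_column = pd.Series(
    [2.40, 2.17, 2.32, 1.67, 2.56],
    index=pd.to_datetime(
        ["2011-01-10", "2011-02-02", "2011-03-03", "2011-04-06", "2011-05-04"]
    ),
    name=2433876,
)

fig1, ax1 = plt.subplots()
# Plot first and attach the label to the line itself
ax1.plot(single_column.index, single_column.values, label=str(single_column.name))
# Then build the legend from the labelled artists on this Axes
ax1.legend()

legend_labels = [text.get_text() for text in ax1.get_legend().get_texts()]
print(legend_labels)
```

If the pandas plotting helper is preferred, singleColumn.plot(ax=ax1, legend=True) draws onto the same Axes instead of whichever figure happens to be current, which should also keep separate figures from getting mixed up.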

Related

Shift Matplotlib Axes to Match Overlaid Plots

I have a plot that is an overlay of a boxplot and a scatter plot (red data points). Everything is fine except that the red data points do not line up with the boxplot positions on the x axis. I think this is because two different plot methods are used on the same axis. I have noticed that adding the scatter plot effectively shifts the boxplots to the right on the x axis. Plotting the boxplot without the scatter plot does not shift the x axis values, as shown below. The method I'm using below should work, but it does not. Here is my code:
sitenames = df3.plant_name.unique().tolist()
months = ['JANUARY','FEBRUARY','MARCH','APRIL','MAY','JUNE','JULY','AUGUST','SEPTEMBER','OCTOBER','NOVEMBER','DECEMBER']
from datetime import datetime
monthn = datetime.now().month
newList = list()
for i in range(monthn-1):
    newList.append(months[(monthn-1+i)%12])
print(newList)
for i, month in enumerate(newList,1):
    #plt.figure()
    fig, ax = plt.subplots()
    ax = df3[df3['month']==i].boxplot(by='plant_name',column='Var')
    df3c[df3c['month']==i].plot(kind='scatter', x='plant_name', y='Var',ax=ax,color='r',label='CY')
    plt.xticks(rotation=90, ha='right')
    plt.suptitle('1991-2020 ERA5 WIND PRODUCTION',y=1)
    plt.title(months[i-1])
    plt.xlabel('SITE')
    plt.ylabel('VARIABILITY')
    plt.legend()
    plt.show()
Here is the plot from "February" that shows the mis-aligned x axis:
Here are partial rows for df3, df3c:
df3.head(3)
Out[223]:
      plant_name  month  year     power_kwh  power_kwh_rhs    Var
0  BII NEE STIPA      1  1991  11905.826075   14673.281223  -18.9
1  BII NEE STIPA      1  1992  14273.927688   14673.281223   -2.7
2  BII NEE STIPA      1  1993  12559.828360   14673.281223  -14.4
df3c.head(3)
Out[224]:
      plant_name  month    year     power_kwh  power_kwh_rhs    Var
0  BII NEE STIPA      1  2021.0  14863.643952   14673.281223    1.3
1  BII NEE STIPA      2  2021.0   9663.393155   12388.328084  -22.0
2  DOS ARBOLITOS      1  2021.0  36819.502285   36882.205762   -0.2
I have found a similar problem but can't see how to insert this solution to my code: Shift matplotlib axes to match eachother
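One way to approach this, as a hedged sketch: as far as I can tell, the grouped pandas boxplot places its boxes at x positions 1..N (groups in sorted order), while a scatter plot given string x values uses categorical positions starting at 0, which would explain the one-step shift. Mapping each site name to its box position and scattering against those numbers should line the two up. The tiny frames below are hypothetical stand-ins for one month of df3 and df3c, and the names sites and position are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for one month of df3 (history) and df3c (current year)
df3 = pd.DataFrame({
    "plant_name": ["A", "A", "A", "B", "B", "B"],
    "Var": [-18.9, -2.7, -14.4, 3.1, -5.0, 7.2],
})
df3c = pd.DataFrame({"plant_name": ["A", "B"], "Var": [1.3, -0.2]})

fig, ax = plt.subplots()
# Draw the boxplot on the Axes we created instead of letting pandas make a new one
df3.boxplot(by="plant_name", column="Var", ax=ax)

# Boxes sit at 1..N in sorted group order, so map each site to that position
sites = sorted(df3["plant_name"].unique())
position = {name: k for k, name in enumerate(sites, start=1)}
scatter_x = list(df3c["plant_name"].map(position))

# Scatter against the numeric positions rather than the site names
ax.scatter(scatter_x, df3c["Var"], color="r", label="CY", zorder=3)
ax.legend()
```

In the loop from the question, the same mapping would be applied to the month-filtered frames; passing ax=ax to boxplot also avoids the extra empty figure that fig, ax = plt.subplots() otherwise leaves behind.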

How to change the data for x-axis and y-axis in sns.distplot

I have a pd.Series as follows:
train_df['area']
0 68.06
1 125.55
2 132.00
3 57.00
4 129.00
5 223.35
6 78.94
7 76.00
Name: area, dtype: float64
When I draw an sns.distplot() of it, I get a plot as follows:
But what I really want is for the x-axis and y-axis to show [0, 1, ..., 7] and [40, 60, ..., 240].
How can I fix this?
Could anyone help me?
Thanks in advance.
Try to use plt's axis:
fig, ax = plt.subplots(1,1)
# pass ax here
sns.distplot(..., ax=ax)
ax.set_xticklabels(range(8))
ax.set_yticklabels(ax.get_yticks()*20000+40)
plt.show()
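An alternative reading of the question, sketched under that assumption: if the goal is the row index (0 to 7) on the x axis and the area values on the y axis, then a distribution plot is not the right tool, because it shows how often values occur rather than the values themselves. Plotting the Series directly gives those axes without relabelling ticks. Note also that sns.distplot is deprecated in recent seaborn releases. The variable area below stands in for train_df['area'].

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for train_df['area'], copied from the question
area = pd.Series(
    [68.06, 125.55, 132.00, 57.00, 129.00, 223.35, 78.94, 76.00], name="area"
)

fig, ax = plt.subplots()
# x is the row index, y is the value itself
ax.plot(area.index, area.values, marker="o")
ax.set_xticks(list(area.index))
ax.set_xlabel("index")
ax.set_ylabel(area.name)

plotted_x = list(ax.lines[0].get_xdata())
plotted_y = list(ax.lines[0].get_ydata())
```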

Is it possible to create a matplotlib barplot of timed values using simple notation?

Is something like DataFrame['2002':'2005'][['Value1','Value2']].bar(any args?) a possible way to create a bar plot of a DataFrame whose two value columns span a long period of time?
I can create a simple line plot, but I want bars (one bar per day).
If there is no such simple way, what would be the simplest one?
Well, it takes a bit of manual coding, but it is possible and it works... (thanks to these guys here...)
Suppose you have a dataframe like
df.head()
temperature pressure
2018-10-01 21.860016 1031.418143
2018-10-02 20.590761 1063.008550
2018-10-03 21.356381 1047.183300
2018-10-04 20.393329 1037.710172
2018-10-05 20.716377 1027.680324
... ... ...
Then you'd need to import
import matplotlib.pyplot as plt
import matplotlib.dates as md
to plot your data like
fig, ax = plt.subplots()
ax.xaxis_date()  # treat x values as dates
# shift each series by 0.2 days so the two 0.4-day-wide bars sit side by side
ax.bar(df.index - md.num2timedelta(.2), df.temperature, .4, color='r', label=df.temperature.name)
ax2 = ax.twinx()  # second y axis for pressure
ax2.bar(df.index + md.num2timedelta(.2), df.pressure, .4, color='b', label=df.pressure.name)
ax.xaxis.set_major_formatter(md.DateFormatter('%b %d'))
ax.xaxis.set_major_locator(md.WeekdayLocator())  # one major tick per week
fig.legend()
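For comparison, here is a sketch of the shorter notation the question was hoping for, using the pandas plotting accessor on a date-sliced frame. The frame, its values, and the dates are made up for illustration; only the column names Value1 and Value2 come from the question. Keep in mind that pandas treats each row as a category here, so this suits short date ranges and columns on a similar scale.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily data with the column names used in the question
index = pd.date_range("2002-01-01", periods=6, freq="D")
frame = pd.DataFrame(
    {"Value1": [3, 5, 2, 4, 6, 1], "Value2": [2, 4, 3, 5, 1, 2]}, index=index
)

# Label-based slicing on a DatetimeIndex includes both end points
subset = frame.loc["2002-01-02":"2002-01-05", ["Value1", "Value2"]]

# One group of bars per day, one bar per column
ax = subset.plot(kind="bar")
ax.set_xticklabels(list(subset.index.strftime("%Y-%m-%d")), rotation=45, ha="right")

bars_per_column = [len(container) for container in ax.containers]
```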

how to plot a dataframe with two different axes in pandas matplotlib

So my data frame is like this:
          6month  final-formula  numPatients6month
160243.0       1       0.401193                417
172110.0       2       0.458548                323
157638.0       3       0.369403                268
180306.0       4       0.338761                238
175324.0       5       0.247011                237
170709.0       6       0.328555                218
195762.0       7       0.232895                190
172571.0       8       0.319588                194
172055.0       9       0.415517                145
174609.0      10       0.344697                132
174089.0      11       0.402965                106
196130.0      12       0.375000                 80
and I am plotting the final-formula column against 6month:
dffinal.plot(kind='bar',x='6month', y='final-formula')
import matplotlib.pyplot as plt
plt.show()
So far this is fine: it shows 6month on the x axis and final-formula on the y axis.
What I want is to show numPatients6month in the same plot, but on a second y axis,
as in the diagram below. I want to show numPatients6month at position 1, or simply show that number above each bar.
I tried to do that with twinx, but it seems to be meant for the case where we have two plots and want to draw them in the same figure.
fig = plt.figure()
ax = fig.add_subplot(111)
ax2 = ax.twinx()
ax.set_ylabel('numPatients6month')
I appreciate your help :)
This is the solution that resolved it. I am sharing it here in case it helps someone :)
ax=dffinal.plot(kind='bar',x='6month', y='final-formula')
import matplotlib.pyplot as plt
ax2 = ax.twinx()
ax2.spines['right'].set_position(('axes', 1.0))
dffinal.plot(ax=ax2,x='6month', y='numPatients6month')
plt.show()
Store the AxesSubplot in a variable called ax
ax = dffinal.plot(kind='bar',x='6month', y='final-formula')
and then
ax.tick_params(labeltop=False, labelright=True)
This will bring the labels to the right as well.
Is this enough, or would you also like to know how to add values to the top of the bars? Your question indicated that either of the two would be enough.
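For the other option mentioned in the question, showing numPatients6month above each bar, a minimal sketch could look like the following. It uses only the first four rows of the data from the question to stay short, and the names labels and label_texts are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# First four rows of the frame shown in the question
dffinal = pd.DataFrame({
    "6month": [1, 2, 3, 4],
    "final-formula": [0.401193, 0.458548, 0.369403, 0.338761],
    "numPatients6month": [417, 323, 268, 238],
})

ax = dffinal.plot(kind="bar", x="6month", y="final-formula", legend=False)

# The bars are drawn in row order, so pair each bar with its patient count
labels = []
for bar, count in zip(ax.patches, dffinal["numPatients6month"]):
    x_center = bar.get_x() + bar.get_width() / 2
    labels.append(
        ax.text(x_center, bar.get_height(), str(count), ha="center", va="bottom")
    )

label_texts = [text.get_text() for text in labels]
```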

Tick labels overlap in pandas bar chart

TL;DR: In pandas how do I plot a bar chart so that its x axis tick labels look like those of a line chart?
I made a time series with evenly spaced intervals (one item each day) and can plot it like such just fine:
intensity[350:450].plot()
plt.show()
But switching to a bar chart created this mess:
intensity[350:450].plot(kind = 'bar')
plt.show()
I then created a bar chart using matplotlib directly but it lacks the nice date time series tick label formatter of pandas:
def bar_chart(series):
    fig, ax = plt.subplots(1)
    ax.bar(series.index, series)
    fig.autofmt_xdate()
    plt.show()
bar_chart(intensity[350:450])
Here's an excerpt from the intensity Series:
intensity[390:400]
2017-03-07 3
2017-03-08 0
2017-03-09 3
2017-03-10 0
2017-03-11 0
2017-03-12 0
2017-03-13 2
2017-03-14 0
2017-03-15 3
2017-03-16 0
Freq: D, dtype: int64
I could go all out on this and create the tick labels entirely by hand, but I'd rather not have to baby matplotlib. I'd rather let pandas do its job and do what it did in the very first figure, but with a bar plot. So how do I do that?
Pandas bar plots are categorical plots. They create one tick (+label) for each category. If the categories are dates and those dates are continuous, one may aim at leaving certain labels out, e.g. to label only every fifth category,
ax = series.plot(kind="bar")
ax.set_xticklabels([t if not i%5 else "" for i,t in enumerate(ax.get_xticklabels())])
In contrast, matplotlib bar charts are numerical plots. Here a useful ticker can be applied, which ticks the dates weekly, monthly or whatever is needed.
In addition, matplotlib gives full control over the tick positions and their labels.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates
index = pd.date_range("2018-01-26", "2018-05-05")
series = pd.Series(np.random.rayleigh(size=100), index=index)
plt.bar(series.index, series.values)
plt.gca().xaxis.set_major_locator(dates.MonthLocator())
plt.gca().xaxis.set_major_formatter(dates.DateFormatter("%b\n%Y"))
plt.show()
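For the pandas-only route described earlier in this answer (keeping the categorical bar plot and thinning its labels), here is a small variation that also formats the surviving labels as short dates. The data is synthetic, and labelling every seventh bar is an arbitrary choice for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic daily series, four weeks long
index = pd.date_range("2018-01-26", periods=28, freq="D")
series = pd.Series([k % 4 for k in range(28)], index=index)

# pandas draws one bar and one tick per row
ax = series.plot(kind="bar")

# Keep a formatted date on every seventh tick and blank out the rest
labels = [
    day.strftime("%Y-%m-%d") if i % 7 == 0 else ""
    for i, day in enumerate(series.index)
]
ax.set_xticklabels(labels, rotation=0)

visible_labels = [label for label in labels if label]
```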