Subplot multiindex data by level - pandas

This is my MultiIndex data:
Month  Hour      Hi
1      9      84.39
       10    380.41
       11    539.06
       12    588.70
       13    570.62
       14    507.42
       15    340.42
       16     88.91
2      8      69.31
       9     285.13
       10    474.95
       11    564.42
       12    600.11
       13    614.36
       14    539.79
       15    443.93
       16    251.57
       17     70.51
I want to make subplots where each subplot represents one Month. The x axis is Hour and the y axis is Hi for the respective month.
This answer gives a nice approach, as follows:
levels = df.index.levels[0]
fig, axes = plt.subplots(len(levels), figsize=(3, 25))
for level, ax in zip(levels, axes):
    df.loc[level].plot(ax=ax, title=str(level))
plt.tight_layout()
I want a 1x2 grid of subplots instead of the vertical arrangement above. Later, with larger data, I want a 3x4 grid or even larger.
How can I do it?

You can do it directly in pandas by passing a layout:
df.Hi.unstack(0).fillna(0).plot(kind='line',subplots=True, layout=(1,2))

Pass the number of rows and columns to plt.subplots:
levels = df.index.levels[0]
# 1 row, one column per level
fig, axes = plt.subplots(1, len(levels), figsize=(6, 3))
for level, ax in zip(levels, axes):
    df.loc[level].plot(ax=ax, title=str(level))
plt.tight_layout()
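For the 3x4 case mentioned in the question, the same idea can be extended to a 2-D grid. The following is only a sketch under assumptions: the data is made up to match the question's shape, the grid width `ncols = 4` is an arbitrary choice, and the `Agg` backend line is there only so it runs without a display.

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data shaped like the question's: (Month, Hour) MultiIndex, one "Hi" column.
index = pd.MultiIndex.from_product([range(1, 6), range(9, 13)], names=["Month", "Hour"])
df = pd.DataFrame({"Hi": range(len(index))}, index=index)

levels = df.index.levels[0]
ncols = 4                               # assumed grid width
nrows = math.ceil(len(levels) / ncols)  # enough rows to fit every month

# squeeze=False keeps `axes` a 2-D array even when nrows or ncols is 1.
fig, axes = plt.subplots(nrows, ncols, figsize=(3 * ncols, 3 * nrows), squeeze=False)

# axes.flat walks the grid row by row, so zip pairs each month with one cell.
for level, ax in zip(levels, axes.flat):
    df.loc[level].plot(ax=ax, title=str(level))

# Hide the cells left over when the grid has more cells than months.
for ax in axes.flat[len(levels):]:
    ax.set_visible(False)

plt.tight_layout()
```

Iterating over `axes.flat` matters here because zipping over a 2-D `axes` array directly would yield whole rows of axes rather than individual ones.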

Related

Matplotlib Plot X, Y Line Plot Multiple Columns Fixed X Axis

I'm trying to plot a df with the x axis forced to 12, 1, 2 (Dec, Jan, Feb), and I cannot see how to do this. Matplotlib keeps plotting the x axis in 1, 2, 12 order. My DataFrame (analogs_re), with only some of its columns shown for this example, looks like this:
Month 2000 1995 2009 2014 1994 2003
0 12 -0.203835 0.580590 0.233124 0.490193 0.605808 0.016756
1 1 -0.947029 -1.239794 -0.977004 0.207236 0.436458 -0.501948
2 2 -0.059957 0.708626 0.111840 0.422534 1.051873 -0.149000
I need the y data plotted with the x axis in 12, 1, 2 order, as shown in the 'Month' column.
My code:
fig, ax = plt.subplots()
#for name, group in groups:
analogs_re.set_index('Month').plot(figsize=(10,5),grid=True)
analogs_re.plot(x='Month', y=analogs_re.columns[1:len(analogs_re.columns)])
When you set Month as the x-axis, it is plotted in numerical order, because a numeric axis cannot run 12, then 1, then 2.
The trick is to use the original index (0, 1, 2) as the x-axis, then label those ticks with the month numbers:
fig, ax = plt.subplots()
analogs_re.drop(columns='Month').plot(figsize=(10,5), grid=True, ax=ax)
ax.set_xticks(analogs_re.index)
ax.set_xticklabels(analogs_re["Month"])

How to plot in pandas - Different x and different y axis in a same plot

I want to plot different values of x and y-axis from different CSVs into a simple plot.
csv1:
Time Buff
1 5
2 10
3 15
csv2:
Time1 Buff1
2 3
4 6
5 9
I have 5 different CSVs. I tried concatenating the dataframes into a single frame and plotting it, but I was only able to plot against one x-axis:
df = pd.read_csv('csv1.txt')
df1 = pd.read_csv('csv2.txt')
join = pd.concat([df, df1], axis=1)
join.plot(x='Time', y=['Buff', 'Buff1'], kind='line')
join.plot(x='Time', y='Buff', x='Time1', y='Buff1') #doesn't work
I end up with a plot that uses only one x-axis (the one from csv1). How can I plot both the x and y columns from each CSV in the same plot?
You can plot two dataframes on the same axes if you pass those axes with ax=. Notice that I created the figure and axes using subplots before I plotted either of the dataframes.
import pandas as pd
import matplotlib.pyplot as plt
f,ax = plt.subplots()
df = pd.DataFrame({'Time':[1,2,3],'Buff':[5,4,3]})
df1 = pd.DataFrame({'Time1':[2,3,4],'Buff1':[5,7,8]})
df.plot(x='Time',y='Buff',ax=ax)
df1.plot(x='Time1',y='Buff1',ax=ax)
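Since the question mentions five CSVs, the same `ax=` idea can be put in a loop. This is a sketch under assumptions: the files are simulated with `io.StringIO`, they are assumed to be comma-separated, and the first two columns of each file are assumed to be x and y. The names `sources`, `x_col`, and `y_col` are illustrative only.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Stand-ins for the CSV files; in practice these would be file paths.
sources = {
    "csv1": io.StringIO("Time,Buff\n1,5\n2,10\n3,15\n"),
    "csv2": io.StringIO("Time1,Buff1\n2,3\n4,6\n5,9\n"),
}

fig, ax = plt.subplots()
for name, source in sources.items():
    frame = pd.read_csv(source)
    x_col, y_col = frame.columns[:2]  # assumption: first column is x, second is y
    # Every frame draws on the same axes, each with its own x values.
    frame.plot(x=x_col, y=y_col, ax=ax, label=name)
```

Each file keeps its own x values this way, so no concatenation is needed.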

plot score against timestamp in pandas

I have a dataframe in pandas:
date_hour score
2019041822 -5
2019041823 0
2019041900 6
2019041901 -5
where date_hour is in YYYYMMDDHH format, and score is an int.
When I plot, there is a long line connecting 2019041823 to 2019041900, as if all the values in between were missing (i.e. there is no score for 2019041824-2019041899, because no such times exist).
Is there a way for these gaps/absent values to be ignored, so that the plot is continuous? (Some of my data misses 2 days, so I get a long line which is misleading.)
The red circles show the gap between nights (ie. between Apr 18 2300 and Apr 19 0000).
I used:
fig, ax = plt.subplots()
x=gpb['date_hour']
y=gpb['score']
ax.plot(x,y, '.-')
display(fig)
I believe it is because date_hour is an int. I tried to convert it to str, but was met with an error: ValueError: x and y must have same first dimension
Is there a way to plot so there are no gaps?
Try converting date_hour to timestamps before plotting: df.date_hour = pd.to_datetime(df.date_hour, format='%Y%m%d%H').
df = pd.DataFrame({'date_hour':[2019041822, 2019041823, 2019041900, 2019041901],
                   'score':[-5,0,6,-5]})
df.date_hour = pd.to_datetime(df.date_hour, format='%Y%m%d%H')
df.plot(x='date_hour', y='score')
plt.show()
If you don't want to change your data, you can convert only at plot time:
df = pd.DataFrame({'date_hour':[2019041822, 2019041823, 2019041900, 2019041901],
                   'score':[-5,0,6,-5]})
plt.plot(pd.to_datetime(df.date_hour, format='%Y%m%d%H'), df.score)
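To see why the conversion helps, compare the step sizes before and after it. This small check is a sketch, not part of the original answer. It casts the integers to strings before parsing, which is an extra precaution added here.

```python
import pandas as pd

date_hour = pd.Series([2019041822, 2019041823, 2019041900, 2019041901])

# As plain integers, the step across midnight is 77, not 1.
int_steps = date_hour.diff().dropna().tolist()

# As timestamps, every step is exactly one hour.
timestamps = pd.to_datetime(date_hour.astype(str), format="%Y%m%d%H")
hour_steps = (timestamps.diff().dropna() / pd.Timedelta(hours=1)).tolist()

print(int_steps)   # [1.0, 77.0, 1.0]
print(hour_steps)  # [1.0, 1.0, 1.0]
```

Note that timestamps make the spacing proportional to real time, so a stretch with genuinely missing days will still be drawn as one long segment.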

how to plot a dataframe with two different axes in pandas matplotlib

So my data frame is like this:
6month final-formula numPatients6month
160243.0 1 0.401193 417
172110.0 2 0.458548 323
157638.0 3 0.369403 268
180306.0 4 0.338761 238
175324.0 5 0.247011 237
170709.0 6 0.328555 218
195762.0 7 0.232895 190
172571.0 8 0.319588 194
172055.0 9 0.415517 145
174609.0 10 0.344697 132
174089.0 11 0.402965 106
196130.0 12 0.375000 80
and I am plotting the final-formula column against 6month:
import matplotlib.pyplot as plt
dffinal.plot(kind='bar',x='6month', y='final-formula')
plt.show()
So far this is fine: it shows 6month on the x axis and final-formula on the y axis.
What I want is to show numPatients6month in the same plot, but on another y axis.
As in the diagram below, I want to show numPatients6month at position 1, or simply show that number above each bar.
I tried to do that with twinx, but it seems to be meant for the case where we have two plots and want to draw them in the same figure.
fig = plt.figure()
ax = fig.add_subplot(111)
ax2 = ax.twinx()
ax.set_ylabel('numPatients6month')
I appreciate your help :)
This is the solution that resolved it. I am sharing it here in case it helps someone :)
import matplotlib.pyplot as plt
ax=dffinal.plot(kind='bar',x='6month', y='final-formula')
ax2 = ax.twinx()
ax2.spines['right'].set_position(('axes', 1.0))
dffinal.plot(ax=ax2,x='6month', y='numPatients6month')
plt.show()
Store the AxesSubplot in a variable called ax
ax = dffinal.plot(kind='bar',x='6month', y='final-formula')
and then
ax.tick_params(labeltop=False, labelright=True)
This will bring the tick labels to the right as well.
Is this enough, or would you like to also know how to add values to the top of the bars? Because your question indicated, one of the two would satisfy.
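For the other option the question mentions, showing the number above each bar, one possible sketch uses matplotlib's `Axes.bar_label` (available from matplotlib 3.4). The frame below holds only the first four rows of the question's data, and the `Agg` backend line is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# First four rows of the question's data.
dffinal = pd.DataFrame({
    "6month": [1, 2, 3, 4],
    "final-formula": [0.401193, 0.458548, 0.369403, 0.338761],
    "numPatients6month": [417, 323, 268, 238],
})

ax = dffinal.plot(kind="bar", x="6month", y="final-formula")

# A single-column bar plot has one container holding all its bars.
bars = ax.containers[0]

# Write the patient count above each bar instead of the bar's own height.
labels = dffinal["numPatients6month"].astype(str).tolist()
texts = ax.bar_label(bars, labels=labels)
```

This keeps a single y axis, which avoids having to line up a second set of axes with the bar positions.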

Plot multiple lines in a line graph using matplotlib

I am trying to plot a line graph with several lines in it, one for each group.
X axis would be the hour and y axis would be the count.
Since there are 3 groups in the dataframe, I will have 3 lines in a single line graph.
This is the data and the code I have used, but I am not sure where I am going wrong.
Group Hour Count
G1 1 40
G2 1 300
G1 2 400
G2 2 80
G3 2 1211
Code used:
fig, ax = plt.subplots()
labels = []
for key, grp in df1.groupby(['Group']):
    ax = grp.plot(ax=ax, kind='line', x='x', y='y', c=key)
    labels.append(key)
lines, _ = ax.get_legend_handles_labels()
ax.legend(lines, labels, loc='best')
plt.show()
You can use df.pivot and save yourself some lines (recent pandas versions require keyword arguments here):
df.pivot(index='Hour', columns='Group', values='Count').plot(kind='line', marker='o')
G3 is plotted as a point because there is only one point (2 hrs, 1211 count) associated with it.
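If you prefer to keep the groupby loop, the following sketch shows one way it could be made to work. It assumes the loop in the question fails because x='x' and y='y' do not name real columns and because c=key passes a group name where a color is expected. The `Agg` backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df1 = pd.DataFrame({
    "Group": ["G1", "G2", "G1", "G2", "G3"],
    "Hour": [1, 1, 2, 2, 2],
    "Count": [40, 300, 400, 80, 1211],
})

fig, ax = plt.subplots()
# Group by a plain column name (not a one-element list) so `key` is the group name.
for key, grp in df1.groupby("Group"):
    # Use the real column names and let matplotlib choose the colors.
    grp.plot(ax=ax, kind="line", x="Hour", y="Count", marker="o", label=key)

ax.legend(loc="best")
```

Passing label=key lets the legend pick up the group names, so the manual labels list is no longer needed.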