I'm trying to plot time series data in matplotlib using a for loop. The goal is to dynamically plot 'n' years' worth of daily closing price data: if I load 7 years of data, I should get 7 unique plots. I have created a summary of the start and end dates for the data set, yearly_date_ranges (Date is the index), and I use it to populate the start and end dates. The code I've written so far produces 7 plots of all the daily data instead of 7 unique plots, one for each year. Any help is appreciated. Thanks in advance!
yearly_date_ranges
           start         end
Date
2014  2014-04-01  2014-12-31
2015  2015-01-01  2015-12-31
2016  2016-01-01  2016-12-31
2017  2017-01-01  2017-12-31
2018  2018-01-01  2018-12-31
2019  2019-01-01  2019-12-31
2020  2020-01-01  2020-05-28
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure(figsize=(12,20))
for i in range(len(yearly_date_ranges)):
    ax = fig.add_subplot(len(yearly_date_ranges),1,i + 1)
    for row in yearly_date_ranges.itertuples(index=False):
        start = row.start
        end = row.end
        subset = data[start:end]
        ax.plot(subset['Close'])
plt.show()
To do this dynamically, group the data by year and plot each group on its own axes:
fig, axes = plt.subplots(7,1, figsize=(12,20))
years = data.index.year
for ax, (k,d) in zip(axes.ravel(), data.groupby(years)):
    d.plot(y='Close', ax=ax)
This worked! Thank you for the help.
fig, axes = plt.subplots(7,1, figsize=(12,20))
years = data.index.year
for ax, (k,d) in zip(axes.ravel(), data['Close'].groupby(years)):
    # d is a Series here, so no x or y argument is needed
    d.plot(ax=ax)
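The nested loop in the question draws every yearly range on every axes, because the inner loop runs in full for each subplot. A minimal sketch that pairs each axes with exactly one year, and sizes the grid from the data rather than hard-coding 7, might look like the following. The synthetic `data` frame is an assumption standing in for the real price data (a DatetimeIndex with a 'Close' column).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a plain script; drop in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for `data` (assumption: DatetimeIndex and a 'Close' column)
idx = pd.date_range("2014-04-01", "2020-05-28", freq="D")
data = pd.DataFrame({"Close": np.linspace(100.0, 200.0, len(idx))}, index=idx)

# One group per calendar year; the number of subplots follows the data
groups = list(data.groupby(data.index.year))
fig, axes = plt.subplots(len(groups), 1, figsize=(12, 20), squeeze=False)

# zip pairs each axes with a single year's rows, so nothing is drawn twice
for ax, (year, yearly) in zip(axes.ravel(), groups):
    ax.plot(yearly.index, yearly["Close"])
    ax.set_title(str(year))

fig.tight_layout()
```

With `squeeze=False`, `axes` stays a 2-D array even when only one year is loaded, so `axes.ravel()` works for any number of years.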
Related
apple is a dataframe whose data structure is as the below:
apple
Date Open High Low Close Adj Close
0 2017-01-03 115.800003 116.330002 114.760002 116.150002 114.311760
1 2017-01-04 115.849998 116.510002 115.750000 116.019997 114.183815
2 2017-01-05 115.919998 116.860001 115.809998 116.610001 114.764473
3 2017-01-06 116.779999 118.160004 116.470001 117.910004 116.043915
4 2017-01-09 117.949997 119.430000 117.940002 118.989998 117.106812
5 2017-01-10 118.769997 119.379997 118.300003 119.110001 117.224907
6 2017-01-11 118.739998 119.930000 118.599998 119.750000 117.854782
7 2017-01-12 118.900002 119.300003 118.209999 119.250000 117.362694
8 2017-01-13 119.110001 119.620003 118.809998 119.040001 117.156021
9 2017-01-17 118.339996 120.239998 118.220001 120.000000 118.100822
Now I want to select two columns, Date and Close, with Date on the x axis and Close on the y axis. How do I plot that?
import pandas as pd
import matplotlib.pyplot as plt
x=pd.DataFrame({'key':apple['Date'],'data':apple['Close']})
x.plot()
plt.show()
I got a graph, but the x axis is not the Date column!
A new DataFrame is not necessary; plot apple directly and pass the x and y parameters:
# if Date is not already a datetime column, convert it first
#apple['Date'] = pd.to_datetime(apple['Date'])
apple.plot(x='Date', y='Close')
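As a self-contained sketch of the same idea, the snippet below rebuilds three rows of the sample data (Date as strings, which is an assumption about how the column was loaded), converts the column to datetime, and plots Close against Date.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a plain script; drop in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# A few rows copied from the sample data; Date starts out as plain strings (assumption)
apple = pd.DataFrame({
    "Date": ["2017-01-03", "2017-01-04", "2017-01-05"],
    "Close": [116.150002, 116.019997, 116.610001],
})

# Convert to datetime so the x axis is treated as dates rather than text
apple["Date"] = pd.to_datetime(apple["Date"])

# x and y select the columns directly; no intermediate DataFrame is needed
ax = apple.plot(x="Date", y="Close")
```

Call `plt.show()` afterwards (with an interactive backend) to display the figure.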
I have a dataframe that looks like below, the date is the index. How would I plot a time series showing a line for each of the years? I have tried df.plot(figsize=(15,4)) but this gives me one line.
Date Value
2008-01-31 22
2008-02-28 17
2008-03-31 34
2008-04-30 29
2009-01-31 33
2009-02-28 42
2009-03-31 45
2009-04-30 39
2019-01-31 17
2019-02-28 12
2019-03-31 11
2019-04-30 12
2020-01-31 24
2020-02-28 34
2020-03-31 43
2020-04-30 45
You can just do a groupby using year.
df = pd.read_clipboard()
df = df.set_index(pd.DatetimeIndex(df['Date']))
df.groupby(df.index.year)['Value'].plot()
If you want to treat each year as its own series and compare the years day by day:
import pandas as pd
import matplotlib.pyplot as plt
# Create a date column from the index (easier to manipulate)
df["date_column"] = pd.to_datetime(df.index)
# Create a year column
df["year"] = df["date_column"].dt.year
# Create a month-day column
df["month_day"] = (df["date_column"].dt.month).astype(str).str.zfill(2) + \
    "-" + df["date_column"].dt.day.astype(str).str.zfill(2)
# Plot. Pivot creates one column per year, and each column is drawn as its own series.
# pandas 2.0 and later accept only keyword arguments for pivot.
df.pivot(index='month_day', columns='year', values='Value').plot(kind='line', figsize=(12, 8), marker='o')
plt.title("Values per Month-Day - Year comparison", y=1.1, fontsize=14)
plt.xlabel("Month-Day", labelpad=12, fontsize=12)
plt.ylabel("Value", labelpad=12, fontsize=12)
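To see what the pivot step produces before plotting, here is a small sketch using the 2008 and 2009 rows from the question. It builds the month-day key with `strftime("%m-%d")`, which is a shortcut I am assuming is equivalent to the zero-padded string concatenation above.

```python
import pandas as pd

# The 2008 and 2009 rows from the question, with the date as the index
df = pd.DataFrame(
    {"Value": [22, 17, 34, 29, 33, 42, 45, 39]},
    index=pd.to_datetime([
        "2008-01-31", "2008-02-28", "2008-03-31", "2008-04-30",
        "2009-01-31", "2009-02-28", "2009-03-31", "2009-04-30",
    ]),
)

df["year"] = df.index.year
# Zero-padded "MM-DD" key shared by the same calendar day in every year
df["month_day"] = df.index.strftime("%m-%d")

# Rows become month-day keys, columns become years: one series per year
wide = df.pivot(index="month_day", columns="year", values="Value")
print(wide)
```

Calling `wide.plot(marker="o")` then draws one line per year over a shared month-day axis.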
I'm new to pandas and DataFrames in general. I would like to know how I could convert my current code to use plt.figure instead. I want to plot 2 columns (Tourism Receipts, Visitors) as lines, with another column (Quarters) as the x axis.
The code below seems to work, but I would like to know whether there is a better way to do it, such as plt.plot, that still lets me set Quarters as the x axis and the other 2 columns as lines.
df1= df.set_index('Quarters').plot(figsize=(10,5), grid=True)
Dataframe (from my csv file):
| Quarters | Tourism Receipts | Visitors |
| 2019 Q1 | 10 | 1 |
| 2019 Q2 | 20 | 2 |
| 2019 Q3 | 30 | 3 |
| 2019 Q4 | 40 | 4 |
I understand the following method:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(20,10))
plt.plot(x,y)
plt.title
plt.xlabel
plt.ylabel
Is there a way to convert the 'df.set_index' approach to plt instead?
You can actually combine both: use the pandas .plot method, which saves a lot of effort, and use matplotlib features alongside it to customize the output.
Here is some sample code showing how to do this:
from matplotlib import pyplot as plt
import pandas as pd
fig, ax = plt.subplots(1, figsize=(10, 10))
df.set_index('Quarters')[['Tourism Receipts', 'Visitors']].plot(figsize=(10,5), grid=True, ax=ax)
ax.set_yticks(range(-10, 41, 5))
# ax.set_yticklabels( ('{}%'.format(x) for x in range(0, 101, 10)), fontsize=15)
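# With the sample data, Quarters already holds labels such as '2019 Q1',
# so place one tick per row and reuse the column as the tick labels
ax.set_xticks(range(len(df)))
ax.set_xticklabels(df['Quarters'])
ax.legend(loc='lower left')
You can do the same for the y ticks as well.
PS: If df.Quarters held only the quarter number without the year, you could build the labels with "{} Q{}".format('2019', x) instead, assuming the year is 2019.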
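For the pure matplotlib route the question asks about, a minimal sketch could call `ax.plot` once per column and pass the Quarters column as x. The DataFrame below is rebuilt from the sample table, and the axis label and title text are my own placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a plain script; drop in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Rebuilt from the sample table in the question
df = pd.DataFrame({
    "Quarters": ["2019 Q1", "2019 Q2", "2019 Q3", "2019 Q4"],
    "Tourism Receipts": [10, 20, 30, 40],
    "Visitors": [1, 2, 3, 4],
})

fig, ax = plt.subplots(figsize=(10, 5))

# matplotlib treats string x values as categories, so Quarters can be passed as is
for column in ["Tourism Receipts", "Visitors"]:
    ax.plot(df["Quarters"], df[column], label=column)

ax.set_xlabel("Quarters")           # placeholder label text
ax.set_title("Tourism by quarter")  # placeholder title text
ax.grid(True)
ax.legend()
```

This gives the same two lines as `df.set_index('Quarters').plot(...)`, at the cost of one explicit `ax.plot` call per column.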
I create a bar plot like this:
But since each x-axis label is one day of January (for example 1, 3, 4, 5, 7, 8, etc.), I think the best way of showing this is something like
__________________________________________________ x axis
Jan 1 3 4 5 7 8 ...
2019
But I don't know how to do this with pandas.
Here is my code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_excel('solved.xlsx', sheet_name="Jan")
fig, ax = plt.subplots()
df_plot=df.plot(x='Date', y=['Solved', 'POT'], ax=ax, kind='bar',
                width=0.9, color=('#ffc114','#0098c9'), label=('Solved','POT'))
def line_format(label):
    """
    Convert time label to the format of pandas line plot
    """
    month = label.month_name()[:3]
    if month == 'Jan':
        month += f'\n{label.year}'
    return month
ax.set_xticklabels(map(lambda x: line_format(x), df.Date))
The function was a solution provided here: Pandas bar plot changes date format
I don't know how to modify it to get the axis I want.
My data example solved.xlsx:
Date Q A B Solved POT
2019-01-04 Q4 11 9 14 5
2019-01-05 Q4 9 11 14 5
2019-01-08 Q4 11 18 10 6
2019-01-09 Q4 18 19 18 5
I have found a solution:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_excel('solved.xlsx', sheet_name="Jan")
fig, ax = plt.subplots()
df_plot=df.plot(x='Date', y=['Solved', 'POT'], ax=ax, kind='bar',
                width=0.9, color=('#ffc114','#0098c9'), label=('Solved','POT'))
def line_format(label):
    """
    Convert time label to the format of pandas line plot
    """
    day = label.day
    if day == 2:
        day = str(day) + f'\n{label.month_name()[:3]}'
    return day
ax.set_xticklabels(map(lambda x: line_format(x), df.Date))
plt.show()
In my particular case I didn't have the date 2019-01-01, so the first day for me was Jan 2.
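The check `day == 2` only works when the first date in the data happens to be the 2nd. A possible generalization, sketched below, attaches the month and year to whichever tick comes first. The helper name `day_labels` is hypothetical, and the dates are taken from the sample data in the question.

```python
import pandas as pd

def day_labels(dates):
    """Return one label per date: the day number, with the month
    abbreviation and year added under the first label only."""
    labels = []
    for i, ts in enumerate(dates):
        label = str(ts.day)
        if i == 0:
            # Only the first tick carries the month and year
            label += f"\n{ts.month_name()[:3]}\n{ts.year}"
        labels.append(label)
    return labels

# Dates from the sample data in the question
dates = pd.to_datetime(["2019-01-04", "2019-01-05", "2019-01-08", "2019-01-09"])
labels = day_labels(dates)
print(labels)
```

With the bar plot from the question, this would be used as `ax.set_xticklabels(day_labels(df['Date']))`, assuming the Date column holds datetime values.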
I am plotting a graph of value against time. I want the time labels to be shown in full, such as 2018-06-30 18:35:45, as in the main CSV data.
Instead, the x axis shows the time as 06:30 20.
How can I make the x-axis labels match the format in the main data exactly?
The code I used is:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('Mlogi_ALL_idle_day.csv')
df['time']=pd.to_datetime(df['time'], unit='ns')
x = df['time']
y = df['d']
fig, ax = plt.subplots()
ax.plot_date(x, y, linestyle='-')
plt.title('Idle for 1 July - 2 July 2018')
plt.legend()
plt.ylabel('duration')
plt.xlabel('Time')
fig.autofmt_xdate()
plt.show()
And the data look like:
In[10]:df.head()
Out[10]:
time d
0 2018-06-30 18:35:45 41000000000
1 2018-06-30 18:36:47 44000000000
2 2018-06-30 18:37:46 43000000000
3 2018-06-30 18:38:46 40000000000
4 2018-06-30 18:39:47 43000000000
To change the date label format of the major x-axis ticks, import matplotlib.dates and set a DateFormatter:
import matplotlib.dates as mdates
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d %H:%M:%S'))
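Put together with the plotting code from the question, a minimal sketch might look like this. The DataFrame is rebuilt from the `df.head()` output rather than read from the CSV, and `ax.plot` is used in place of `plot_date`, which is my substitution and not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a plain script; drop in a notebook
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Rebuilt from the df.head() output in the question
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2018-06-30 18:35:45", "2018-06-30 18:36:47", "2018-06-30 18:37:46",
        "2018-06-30 18:38:46", "2018-06-30 18:39:47",
    ]),
    "d": [41000000000, 44000000000, 43000000000, 40000000000, 43000000000],
})

fig, ax = plt.subplots()
ax.plot(df["time"], df["d"], linestyle="-")

# Show full timestamps on the major ticks, matching the CSV format
formatter = mdates.DateFormatter("%Y-%m-%d %H:%M:%S")
ax.xaxis.set_major_formatter(formatter)

ax.set_xlabel("Time")
ax.set_ylabel("duration")
fig.autofmt_xdate()  # rotate the long labels so they do not overlap
```

The format string follows the usual `strftime` codes, so it can be shortened (for example to `'%H:%M:%S'`) if the full timestamp is too wide.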