Calling specific dates and values in stock prices - pandas

How do I select the dates and prices for a given stock?
Example: I want to get the stock price and date for Apple shares only for Dec 2016 and Dec 2017.
Here is what I've tried:
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
import pandas_datareader.data as web
import numpy as np
from matplotlib import style
import matplotlib.pyplot as plt
import datetime as dt
start = dt.datetime(2013,10,1)
end= dt.datetime(2018,4,30)
AAPL_data=[]
AAPL= web.DataReader('AAPL','iex', start, end)
AAPL_data.append(AAPL)

AAPL.loc['2016-12-01':'2016-12-31']
AAPL.loc['2017-12-01':'2017-12-31']
Both work for me and return the December 2016 and December 2017 data respectively. Pandas datetime indexing is very flexible and intuitive; see more examples here.
Incidentally, if you're using the new release of pandas-datareader (0.7.0), you no longer need to insert pd.core.common.is_list_like = pd.api.types.is_list_like.
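As a rough sketch of the same idea, the snippet below uses synthetic daily prices in place of the downloaded AAPL frame (the `prices` frame and its `close` column are made up for illustration). It assumes the index is a DatetimeIndex; if the data source returns string dates, converting with `pd.to_datetime` first should make this work.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices standing in for the downloaded frame (an assumption).
idx = pd.date_range("2013-10-01", "2018-04-30", freq="D")
prices = pd.DataFrame({"close": np.linspace(100.0, 200.0, len(idx))}, index=idx)

# Partial-string indexing: a "YYYY-MM" key selects the whole month.
dec_2016 = prices.loc["2016-12"]
dec_2017 = prices.loc["2017-12"]
both = pd.concat([dec_2016, dec_2017])

# Equivalent boolean mask, handy when the months of interest vary.
mask = (prices.index.month == 12) & prices.index.year.isin([2016, 2017])
same = prices[mask]
```

The mask form generalizes more easily, for example to every December in the data, by dropping the year condition.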

Related

Plotting Closing price of SBIN NSE but it is plotting 3:30pm-9:15am also

The dataframe only has times from 9:15 am to 3:30 pm on each working day, but when it is plotted, matplotlib also draws the interval between 3:30 pm and 9:15 am the next day. How can I avoid this?
I can't figure out how to get a continuous figure. Here is the CSV.
I tried using
import matplotlib.pyplot as plt
import pandas as pd
#data = the read file in the link
data = pd.read_csv('sbin.csv')
plt.plot(data['MA_50'], label='MA 50', color='red')
plt.plot(data['MA_10'], label='MA 10', color='blue')
plt.legend(loc='best')
plt.xlim(data.index[0], data.index[-1])
plt.xlabel('Time')
plt.ylabel('Price')
plt.show()
I expect 9:15 to follow directly after 3:30.
Have you tried using mplfinance ?
Using the data you posted:
import mplfinance as mpf
import pandas as pd
df = pd.read_csv('sbin.csv', index_col=0, parse_dates=True)
mpf.plot(df, type='candle', ema=(10,50), style='yahoo')
The result:
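If adding mplfinance is not an option, a common matplotlib-only workaround is to plot against integer row positions and relabel the ticks with the timestamps, so the overnight gap takes up no space. The sketch below uses synthetic 15-minute bars rather than the posted CSV; the column name `MA_10` is borrowed from the question, everything else is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Two synthetic trading sessions, 09:15 to 15:30, in 15-minute steps.
sessions = [
    pd.date_range(f"2021-01-0{d} 09:15", f"2021-01-0{d} 15:30", freq="15min")
    for d in (4, 5)
]
idx = sessions[0].append(sessions[1])
data = pd.DataFrame({"MA_10": np.linspace(280.0, 290.0, len(idx))}, index=idx)

# Plot against 0..n-1 instead of the timestamps, so closed hours are skipped.
positions = np.arange(len(data))
fig, ax = plt.subplots()
ax.plot(positions, data["MA_10"], label="MA 10", color="blue")

# Label a handful of positions with the timestamps they correspond to.
step = max(len(data) // 6, 1)
ax.set_xticks(positions[::step])
ax.set_xticklabels(list(data.index[::step].strftime("%d %b %H:%M")), rotation=45)
ax.legend(loc="best")
fig.tight_layout()
```

The trade-off is that the x axis is no longer a true time axis, so spacing reflects the number of bars rather than elapsed time.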

How to display negative x values on the left side for barplot?

I would like to ask a question regarding seaborn's barplot.
I have a dataset returned from BigQuery and converted to a dataframe as below.
Sample data from `df.sort_values(by=['dep_delay_in_minutes']).to_csv(csv_file)`
,dep_delay_in_minutes,arrival_delay_in_minutes,numflights
1,-50.0,-38.0,2
2,-49.0,-59.5,4
3,-46.0,-28.5,4
4,-45.0,-44.0,4
5,-43.0,-53.0,4
6,-42.0,-35.0,6
7,-40.0,-26.0,4
8,-39.0,-33.5,4
9,-38.0,-21.5,4
10,-37.0,-37.666666666666664,12
11,-36.0,-35.0,2
12,-35.0,-32.57142857142857,14
13,-34.0,-30.0,18
14,-33.0,-26.200000000000003,10
15,-32.0,-34.8,10
16,-31.0,-28.769230769230766,26
17,-30.0,-34.93749999999999,32
18,-29.0,-31.375000000000004,48
19,-28.0,-24.857142857142854,70
20,-27.0,-28.837209302325583,86
I wrote the code below, but the negative values are plotted on the right-hand side.
import matplotlib.pyplot as plt
import seaborn as sb
import pandas as pd
import numpy as np
import google.datalab.bigquery as bq
import warnings
# Disable warnings
warnings.filterwarnings('ignore')
sql="""
SELECT
DEP_DELAY as dep_delay_in_minutes,
AVG(ARR_DELAY) AS arrival_delay_in_minutes,
COUNT(ARR_DELAY) AS numflights
FROM flights.simevents
GROUP BY DEP_DELAY
ORDER BY DEP_DELAY
"""
df = bq.Query(sql).execute().result().to_dataframe()
df = df.sort_values(['dep_delay_in_minutes'])
ax = sb.barplot(data=df, x='dep_delay_in_minutes', y='numflights', order=df['dep_delay_in_minutes'])
ax.set_xlim(-50, 0)
How can I display the x axis in numeric order with negative values on the left-hand side?
I would appreciate any advice.
Specifying both the left and right limits with ax.set_xlim() does not work here. The plot displayed correctly when only one limit was specified.
import matplotlib.pyplot as plt
import seaborn as sb
import pandas as pd
import numpy as np
df = df.sort_values(['dep_delay_in_minutes'])
ax = sb.barplot(x='dep_delay_in_minutes', y='numflights', data=df, order=df['dep_delay_in_minutes'])
ax.set_xlim(0.0)
labels = ax.get_xticklabels()
plt.setp(labels, rotation=45)
plt.show()
A different notation was also possible.
ax.set_xlim(0.0,)
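One likely reason the two-sided limit fails is that seaborn's barplot treats x as categorical and draws the bars at positions 0, 1, 2, ..., so set_xlim(-50, 0) refers to bar positions rather than delay values. If a truly numeric axis is wanted, a minimal sketch with matplotlib's own bar function (a swapped-in technique, not the approach above) could look like this, using the first rows of the sample data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# First few rows of the sample data from the question.
df = pd.DataFrame({
    "dep_delay_in_minutes": [-50.0, -49.0, -46.0, -45.0, -43.0],
    "numflights": [2, 4, 4, 4, 4],
})

fig, ax = plt.subplots()
# ax.bar places each bar at its numeric x value, so negatives fall on the left.
ax.bar(df["dep_delay_in_minutes"], df["numflights"], width=0.8)
ax.set_xlim(-51, 0)  # limits are now in minutes of delay
ax.set_xlabel("dep_delay_in_minutes")
ax.set_ylabel("numflights")
```

With a numeric axis, missing delay values (such as -48 and -47 here) show up as gaps instead of being squeezed together.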

Matplotlib float values on the axis instead of integers

I have the following code, which produces the plot below. I can't get the fiscal year to display correctly on the x axis; the values are shown as floats. I tried astype(int) and it didn't work. Any ideas on what I am doing wrong?
p1 = plt.bar(list(asset['FISCAL_YEAR']),list(asset['TOTAL']),align='center')
plt.show()
This is the plot:
In order to make sure only integer locations obtain a ticklabel, you may use a matplotlib.ticker.MultipleLocator with an integer number as argument.
To then format the numbers on the axes, you may use a matplotlib.ticker.StrMethodFormatter.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
df = pd.DataFrame({"FISCAL_YEAR" : np.arange(2000,2017),
'TOTAL' : np.random.rand(17)})
plt.bar(df['FISCAL_YEAR'],df['TOTAL'],align='center')
locator = matplotlib.ticker.MultipleLocator(2)
plt.gca().xaxis.set_major_locator(locator)
formatter = matplotlib.ticker.StrMethodFormatter("{x:.0f}")
plt.gca().xaxis.set_major_formatter(formatter)
plt.show()
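When the tick spacing should be chosen automatically rather than fixed at 2, matplotlib.ticker.MaxNLocator with integer=True is another option. This is a small sketch on the same random data as above, not part of the original answer:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import MaxNLocator

df = pd.DataFrame({"FISCAL_YEAR": np.arange(2000, 2017),
                   "TOTAL": np.random.rand(17)})

fig, ax = plt.subplots()
ax.bar(df["FISCAL_YEAR"], df["TOTAL"], align="center")
# Let matplotlib pick the spacing, but only at integer positions.
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
fig.canvas.draw()  # force tick computation
ticks = ax.get_xticks()
```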

How can I convert pandas date time xticks to readable format?

I am plotting a time series with a datetime index. The plot needs to be a particular size for the journal format. Consequently, the xticks are not readable since they span many years.
Here is a data sample
2013-02-10 0.7714492098202259
2013-02-11 0.7709101833765016
2013-02-12 0.7704911332770049
2013-02-13 0.7694975914173087
2013-02-14 0.7692108921323576
The data is a series with a datetime index and spans from 2013 to 2016. I use
data.plot(ax = ax)
to plot the data.
How can I format my xticks to read like '13 instead of 2013?
It seems there is some incompatibility between pandas and matplotlib formatters/locators when it comes to dates. See e.g. those questions:
Pandas plot - modify major and minor xticks for dates
Pandas Dataframe line plot display date on xaxis
I'm not entirely sure why it still works in some cases to use matplotlib formatters and not in others. However because of those issues, the bullet-proof solution is to use matplotlib to plot the graph instead of the pandas plotting function.
This allows locators and formatters to be used just as in the matplotlib example.
Here, the solution to the question would look as follows:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
dates = pd.date_range("2013-01-01", "2017-06-20" )
y = np.cumsum(np.random.normal(size=len(dates)))
s = pd.Series(y, index=dates)
fig, ax = plt.subplots()
ax.plot(s.index, s.values)
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator())
yearFmt = mdates.DateFormatter("'%y")
ax.xaxis.set_major_formatter(yearFmt)
plt.show()
According to this example, you can do the following
import matplotlib.dates as mdates
yearsFmt = mdates.DateFormatter("'%y")
years = mdates.YearLocator()
ax = df.plot()
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
Full working example below.
Add the word "value" as a header so pd.read_clipboard puts the dates into the index:
value
2013-02-10 0.7714492098202259
2014-02-11 0.7709101833765016
2015-02-12 0.7704911332770049
2016-02-13 0.7694975914173087
2017-02-14 0.7692108921323576
Then read in the data and convert the index:
df = pd.read_clipboard(sep=r'\s+')
df.index = pd.to_datetime(df.index)

Pandas, Bokeh, or using any plotting library for shifting the x-axis for seasonal data (months 7 -> 12 -> 6 or July 01 - June 30)

I want to display seasonal snow data for the seasonal year from July 01 - June 30.
df = pd.DataFrame({'date1':['1954-03-20','1955-02-23','1956-01-01','1956-11-21','1958-01-07'],
'date2':['1954-03-25','1955-02-26','1956-02-11','1956-11-30','1958-01-17']},
index=['1954','1955','1956','1957','1958'])
It is an extension of my previous question, Pandas: Visualizing Changes in Event Dates for Multiple Years using Bokeh or any other plotting library.
Scott Boston, in his answer to my comment on that question, suggested using Range1d and modifying the answer in How can I accomplish `set_xlim` or `set_ylim` in Bokeh?. It works for continuous scalars, but I couldn't get it to work with discontinuous ranges like [182:366], [1:181].
Adding x_range=Range1d(182, 366) shows me the first half of the seasonal year, but I can't get the second half of the seasonal year (1, 181).
df['date2'] = pd.to_datetime(df['date2'])
df['date1'] = pd.to_datetime(df['date1'])
df=df.assign(date2_DOY=df.date2.dt.dayofyear)
df=df.assign(date1_DOY=df.date1.dt.dayofyear)
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import FuncTickFormatter, FixedTicker
p1 = figure(plot_width=1000, plot_height=300,x_range=Range1d(180, 366))
p1.circle(df.date1_DOY,df.index, color='red', legend='Date1')
p1.circle(df.date2_DOY,df.index, color='green', legend='Date2')
p1.xaxis[0].ticker=FixedTicker(ticks=[1,32,60,91,121,152,182,213,244,274,305,335,366])
p1.xaxis.formatter = FuncTickFormatter(code="""
var labels = {'1':'Jan',32:'Feb',60:'Mar',91:'Apr',121:'May',152:'Jun',182:'Jul',213:'Aug',244:'Sep',274:'Oct',305:'Nov',335:'Dec',366:'Jan'}
return labels[tick];
""")
show(p1)
#(Code from Scott's answer to my previous question.)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({'date1':['1954-03-20','1955-02-23','1956-01-01','1956-11-21','1958-01-07'],
'date2':['1954-03-25','1955-02-26','1956-02-11','1956-11-30','1958-01-17']},
index=['1954','1955','1956','1957','1958'])
df['date2'] = pd.to_datetime(df['date2'])
df['date1'] = pd.to_datetime(df['date1'])
Adjust the mapping of day of year so that the axis starts on Jun 1st.
df['date2_DOY_map'] = np.where(df['date2'].dt.dayofyear<151,df['date2'].dt.dayofyear-151+365,df['date2'].dt.dayofyear-151)
df['date1_DOY_map'] = np.where(df['date1'].dt.dayofyear<151,df['date1'].dt.dayofyear-151+365,df['date1'].dt.dayofyear-151)
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
Add the Range1d import from bokeh.models
from bokeh.models import FuncTickFormatter, FixedTicker,Range1d
p1 = figure(plot_width=1000, plot_height=300,x_range=Range1d(1, 366))
p1.circle(df.date1_DOY_map,df.index, color='red', legend='Date1')
p1.circle(df.date2_DOY_map,df.index, color='green', legend='Date2')
Fixed x-ticks and labels to match Jun 1st start
p1.xaxis[0].ticker=FixedTicker(ticks=[1,31,62,93,123,154,184,215,246,274,305,335,366])
p1.xaxis.formatter = FuncTickFormatter(code="""
var labels = {'1':'Jun',31:'Jul',62:'Aug',93:'Sep',123:'Oct',154:'Nov',184:'Dec',215:'Jan',246:'Feb',274:'Mar',305:'Apr',335:'May',366:'Jun'}
return labels[tick];
""")
show(p1)
EDIT (oops, I started on the wrong month above)
Pretty easy to fix: we just need to modify the x-axis mapping of dates and redo the ticks and labels.
Use 181 instead of 151 as the offset, because 181 days precede Jul 1st in a non-leap year, just as 151 days precede Jun 1st.
df['date2_DOY_map'] = np.where(df['date2'].dt.dayofyear<181,df['date2'].dt.dayofyear-181+365,df['date2'].dt.dayofyear-181)
df['date1_DOY_map'] = np.where(df['date1'].dt.dayofyear<181,df['date1'].dt.dayofyear-181+365,df['date1'].dt.dayofyear-181)
p1.xaxis[0].ticker=FixedTicker(ticks=[1,32,63,93,124,154,185,216,244,275,305,336,366])
p1.xaxis.formatter = FuncTickFormatter(code="""
var labels = {'1':'Jul',32:'Aug',63:'Sep',93:'Oct',124:'Nov',154:'Dec',185:'Jan',216:'Feb',244:'Mar',275:'Apr',305:'May',336:'Jun',366:'Jul'}
return labels[tick];
""")