How to set the x-axis in a Seaborn Scatterplot - data-visualization

After plotting my data with a seaborn scatterplot, it looks like this. Most of the data falls in the years 2013-2017, so I tried to set the x-axis limits to 2013-2017, but I have not been able to.
I tried this
g.set_xticks(range(1,7)) # <--- set the ticks first
g.set_xticklabels(['2012','2013','2014','2015','2016','2017'])
this
ax.set_xticks(['2012','2013','2014','2015','2016','2017','2018'])
and this
g.set(xlim=(pd.date_range('2012',periods=7)))
but I am not able to limit my x-axis to 2013-2017. This is my code
fig,_ax=plt.subplots(1,1,figsize=(15,7))
sns.scatterplot(data=some_data_from_data_base,ax=_ax)
my kaggle notebook link
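A minimal sketch of one way to restrict the axis, assuming the x values are numeric years. The real columns of some_data_from_data_base are not shown, so years and values below are made-up stand-ins. xlim takes exactly two values (left and right), and tick positions have to be given in the same units as the data, which may be why ticks at range(1,7) or string ticks did not land on the years. The scatter is drawn with plain matplotlib here; sns.scatterplot(..., ax=ax) draws on the same kind of Axes, so the same set_xlim and set_xticks calls should apply.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in data: numeric years on x, arbitrary values on y.
rng = np.random.default_rng(0)
years = rng.integers(2008, 2019, size=200)
values = rng.normal(size=200)

fig, ax = plt.subplots(1, 1, figsize=(15, 7))
ax.scatter(years, values)

# Limit the visible range first, then place one tick per year inside it.
ax.set_xlim(2013, 2017)
ax.set_xticks(range(2013, 2018))
```

If the x column holds datetimes rather than integers, the limits would need to be datetimes as well, for example pd.Timestamp('2013-01-01') and pd.Timestamp('2017-12-31').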

Related

Draw an ordinary plot with the same style as in plt.hist(histtype='step')

The method plt.hist() in pyplot has a way to create a 'step-like' plot style when calling
plt.hist(data, histtype='step')
but the 'ordinary' methods that plot raw data without processing (plt.plot(), plt.scatter(), etc.) apparently do not have style options to obtain the same result. My goal is to plot a given set of points using that style, without making a histogram of these points.
Is that achievable with standard library methods for plotting a given 2-D set of points?
I also think that there is at least one hack (generating a fake distribution which would have histogram equal to our data) and a 'low-level' solution to draw each segment manually, but none of these ways seems favorable.
Maybe you are looking for drawstyle="steps".
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
data = np.cumsum(np.random.randn(10))
plt.plot(data, drawstyle="steps")
plt.show()
Note that this is slightly different from histograms, because the lines do not go to zero at the ends.
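If the exact placement of the steps matters, matplotlib's step function exposes it through its where argument ('pre', 'mid' or 'post'). A small sketch reusing the data from the snippet above; the integer x positions are an assumption, since the answer plots against the implicit index.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(42)
data = np.cumsum(np.random.randn(10))
x = np.arange(len(data))  # assumed x positions: 0, 1, 2, ...

fig, ax = plt.subplots()
# where="mid" centres each horizontal segment on its x value,
# similar to how a histogram bar is centred on its bin.
(line,) = ax.step(x, data, where="mid")
```

As with drawstyle="steps", the line still does not drop to zero at the ends.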

Style of error bar in pandas plot

I'd like to plot a line chart with error bars in the following style.
However, pandas plot draws the error bars as vertical lines only, without caps.
pd.DataFrame([1,2,3]).plot(yerr=[0.3,.3,.3])
How do I change style of error bar for pandas plot?
The versions are:
pandas '0.18.0'
matplotlib '1.5.1'
Update
One of the reasons seems to be the seaborn style. With the style line commented out, the following code gives the nicely styled plot.
# plt.style.use('seaborn-paper')
pd.DataFrame([1,2,3]).plot(yerr=[0.3,.3,.3],capsize=4)
But, I have a reason to keep using seaborn style... Please help.
You can change the capsize inline when you call plot on your DataFrame, using the capsize kwarg (which gets passed on to plt.errorbar):
pd.DataFrame([1,2,3]).plot(yerr=[0.3,.3,.3],capsize=4)
Alternatively, you can change this setting using rcParams.
You can find out what your default errorbar cap size is by printing plt.rcParams['errorbar.capsize']. If that is 0 (which is why I suspect you are currently getting no errorbar caps), you can set the default size of the errorbar caps to something nonzero, using:
plt.rcParams['errorbar.capsize']=4
Make sure to have that at the beginning of any plotting script.
Update:
It seems using the seaborn-paper style sets the cap thickness to 0. You can override this with the capthick kwarg:
plt.style.use('seaborn-paper')
pd.DataFrame([1,2,3]).plot(yerr=[0.3,.3,.3],capsize=4,capthick=1)
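To see what the two kwargs do, here is a small sketch that calls matplotlib's errorbar directly (the function the answer says the kwargs are passed on to) and inspects the cap lines it returns. The data values mirror the question; the variable names are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
container = ax.errorbar(
    [0, 1, 2], [1, 2, 3],
    yerr=[0.3, 0.3, 0.3],
    capsize=4,   # half-length of each cap, in points
    capthick=1,  # line width of each cap, in points
)

# errorbar returns the data line, the cap lines, and the bar collections.
data_line, caplines, barlinecols = container.lines
for cap in caplines:
    print(cap.get_markersize(), cap.get_markeredgewidth())
```

The caps are drawn as markers, so a style that sets the marker edge width to 0 hides them even when capsize is nonzero; passing capthick explicitly overrides that.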

matplotlib two charts side-by-side with third overlying the second chart

I am trying to use matplotlib (more specifically the plot method from pandas) to plot two charts side-by-side in an ipython notebook with a third chart overlying the second chart and using a secondary y axis. However, I have been unable to get the overlay to work.
Currently this is my code:
import matplotlib.pyplot as plt
%matplotlib inline
fig, axs = plt.subplots(1,2)
fig.set_size_inches(12, 4)
top10.plot(kind='barh', ax=axs[0])
top10_time_trend.T.plot(kind='bar', stacked=True, legend=False, ax=axs[1])
time_trend.plot(kind='line', ax=axs[1], ylim=0, secondary_y=True)
I get the side-by-side structure I am looking for, but only the first (top10) and last (time_trend) plots are visible. My output is below:
When plotted separately the unshown plot (top10_time_trend) looks like this
What I am trying to accomplish is something that looks like this, i.e. the line chart overlaying the stacked bar.
The best way to do this is to create a third axis that shares the x-axis with the second subplot, say:
ax3 = axs[1].twinx()
and then
top10_time_trend.T.plot(kind='bar', stacked=True, legend=False, ax=ax3)
Please let me know if this works for you.
Here you can find an example for the usage of twinx() from matplotlib docs http://matplotlib.org/examples/api/two_scales.html
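For reference, a self-contained sketch of the layout with twinx(), using made-up numbers in place of top10, top10_time_trend and time_trend (none of which are shown in the question). In this sketch the bars stay on axs[1] and the line goes on the twin axis; that is a choice made for the illustration, and the reverse assignment from the answer above works the same way.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt

# Hypothetical stand-in data.
labels = ["a", "b", "c"]
totals = [5, 3, 2]
positions = [0, 1, 2]
bar_values = [4, 6, 5]
line_values = [40, 65, 50]

fig, axs = plt.subplots(1, 2)
fig.set_size_inches(12, 4)

axs[0].barh(labels, totals)          # left chart
axs[1].bar(positions, bar_values)    # right chart: bars on the primary y-axis

ax3 = axs[1].twinx()                 # third axis overlaying the right chart
ax3.plot(positions, line_values, color="black", marker="o")
ax3.set_ylim(bottom=0)
```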

Gridlines in Julia PyPlot

I'm using the "PyPlot" package in Julia, and I want to add gridlines at specified locations. I'm not familiar enough with Python/Matlab to use their documentation pages to help - the commands differ in Julia. I want a basic plot, with gridlines on both axes at intervals of 1:
using PyPlot
fig=figure("Name")
grid("on")
scatter([1,2,3,4],[4,5,6,7])
Help appreciated...
PyPlot is just an interface to Matplotlib, so the commands to customize the grid are Matplotlib's commands.
One way to configure the gridlines on both axes at intervals of 1 (for the given data) is:
using PyPlot
fig=figure(figsize=[6,3])
ax1=subplot(1,1,1) # creates a subplot with just one graphic
ax1[:xaxis][:set_ticks](collect(1:4)) # configure x ticks from 1 to 4
ax1[:yaxis][:set_ticks](collect(4:7)) # configure y ticks from 4 to 7
grid("on")
scatter([1,2,3,4],[4,5,6,7])
This code was tested inside an IJulia's notebook, and produces the following output:
Take a look at Various Julia plotting examples using PyPlot.
tested with Julia Version 0.4.3
The values where grid lines are drawn can be controlled by passing an array to the xticks() and yticks() functions.
A simple example:
using PyPlot
fig=figure("Name")
grid("on")
xticks(0:5)
yticks(3:8)
scatter([1,2,3,4],[4,5,6,7])
If you want it to be more flexible you can figure out the limits based on your data and set the tick interval to something else.
A slightly more dynamic way to configure the x-axis of the grid could be:
x_data = [1,2,3,4]
x_tick_interval = 2;
x_tick_start = minimum(x_data)
x_tick_end = maximum(x_data)
xticks(x_tick_start:x_tick_interval:x_tick_end)
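Since PyPlot forwards to Matplotlib, the same idea can be checked against the Python API directly. A rough Python counterpart of the snippet above, as a sketch only: the names x_data, y_data and x_tick_interval follow the Julia example, and the y ticks at an interval of 1 are an added assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt

x_data = [1, 2, 3, 4]
y_data = [4, 5, 6, 7]
x_tick_interval = 2

fig, ax = plt.subplots(figsize=(6, 3))
ax.scatter(x_data, y_data)

# Gridlines are drawn at the tick positions, so setting ticks sets the grid.
ax.set_xticks(range(min(x_data), max(x_data) + 1, x_tick_interval))
ax.set_yticks(range(min(y_data), max(y_data) + 1))
ax.grid(True)
```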

Plotting a stock chart with Pandas in IPython

I'm just getting started with Python/Pandas and put together the following code to plot the S&P 500.
import datetime
from pandas.io.data import DataReader
# returns a DataFrame
sp500 = DataReader("^GSPC", "yahoo", start=datetime.datetime(1990, 1, 1))
sp500.plot(figsize = (12,8))
It looks like this is plotting the high, low, open, close, adj close, and volume all on one graph. Is there an easy way to plot just the adj close in one graph and the volume right below it like they do on Yahoo and most other financial sites? I'd also be interested in an example of plotting an OHLC candlestick chart.
See here for my answer to a similar question and here for further information regarding matplotlib's finance candlestick graph.
To get just the adj close from your sp500, you would use something like sp500["Adj Close"] and then pass that to the relevant matplotlib plot command plt.plot(datelist, sp500["Adj Close"] ) where datelist is your list of dates on the x axis.
I believe you can get datelist by referencing sp500.index, see here for more information.
As for your issue with passing it to the plot command, try something like
datelist = [date2num(x) for x in sp500.index] where the function date2num is from the matplotlib.dates package.
After setting up the relevant subplot, call the appropriate fill command (fill_between, with an alpha value) to fill the area under the line, like the Yahoo graph you linked to.
See here under the Fill Between and Alpha heading for another snippet that shows a filled line graph, with correct date printing.
The initial link has a sample matplotlib snippet which also covers the date format and formatting in more detail.
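This gets you a plot of just the Adj Close column vs the index of your DataFrame (Date).
import datetime
import matplotlib.pyplot as plt
# In later pandas releases this reader lives in the separate pandas-datareader package.
from pandas.io.data import DataReader
sp500 = DataReader("^GSPC", "yahoo", start=datetime.datetime(1990, 1, 1))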
close = sp500['Adj Close']
close.plot()
plt.show()
Likewise, you can plot the Volume the same way:
vol = sp500['Volume']
vol.plot()
plt.show()
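To put the two series in one figure with the volume right below the price, one option is a pair of stacked subplots sharing the x-axis. This is a sketch only: the DataFrame below is synthetic stand-in data with the same column names as sp500, so that it runs without a network connection, and the 3:1 height ratio is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for sp500 (hypothetical values, real column names).
rng = np.random.default_rng(0)
dates = pd.date_range("2015-01-01", periods=250, freq="B")
prices = pd.DataFrame(
    {
        "Adj Close": 2000 + np.cumsum(rng.normal(size=len(dates))),
        "Volume": rng.integers(1000, 4000, size=len(dates)),
    },
    index=dates,
)

# Two rows sharing the date axis; the price panel gets three times the height.
fig, (ax_price, ax_vol) = plt.subplots(
    2, 1, sharex=True, figsize=(12, 8),
    gridspec_kw={"height_ratios": [3, 1]},
)
ax_price.plot(prices.index, prices["Adj Close"])
ax_price.set_ylabel("Adj Close")

# Filled area under the volume line, as in the fill_between suggestion above.
ax_vol.fill_between(prices.index, prices["Volume"].to_numpy(), alpha=0.5)
ax_vol.set_ylabel("Volume")
```

With the real data, prices would simply be replaced by sp500.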