Stacking multiple plots together with a single x-axis - matplotlib

Suppose I have multiple time-dependent variables and I want to plot them all together, stacked one on top of another like the image below. How would I do so in matplotlib? Currently, when I try plotting them, they appear as multiple independent plots.
EDIT:
I have a Pandas dataframe with K columns corresponding to dependent variables and N rows corresponding to observed values for those K variables.
Sample code:
df = get_representation(mat)  # df is the Pandas dataframe
for i in range(len(df.columns)):
    plt.plot(df.iloc[:, i])
plt.show()
I would like to plot them all one on top of another.

You could just stack all the curves by shifting each curve vertically:
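df = get_representation(mat)  # df is the Pandas dataframe
for i in range(len(df.columns)):
    # offset the i-th curve upward by i * shift
    plt.plot(df.iloc[:, i] + shift * i)
plt.show()
Here shift is the vertical offset between consecutive curves. Choose a value at least as large as the range of your data so that the curves do not overlap.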
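A minimal, self-contained sketch of the shifting approach. It uses synthetic data in place of get_representation(mat), which is not shown in the question, and it assumes one particular choice of shift (the global range of the data), which is not part of the original answer:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's DataFrame: N=200 rows, K=4 columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((200, 4)), columns=list("abcd"))

# Assumed choice of shift: the global data range, so that a curve's
# minimum can never fall below the maximum of the curve beneath it.
shift = df.max().max() - df.min().min()

fig, ax = plt.subplots()
for i, col in enumerate(df.columns):
    ax.plot(df.index, df[col] + shift * i, label=col)

# Mark each curve's zero level with its column name, since the shifted
# y values no longer mean anything on their own.
ax.set_yticks([shift * i for i in range(df.shape[1])])
ax.set_yticklabels(df.columns)
```

Call plt.show() afterwards to display the figure. If separate panels that share one x-axis are preferred over offsets in a single axes, df.plot(subplots=True, sharex=True) is another option.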

Related

Getting the xlabels to reflect the DataFrame column

I have the attached DF that I am trying to plot; however, the x values are the index from the DF, when I want them to be the actual episode names. Any recommendations here?
Because you first select the column from the DataFrame and then plot it, you end up not "carrying over" those labels with you. You can fix this in two different ways:
Plot from the DataFrame, not from the Series:
top_episodes.plot(x="EpisodeTitle", y="Viewship(MM)", kind="bar")
Alternatively, set the index to be the EpisodeTitle, and then perform your column selection/plotting.
top_episodes.set_index("EpisodeTitle")["Viewship(MM)"].plot(kind="bar")
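Both options can be tried side by side with a small made-up table. In this sketch only the column names come from the answer above; the episode titles and viewer numbers are hypothetical, and the explicit ax= arguments are added here just to keep the two plots apart:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data; only the column names are taken from the answer.
top_episodes = pd.DataFrame({
    "EpisodeTitle": ["Pilot", "The Reunion", "Finale"],
    "Viewship(MM)": [5.2, 7.9, 11.4],
})

fig, (ax1, ax2) = plt.subplots(1, 2)

# Option 1: plot from the DataFrame and name the x and y columns.
top_episodes.plot(x="EpisodeTitle", y="Viewship(MM)", kind="bar", ax=ax1)

# Option 2: make the titles the index, then plot the selected Series.
top_episodes.set_index("EpisodeTitle")["Viewship(MM)"].plot(kind="bar", ax=ax2)
```

In both axes the bar labels come from EpisodeTitle rather than from the default integer index.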

2D density plot using pandas and seaborn

I'm trying to plot a heatmap like:
(https://seaborn.pydata.org/generated/seaborn.kdeplot.html - last plot on the page)
But when I try this with my code I get:
My Pandas dataframe consists of two columns (x and y, both int64) and a number of rows.
My code:
sns.kdeplot(data=dataFrame, fill=True, thresh=0, levels=100, cmap="mako", cbar=True)
My question now is: how do I get rid of the contours, fill the background, and make a smooth colorbar on the side?

What does ax=ax do while creating a plot in matplotlib?

I have a DataFrame of heart disease patients with over 300 rows. What I have done initially is filter the patients aged over 50. Now I am trying to plot that DF, and while searching on Google I found this piece of code, which helped me plot it.
But I am not able to understand the concept of ax=ax here:
fig, ax = plt.subplots()
over_50.plot(x="age",
             y="chol",
             c="target",
             kind="scatter",
             ax=ax);  # <-- the argument in question
I want to learn the concept behind this little piece of code here. What is it doing at its core?
In this case (a plot with a single axes) you can do without this parameter.
But there are more complex cases, when you create subplots with a number of axes objects (a grid). In that case the second result from plt.subplots() is an array of axes objects, and when creating each plot you should specify which axes it is to be drawn in.
See e.g. https://matplotlib.org/3.1.0/gallery/subplots_axes_and_figures/subplots_demo.html
and find the title "Stacking subplots in one direction".
It contains this example:
fig, axs = plt.subplots(2)
fig.suptitle('Vertically stacked subplots')
axs[0].plot(x, y)
axs[1].plot(x, -y)
Here:
a figure composed of 2 rows (one axes above the other) is created,
one line plot is drawn in the first axes, and another in the second.
An alternative way to specify the axes object in which a particular plot is to be drawn is the ax parameter, as in your code, where you pass one of the axes objects from the current figure.
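To see ax= doing real work, here is a small sketch that sends two pandas plots to two different axes of the same figure. The DataFrame is a made-up stand-in for over_50, and the thalach column is a hypothetical second variable added only for illustration:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the question's over_50 DataFrame.
over_50 = pd.DataFrame({
    "age": [52, 57, 61, 66, 70],
    "chol": [212, 250, 198, 270, 233],
    "thalach": [168, 150, 142, 131, 120],  # hypothetical extra column
    "target": [1, 0, 1, 0, 1],
})

# Two vertically stacked axes in one figure.
fig, axs = plt.subplots(2)

# ax= tells pandas which of the two axes each plot should be drawn in.
over_50.plot(x="age", y="chol", c="target", kind="scatter", ax=axs[0])
over_50.plot(x="age", y="thalach", kind="scatter", ax=axs[1])
```

Without ax=, each DataFrame.plot call would open a new figure of its own instead of filling the grid created by plt.subplots().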

3d seaborn lmplot using variable marker size

I have a pandas dataframe with three columns (A,B,C). I have drawn a regression line of A vs B using
sns.lmplot(x='A', y='B', data = df, x_bins=10, ci=None)
I am using 10 bins and no confidence interval as I have a large number (~5million) datapoints.
I would like to show the value of C on this plot. C has nothing to do with the regression of A against B. I would just like to show C by making the marker size of each bin equal to the average value of C within that bin.
It seems seaborn doesn't have a markersize parameter that can be set equal to a column of the dataframe. Is this even possible?
I came across this Stack Exchange post, which suggests using scatter_kws={"s": 100} to set the marker size. However, when I tried scatter_kws={"s": df['C']} it threw an error.
If this is not possible in seaborn, are there any alternative solutions?
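One possible alternative outside seaborn is to do the binning by hand with pandas and draw the result with matplotlib. The following is only a sketch under stated assumptions: the data are synthetic, the ten bins are equal-count bins from pd.qcut (an approximation, not seaborn's exact x_bins logic), and the marker size range of 20 to 300 is arbitrary:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's DataFrame with columns A, B, C.
rng = np.random.default_rng(0)
n = 10_000
a = rng.uniform(0, 10, n)
df = pd.DataFrame({
    "A": a,
    "B": 2.0 * a + rng.normal(0, 3, n),
    "C": a**2 + rng.normal(0, 5, n),
})

# Split A into 10 equal-count bins and average every column per bin.
bins = pd.qcut(df["A"], q=10).rename("A_bin")
binned = df.groupby(bins, observed=True).mean()

# Map the mean of C linearly onto marker areas between 20 and 300.
c = binned["C"]
sizes = 20 + 280 * (c - c.min()) / (c.max() - c.min())

# Least-squares line fitted to the raw (unbinned) data.
slope, intercept = np.polyfit(df["A"], df["B"], deg=1)

fig, ax = plt.subplots()
ax.scatter(binned["A"], binned["B"], s=sizes)
xs = np.array([df["A"].min(), df["A"].max()])
ax.plot(xs, slope * xs + intercept)
```

As with x_bins in lmplot, the binning here only affects how the points are drawn; the line is still fitted to all of the original observations.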

Seaborn time series plotting: a different problem for each function

I'm trying to use seaborn dataframe functionality (e.g. passing column names to x, y and hue plot parameters) for my timeseries (in pandas datetime format) plots.
x should come from a timeseries column(converted from a pd.Series of strings with pd.to_datetime)
y should come from a float column
hue comes from a categorical column that I calculated.
There are multiple streams in the same series that I am trying to separate (and use the hue for separating them visually), and therefore they should not be connected by a line (like in a scatterplot)
I have tried the following plot types, each with a different problem:
sns.scatterplot: gets the plotting right and the labels right but has problems with the x limits, and I could not set them right with plt.xlim() using data.Dates.min() and data.Dates.max()
sns.lineplot: gets the limits and the labels right, but I could not find a setting to disable the lines between the individual datapoints like in matplotlib. I tried setting the markers and the dashes parameters to no avail.
sns.stripplot: my last try, plotted the datapoints correctly and got the x limits right but messed up the tick labels
Example input data for easy reproduction:
dates = pd.to_datetime(('2017-11-15',
                        '2017-11-29',
                        '2017-12-15',
                        '2017-12-28',
                        '2018-01-15',
                        '2018-01-30',
                        '2018-02-15',
                        '2018-02-27',
                        '2018-03-15',
                        '2018-03-27',
                        '2018-04-13',
                        '2018-04-27',
                        '2018-05-15',
                        '2018-05-28',
                        '2018-06-15',
                        '2018-06-28',
                        '2018-07-13',
                        '2018-07-27'))
values = np.random.randn(len(dates))
clusters = np.random.randint(1, size=len(dates))
D = {'Dates': dates, 'Values': values, 'Clusters': clusters}
data = pd.DataFrame(D)
To each of the functions I am passing the same arguments:
sns.OneOfThePlottingFunctions(x='Dates',
                              y='Values',
                              hue='Clusters',
                              data=data)
plt.show()
So to recap, what I want is a plot that uses seaborn's pandas functionality and plots points (not lines) with correct x limits and readable x labels :)
Any help would be greatly appreciated.
ax = sns.scatterplot(x='Dates', y='Values', hue='Clusters', data=data)
ax.set_xlim(data['Dates'].min(), data['Dates'].max())
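A self-contained version of the two lines above, for trying them out. The dates, the three clusters, the one-week margin, and the call to autofmt_xdate() are assumptions added here for illustration and are not part of the original answer:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data shaped like the question's example.
dates = pd.date_range("2017-11-15", periods=18, freq="15D")
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Dates": dates,
    "Values": rng.standard_normal(len(dates)),
    "Clusters": rng.integers(0, 3, size=len(dates)),  # three made-up clusters
})

ax = sns.scatterplot(x="Dates", y="Values", hue="Clusters", data=data)

# Set the limits on the axes object, with an assumed one-week margin
# so the first and last points do not sit on the frame.
pad = pd.Timedelta(days=7)
ax.set_xlim(data["Dates"].min() - pad, data["Dates"].max() + pad)

# Rotate the date labels so they stay readable.
ax.figure.autofmt_xdate()
```

The limits are passed as Timestamp values and applied to the axes that scatterplot returns, so they are converted with the same date units as the plotted points.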