Trying to plot two lines on matplotlib - pandas

I am trying to plot two lines but cannot get the graph to scale properly:
#plot2
fig=plt.figure()
ax1=fig.add_subplot(111)
ax1.plot(totalReturnComp["Date"], totalReturnComp["S&P compounded growth"], linewidth=2, color='b')
ax2=ax1.twiny()
ax2.plot(sp_trr["month"], sp_trr["1 MEqual Weight"], linewidth=2, color='g')
ax2.legend(['3 Month Incremental Weight'],frameon=False)
ax1.legend(['S&P Index compounded growth'],frameon=False)
plt.title('3 Month Incremental Weight vs S&P Index')

Related

confusion matrix plot is unable to show values within plot

I'm trying to plot a confusion matrix, but the plot does not show the values at the center of each cell.
Code :
plt.figure(figsize=(4,2))
sns.heatmap(cm, annot=True, annot_kws={"size":20},
cmap='Blues', square=True, fmt='.2f')
plt.ylabel('True label')
plt.xlabel('Predicted label')
plt.title('Confusion matrix for:\n{}'.format(model.__class__.__name__));
Output :
How can I get the values 0.45, 0.94, 0.06 and 0.55 centered in their square boxes for better readability?
The figure size appears to be too small (4x2). Also, since you are setting square=True, that figure size does not suit the square cells.
As I don't have the same model, I tried it with another model; this is the code and output:
plt.figure(figsize=(4,4))
sns.heatmap((metrics.confusion_matrix(y_test, svmGridtest_predict)),annot=True,
annot_kws={"size":20}, fmt='.2f', square=True, cmap='Blues')
plt.xlabel('Predicted')
plt.ylabel('Actuals',rotation=0)
plt.title('Confusion matrix for:\n{}'.format('SVM'))
OUTPUT figure here...

Matplotlib x ticks label not regularly spaced

I am plotting some data using seaborn as below:
ax1 = plt.subplot(2, 1, 1)
sns.lineplot(x='time', y='gap_ratio', data=df_gap, ax=ax1)
I am wondering why the x-tick labels are not regularly spaced. As a result, I end up with overlapping x-tick labels. Is there an easy way to fix this?
FYI, the x data is regularly spaced (1 data point every minute)
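A minimal sketch of one possible fix, assuming (the question does not say) that the time column is stored as strings rather than datetimes. It parses the column and lets matplotlib's date locator choose a few evenly spaced ticks. The df_gap contents here are made-up stand-in data, and plain ax1.plot replaces sns.lineplot to keep the example small; the locator and formatter calls apply equally to an Axes passed to sns.lineplot via ax=.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_gap: one point per minute for six hours
rng = np.random.default_rng(0)
df_gap = pd.DataFrame({
    "time": pd.date_range("2021-01-01 09:00", periods=360, freq="min").astype(str),
    "gap_ratio": rng.random(360),
})

# Assumption: 'time' arrives as strings, so parse it into real datetimes
df_gap["time"] = pd.to_datetime(df_gap["time"])

fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax1.plot(df_gap["time"], df_gap["gap_ratio"])

# Let matplotlib pick a small number of evenly spaced date ticks
locator = mdates.AutoDateLocator(minticks=3, maxticks=7)
ax1.xaxis.set_major_locator(locator)
ax1.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

fig.canvas.draw()
ticks = ax1.get_xticks()
```

If the column is already a datetime dtype, the pd.to_datetime call is a no-op and only the locator and formatter lines matter.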

Seaborn legend modification for multiple overlapping plots

I am trying to create a seaborn boxplot and overlay it with individual data points using seaborn swarmplot, for a dataset that has two categorical variables (Nameplate Capacity and Scenario) and one continuous variable (ELCC values). Since I have two overlaid plots in seaborn, it generates two legends for the same variables. How do I plot a box plot along with a swarm plot while only showing the legend from the box plot? My current code looks like:
plt.subplots(figsize=(25,18))
sns.set_theme(style = "whitegrid", font_scale= 1.5 )
ax = sns.boxplot(x="Scenario", y="ELCC", hue = "Nameplate Capacity",
data=final_offshore, palette = "Pastel1")
ax = sns.swarmplot(x="Scenario", y="ELCC", hue = "Nameplate Capacity", dodge=True, marker='D', size =9, alpha=0.35, data=final_offshore, color="black")
plt.xlabel('Scenarios')
plt.ylabel('ELCC values')
plt.title('Contribution of ad-hoc offshore generator in each scenario')
My plot so far:
You can draw your box plot, get that legend, draw the swarm plot and then re-draw the legend:
# Draw the box plot
ax = sns.boxplot(
x="Scenario",
y="ELCC",
hue="Nameplate Capacity",
data=final_offshore,
palette="Pastel1",
)
# Get the legend from just the box plot
handles, labels = ax.get_legend_handles_labels()
# Draw the swarmplot
sns.swarmplot(
x="Scenario",
y="ELCC",
hue="Nameplate Capacity",
dodge=True,
marker="D",
size=9,
alpha=0.35,
data=final_offshore,
color="black",
ax=ax,
)
# Remove the old legend
ax.legend_.remove()
# Add just the handles/labels from the box plot back
ax.legend(
handles,
labels,
loc=0,
)

Pandas: How can I plot with separate y-axis, but still control the order?

I am trying to plot multiple time series in one plot. The scales are different, so they need separate y-axis, and I want a specific time series to have its y-axis on the right. I also want that time series to be behind the others. But I find that when I use secondary_y=True, this time series is always brought to the front, even if the code to plot it comes before the others. How can I control the order of the plots when using secondary_y=True (or is there an alternative)?
Furthermore, when I use secondary_y=True the y-axis on the left no longer adapts to appropriate values. Is there a fix for this?
# imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# dummy data
lenx = 1000
x = range(lenx)
np.random.seed(4)
y1 = np.random.randn(lenx)
y1 = pd.Series(y1, index=x)
y2 = 50.0 + y1.cumsum()
# plot time series.
# use ax to make Pandas plot them in the same plot.
ax = y2.plot.area(secondary_y=True)
y1.plot(ax=ax)
So what I would like is to have the blue area plot behind the green time series, and to have the left y-axis take appropriate values for the green time series:
https://i.stack.imgur.com/6QzPV.png
Perhaps something like the following using matplotlib.axes.Axes.twinx instead of using secondary_y, and then following the approach in this answer to move the twinned axis to the background:
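# plot time series.
fig, ax = plt.subplots()
y1.plot(ax=ax, color='green')
# raise the primary axes above the twinned axes created below
ax.set_zorder(10)
# hide the primary axes background so it does not cover the area plot
ax.patch.set_visible(False)
ax1 = ax.twinx()
y2.plot.area(ax=ax1, color='blue')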
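To try the twinx approach end to end, here is a self-contained sketch that reuses the dummy data from the question. The name ax_right, the alpha value and stacked=False are my own choices for illustration, not part of the original code; stacked=False is only there so the area plot does not depend on the sign of the values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same dummy data as in the question
lenx = 1000
np.random.seed(4)
y1 = pd.Series(np.random.randn(lenx), index=range(lenx))
y2 = 50.0 + y1.cumsum()

fig, ax = plt.subplots()
y1.plot(ax=ax, color="green")   # left y-axis, autoscaled to y1 only
ax_right = ax.twinx()           # independent right y-axis sharing the x-axis
y2.plot.area(ax=ax_right, color="blue", alpha=0.5, stacked=False)

# Draw the left axes on top of the right one and make its background
# transparent so the area plot behind it stays visible
ax.set_zorder(ax_right.get_zorder() + 1)
ax.patch.set_visible(False)

fig.canvas.draw()
left_limits = ax.get_ylim()
right_limits = ax_right.get_ylim()
```

Because each axes autoscales on its own data, the left limits follow y1 and the right limits follow y2, which addresses the second part of the question as well.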

Scatter plot without x-axis

I am trying to visualize some data and have built a scatter plot with this code -
sns.regplot(y="Calls", x="clientid", data=Drop)
This is the output -
I don't want it to consider the x-axis. I just want to see how the data lie w.r.t y-axis. Is there a way to do that?
As @iayork suggested, you can see the distribution of your points with a stripplot or a swarmplot (you could also combine them with a violinplot). If you need to move the points closer to the y-axis, you can simply adjust the size of the figure so that the width is small compared to the height (here I'm drawing 2 subplots on a 4x5 in figure, which means that each plot is roughly 2x5 in).
fig, (ax1,ax2) = plt.subplots(1,2, figsize=(4,5))
sns.stripplot(d, orient='vert', ax=ax1)
sns.swarmplot(d, orient='vert', ax=ax2)
plt.tight_layout()
However, I'm going to suggest that maybe you want to use distplot instead. This function is specifically designed to show the distribution of your data. Here I'm plotting the KDE of the data, as well as the "rugplot", which shows the position of the points along the y-axis:
fig = plt.figure()
sns.distplot(d, kde=True, vertical=True, rug=True, hist=False, kde_kws=dict(shade=True), rug_kws=dict(lw=2, color='orange'))