Pandas in PyCharm: Where does it display the boxplot? - pandas

I am making a python script using the PyCharm IDE, and the idea is to display descriptive statistics and a box plot for each group in a DataFrame. The statistics displays, but the boxplot is nowhere to be seen...
I have tried Googling an answer, but it does not seem this question have been answered before.
import pandas as pd
import matplotlib as plt
(...)
for name, group in grouped:
if len(group) > 3:
print("\n\nNAME: {}".format(name))
print("GROUP: {}".format(group))
print("DESCRIPTIVE STATISTICS
{}".format(group.distance2.describe()))
print(group.distance2.plot.box())
group.distance2.plot.box()
I do not get any error messages, the code runs and completes, but I do not know where the boxplot is supposed to display.

I think the code as it is does not create a matplotlib figure object. Try creating a test data object for group.distance2, then create a matplotlib boxplot object. I am assuming you are using the matplotlib library.
import matplotlib.pyplot as plt
for name, group in grouped:
if len(group) > 3:
data = group.distance2
# create a matplotlib figure object
fig, axs = plt.subplots(1, 1)
# basic plot
axs[0, 0].boxplot(data)
axs[0, 0].set_title('basic plot of group.distance2')
plt.show()
It that works, you can try putting several group data into one figure (axes). Here is more information: https://matplotlib.org/3.1.0/gallery/statistics/boxplot_demo.html

Related

how to make plt plot do not show figure [duplicate]

I need to create a figure in a file without displaying it within IPython notebook. I am not clear on the interaction between IPython and matplotlib.pylab in this regard. But, when I call pylab.savefig("test.png") the current figure get's displayed in addition to being saved in test.png. When automating the creation of a large set of plot files, this is often undesirable. Or in the situation that an intermediate file for external processing by another app is desired.
Not sure if this is a matplotlib or IPython notebook question.
This is a matplotlib question, and you can get around this by using a backend that doesn't display to the user, e.g. 'Agg':
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
plt.plot([1,2,3])
plt.savefig('/tmp/test.png')
EDIT: If you don't want to lose the ability to display plots, turn off Interactive Mode, and only call plt.show() when you are ready to display the plots:
import matplotlib.pyplot as plt
# Turn interactive plotting off
plt.ioff()
# Create a new figure, plot into it, then close it so it never gets displayed
fig = plt.figure()
plt.plot([1,2,3])
plt.savefig('/tmp/test0.png')
plt.close(fig)
# Create a new figure, plot into it, then don't close it so it does get displayed
plt.figure()
plt.plot([1,3,2])
plt.savefig('/tmp/test1.png')
# Display all "open" (non-closed) figures
plt.show()
We don't need to plt.ioff() or plt.show() (if we use %matplotlib inline). You can test above code without plt.ioff(). plt.close() has the essential role. Try this one:
%matplotlib inline
import pylab as plt
# It doesn't matter you add line below. You can even replace it by 'plt.ion()', but you will see no changes.
## plt.ioff()
# Create a new figure, plot into it, then close it so it never gets displayed
fig = plt.figure()
plt.plot([1,2,3])
plt.savefig('test0.png')
plt.close(fig)
# Create a new figure, plot into it, then don't close it so it does get displayed
fig2 = plt.figure()
plt.plot([1,3,2])
plt.savefig('test1.png')
If you run this code in iPython, it will display a second plot, and if you add plt.close(fig2) to the end of it, you will see nothing.
In conclusion, if you close figure by plt.close(fig), it won't be displayed.

how to prevent seaborn to skip year in xtick label in Timeseries Plot

I have included the screenshot of the plot. Is there a way to prevent seaborn from skipping the xtick labels in timeseries data.
Most seaborn functions return a matplotlib object, so you can control the number of major ticks displayed via matplotlib. By default, matplotlib will auto-scale, which is why it hides some year labels, you can try to set the MaxNLocator.
Consider the following example:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# load data
df = sns.load_dataset('flights')
df.drop_duplicates('year', inplace=True)
df.year = df.year.astype('str')
# plot
fig, ax = plt.subplots(figsize=(5, 2))
sns.lineplot(x='year', y='passengers', data=df, ax=ax)
ax.xaxis.set_major_locator(plt.MaxNLocator(5))
This gives you:
ax.xaxis.set_major_locator(plt.MaxNLocator(10))
will give you
Agree with answer of #steven, just want to say that methods for xticks like plt.xticks or ax.xaxis.set_ticks seem more natural to me. Full details can be found here.

Matplotlib graph does not show in Python Interactive Window

The following is my code, but I can't get the plot to show on my Visual Studio Code even though I am running this on the Python Interactive Window, which should usually show a graph plot after running. The tables are showing just fine. I also do not get a default graph which pops up like it normally should. What am I doing wrong?
import yfinance as yf
import pandas as pd
import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
import talib
df = pd.read_csv('filename.csv')
df = df.sort_index()
macd, macdsignal, macdhist = talib.MACD(df['Close'], fastperiod=12, slowperiod=26, signalperiod=9)
macd = macd.to_list()
macdsignal = macdsignal.to_list()
macdhist = macdhist.to_list()
df['macd'], df['macdsignal'], df['macdhist'] = macd,macdsignal,macdhist
ax = plt.gca()
print(df.columns)
df.plot(kind='line',x='Date',y='macd', color='blue',ax=ax)
df.plot(kind='line',x='Date',y='macdsignal', color='red', ax=ax)
plt.show()
The csv file has data that looks like this
The issue was with matplotlib.use('agg'), which does not support the show() function. This prevented the graph from being displayed on Visual Studio's Interactive Window. The matplotlib.use('agg') method can, however, be used for saving your graph in a .png format.
According to Matplotlib.org, agg is "the canonical renderer for user interfaces, which uses the Anti-Grain Geometry C++ library to make a raster (pixel) image of the figure". More information can be found at this link here

Change y-axis scaling fontsize in pandas dataframe.plot()

I am changing the font-sizes in my python pandas dataframe plot. The only part that I could not change is the scaling of y-axis values (see the figure below).
Could you please help me with that?
Added:
Here is the simplest code to reproduce my problem:
import pandas as pd
start = 10**12
finish = 1.1*10**12
y = np.linspace(start , finish)
pd.DataFrame(y).plot()
plt.tick_params(axis='x', labelsize=17)
plt.tick_params(axis='y', labelsize=17)
You will see that this result in the graph similar to above. No change in the scaling of the y-axis.
Ma
There are just so many features that you can control with the plotting capabilities of pandas, which leverages matplotlib. I found that seaborn is a lot easier to produce pretty charts, and you have a lot more control over the parameters of your plots.
This is not the most elegant solution, but it works; however, it has a seborn dependency:
%pylab inline
import pandas as pd
import seaborn as sns
import numpy as np
sns.set(style="darkgrid")
sns.set(font_scale=1.5)
start = 10**12
finish = 1.1*10**12
y = np.linspace(start , finish)
pd.DataFrame(y).plot()
plt.tick_params(axis='x', labelsize=17)
plt.tick_params(axis='y', labelsize=17)
I use Jupyter Notebook an that's why I use %pylab inline. The key element here is the use of
font_scale=1.5
Which you can set to whatver you want that produces your desired result. This is what I get:

Calling pylab.savefig without display in ipython

I need to create a figure in a file without displaying it within IPython notebook. I am not clear on the interaction between IPython and matplotlib.pylab in this regard. But, when I call pylab.savefig("test.png") the current figure get's displayed in addition to being saved in test.png. When automating the creation of a large set of plot files, this is often undesirable. Or in the situation that an intermediate file for external processing by another app is desired.
Not sure if this is a matplotlib or IPython notebook question.
This is a matplotlib question, and you can get around this by using a backend that doesn't display to the user, e.g. 'Agg':
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
plt.plot([1,2,3])
plt.savefig('/tmp/test.png')
EDIT: If you don't want to lose the ability to display plots, turn off Interactive Mode, and only call plt.show() when you are ready to display the plots:
import matplotlib.pyplot as plt
# Turn interactive plotting off
plt.ioff()
# Create a new figure, plot into it, then close it so it never gets displayed
fig = plt.figure()
plt.plot([1,2,3])
plt.savefig('/tmp/test0.png')
plt.close(fig)
# Create a new figure, plot into it, then don't close it so it does get displayed
plt.figure()
plt.plot([1,3,2])
plt.savefig('/tmp/test1.png')
# Display all "open" (non-closed) figures
plt.show()
We don't need to plt.ioff() or plt.show() (if we use %matplotlib inline). You can test above code without plt.ioff(). plt.close() has the essential role. Try this one:
%matplotlib inline
import pylab as plt
# It doesn't matter you add line below. You can even replace it by 'plt.ion()', but you will see no changes.
## plt.ioff()
# Create a new figure, plot into it, then close it so it never gets displayed
fig = plt.figure()
plt.plot([1,2,3])
plt.savefig('test0.png')
plt.close(fig)
# Create a new figure, plot into it, then don't close it so it does get displayed
fig2 = plt.figure()
plt.plot([1,3,2])
plt.savefig('test1.png')
If you run this code in iPython, it will display a second plot, and if you add plt.close(fig2) to the end of it, you will see nothing.
In conclusion, if you close figure by plt.close(fig), it won't be displayed.