matplotlib adding string to an axis - matplotlib

import matplotlib.pyplot as plt
import numpy as np
ydata = [55,60,65,70,75,80]
xdata = [1,2,3,4,5,6]
plt.plot(xdata, ydata)
set(plt.gca,'XTickLabel',{'Jan','Feb','Mar','April','May','June'})
plt.show()
I am using matplotlib and want text values to appear on the x axis.
I tried the following code, but it fails with this error message:
set(plt.gca,'XTickLabel',
{'Jan','Feb','Mar','April','May','June'})
TypeError: set expected at most 1 arguments, got 3
I am not sure what this is referring to; I used gca (get current axes) to set the values.

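set is Python's built-in constructor for the set data structure and has nothing to do with what you want here; the set(gca, ...) syntax comes from MATLAB. Also note that plt.gca is a function and would have to be called as plt.gca(). You only need ax.set_xticks and ax.set_xticklabels to ensure all of the labels show in the plot: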
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
ydata = [55,60,65,70,75,80]
xdata = [1,2,3,4,5,6]
ax.set_xticks(xdata)
ax.set_xticklabels(['Jan','Feb','Mar','April','May','June'])
ax.plot(xdata, ydata)
plt.show()
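The two calls can be wrapped in a small helper. This is only a sketch: the name plot_with_tick_labels is made up for illustration, and the Agg backend is selected purely so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt


def plot_with_tick_labels(xdata, ydata, labels):
    """Plot ydata against xdata and put one string label on each x position.

    Hypothetical helper for illustration; not part of matplotlib.
    """
    if len(labels) != len(xdata):
        raise ValueError("need exactly one label per x value")
    fig, ax = plt.subplots()
    ax.plot(xdata, ydata)
    ax.set_xticks(xdata)        # fix the tick positions first ...
    ax.set_xticklabels(labels)  # ... then attach the strings to them
    return fig, ax


months = ['Jan', 'Feb', 'Mar', 'April', 'May', 'June']
fig, ax = plot_with_tick_labels([1, 2, 3, 4, 5, 6],
                                [55, 60, 65, 70, 75, 80], months)
shown = [tick.get_text() for tick in ax.get_xticklabels()]
```

Setting the tick positions before the labels matters: the strings are assigned to the ticks in order, so the number of labels has to match the number of tick positions.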

Related

Cannot set ylim for sklearn partial dependence plot

I am following this example in the sklearn documentation.
I want to change the limits of y axis so I can visually compare results from different models.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.tree import DecisionTreeRegressor
from sklearn.inspection import PartialDependenceDisplay
diabetes = load_diabetes()
X = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
y = diabetes.target
tree = DecisionTreeRegressor()
tree.fit(X, y)
fig, ax = plt.subplots(figsize=(12, 6))
ax.set_ylim(50,300)
tree_disp = PartialDependenceDisplay.from_estimator(tree, X, ["age", "bmi"], ax=ax)
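However, ax.set_ylim seems to get ignored no matter what I specify. On the other hand, ax.set_title as given in the example works fine.
When a single axes is passed as ax, scikit-learn treats it as a bounding box and draws new axes inside it, so limits set on the original ax do not reach the plots. PartialDependenceDisplay has an axes_ attribute that holds the matplotlib axes of the individual plots, here one for each of the two features.
You can modify them as follows: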
tree_disp = PartialDependenceDisplay.from_estimator(tree, X, ["age", "bmi"], ax=ax)
tree_disp.axes_[0][0].set_ylim(50,300)
tree_disp.axes_[0][1].set_ylim(50,300)
This outputs a plot in which both panels use the y range from 50 to 300.
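If there are more than two features, setting each entry by hand gets tedious. A minimal sketch of a loop over the whole array follows; set_common_ylim is a hypothetical helper, and an ordinary plt.subplots grid stands in for tree_disp.axes_ so the example does not need scikit-learn. It assumes axes_ is a NumPy array of axes in which unused grid cells may be None.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np


def set_common_ylim(axes, ymin, ymax):
    """Apply the same y limits to every axes in a (possibly 2-D) array.

    Hypothetical helper; entries that are None are skipped.
    """
    for ax in np.ravel(axes):
        if ax is not None:
            ax.set_ylim(ymin, ymax)


# Stand-in for tree_disp.axes_: a 1x2 grid of ordinary matplotlib axes.
fig, axes = plt.subplots(1, 2, squeeze=False)
set_common_ylim(axes, 50, 300)
limits = [ax.get_ylim() for ax in axes.ravel()]
```

With the display object from the answer, the call would be set_common_ylim(tree_disp.axes_, 50, 300).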

Understanding plt.show() in Matplotlib

import numpy as np
import os.path
from skimage.io import imread
from skimage import data_dir
img = imread(os.path.join(data_dir, 'checker_bilevel.png'))
import matplotlib.pyplot as plt
#plt.imshow(img, cmap='Blues')
#plt.show()
imgT = img.T
plt.figure(1)
plt.imshow(imgT,cmap='Greys')
#plt.show()
imgR = img.reshape(20,5)
plt.figure(2)
plt.imshow(imgR,cmap='Blues')
plt.show(1)
I read that plt.figure() will create a new figure and assign it an ID if one is not given explicitly. So here I have given the two figures the IDs 1 and 2 respectively. Now I wish to see only one of the images.
I tried plt.show(1), expecting that ONLY the first image would be displayed, but both of them are.
What should I write to get only one?
plt.clf() will clear the current figure:
import matplotlib.pyplot as plt
plt.plot(range(10), 'r')
plt.clf()
plt.plot(range(12), 'g--')
plt.show()
plt.show will show all the figures created. The argument you passed is not a figure number; it is interpreted as the block flag, which controls whether the call blocks until the windows are closed. If you only want to show a particular figure, you can write a wrapper function:
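import matplotlib.pyplot as plt
# each entry is a (figure, axes) pair; the figures are numbered 1 to 5
figures = [plt.subplots() for i in range(5)]
def show(figNum, figures):
    # show only the figure with the given number, if it still exists
    if plt.fignum_exists(figNum):
        fig = [f[0] for f in figures if f[0].number == figNum][0]
        fig.show()
    else:
        print('figure not found')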
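Another option, if the other figures are no longer needed, is to close them before calling plt.show(). The sketch below assumes exactly that; keep_only is a hypothetical helper, and the Agg backend is used only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt


def keep_only(fig_num):
    """Close every open pyplot figure except the one numbered fig_num.

    Hypothetical helper; returns the figure numbers that remain open.
    """
    if not plt.fignum_exists(fig_num):
        raise ValueError("figure %r not found" % fig_num)
    for num in plt.get_fignums():
        if num != fig_num:
            plt.close(num)
    return plt.get_fignums()


plt.close("all")          # start from a clean slate
plt.figure(1)
plt.plot(range(10), 'r')
plt.figure(2)
plt.plot(range(12), 'g--')
remaining = keep_only(1)  # a following plt.show() would display figure 1 only
```

Unlike the wrapper above, this discards the closed figures, so it only fits when they do not have to be shown later.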

LogFormatter tickmarks scientific format limits

I'm trying to plot over a wide range with a log-scaled axis, but I want to show 10^{-1}, 10^0, 10^1 as just 0.1, 1, 10. ScalarFormatter will change everything to integers instead of scientific notation, but I'd like most of the tickmark labels to be scientific; I'm only wanting to change a few of the labels. So the MWE is
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure(figsize=[7,7])
ax1 = fig.add_subplot(111)
ax1.set_yscale('log')
ax1.set_xscale('log')
ax1.plot(np.logspace(-4,4), np.logspace(-4,4))
plt.show()
and I want the middle labels on each axis to read 0.1, 1, 10 instead of 10^{-1}, 10^0, 10^1
Thanks for any help!
When setting set_xscale('log'), you're using a LogFormatterSciNotation (not a ScalarFormatter). You may subclass LogFormatterSciNotation to return the desired values 0.1,1,10 if they happen to be marked as ticks.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogFormatterSciNotation
class CustomTicker(LogFormatterSciNotation):
    def __call__(self, x, pos=None):
        # keep scientific notation except for the ticks at 0.1, 1 and 10
        if x not in [0.1,1,10]:
            return LogFormatterSciNotation.__call__(self, x, pos=pos)
        else:
            return "{x:g}".format(x=x)
fig = plt.figure(figsize=[7,7])
ax = fig.add_subplot(111)
ax.set_yscale('log')
ax.set_xscale('log')
ax.plot(np.logspace(-4,4), np.logspace(-4,4))
ax.xaxis.set_major_formatter(CustomTicker())
plt.show()
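A slightly more defensive variant of the same idea is sketched below. PlainTicker and its plain parameter are made up for illustration: the values to print as plain numbers are configurable, and they are compared with a tolerance instead of exact float equality.

```python
import math
from matplotlib.ticker import LogFormatterSciNotation


class PlainTicker(LogFormatterSciNotation):
    """Scientific notation everywhere except at a chosen set of tick values.

    Hypothetical variant of CustomTicker; not part of matplotlib.
    """

    def __init__(self, plain=(0.1, 1, 10), **kwargs):
        super().__init__(**kwargs)
        self.plain = tuple(plain)

    def __call__(self, x, pos=None):
        # tolerance-based comparison instead of "x in [0.1, 1, 10]"
        if any(math.isclose(x, p, rel_tol=1e-9) for p in self.plain):
            return "{x:g}".format(x=x)
        return super().__call__(x, pos=pos)


formatter = PlainTicker()
plain_labels = [formatter(v) for v in (0.1, 1, 10)]
```

To change both axes, give each its own instance, for example ax.xaxis.set_major_formatter(PlainTicker()) and ax.yaxis.set_major_formatter(PlainTicker()).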
Update: With matplotlib 2.1 there is now a new option
Specify minimum value to format as scalar for LogFormatterMathtext
LogFormatterMathtext now includes the option to specify a minimum value exponent to format as a scalar (i.e., 0.001 instead of 10^{-3}).
This can be done as follows, by using the rcParams (plt.rcParams['axes.formatter.min_exponent'] = 2):
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['axes.formatter.min_exponent'] = 2
fig = plt.figure(figsize=[7,7])
ax = fig.add_subplot(111)
ax.set_yscale('log')
ax.set_xscale('log')
ax.plot(np.logspace(-4,4), np.logspace(-4,4))
plt.show()
This results in the same plot as above.
Note however that this limit is symmetric: it does not allow showing only 1 and 10 as plain numbers while keeping 0.1 in scientific notation. Hence the initial solution is more generic.
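Setting the rcParam globally affects every later plot. A minimal sketch of scoping it to one figure with plt.rc_context follows; render_loglog is a hypothetical helper. As far as I can tell the formatter reads the setting when the tick labels are drawn, so the sketch keeps the savefig call inside the context block.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np


def render_loglog(min_exponent):
    """Render a log-log plot to PNG bytes with a temporary min_exponent.

    Hypothetical helper; the rcParam is restored when the block exits.
    """
    with plt.rc_context({'axes.formatter.min_exponent': min_exponent}):
        fig, ax = plt.subplots(figsize=[7, 7])
        ax.set_xscale('log')
        ax.set_yscale('log')
        ax.plot(np.logspace(-4, 4), np.logspace(-4, 4))
        buffer = io.BytesIO()
        fig.savefig(buffer, format='png')  # draw while the setting is active
        plt.close(fig)
    return buffer.getvalue()


before = plt.rcParams['axes.formatter.min_exponent']
png = render_loglog(2)
after = plt.rcParams['axes.formatter.min_exponent']
```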

Annotate labels in pandas scatter plot

I saw this method in an older post but can't get the plot I want.
To start
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import string
df = pd.DataFrame({'x':np.random.rand(10),'y':np.random.rand(10)},
index=list(string.ascii_lowercase[:10]))
scatter plot
ax = df.plot('x','y', kind='scatter', s=50)
Then define a function to iterate the rows to annotate
def annotate_df(row):
    ax.annotate(row.name, row.values,
                xytext=(10,-5),
                textcoords='offset points',
                size=18,
                color='darkslategrey')
Last, apply it to get the annotations
ab= df.apply(annotate_df, axis=1)
Somehow I just get a series ab instead of the scatter plot I want. What is wrong? Thank you!
Your code works, you just need plt.show() at the end.
Your full code:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import string
df = pd.DataFrame({'x':np.random.rand(10),'y':np.random.rand(10)},
index=list(string.ascii_lowercase[:10]))
ax = df.plot('x','y', kind='scatter', s=50)
def annotate_df(row):
    ax.annotate(row.name, row.values,
                xytext=(10,-5),
                textcoords='offset points',
                size=18,
                color='darkslategrey')
ab= df.apply(annotate_df, axis=1)
plt.show()
It looks like this no longer works; however, the fix is easy: convert row.values from a numpy.ndarray to a list:
list(row.values)
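The same annotations can also be produced with a plain loop, which avoids the unused Series returned by apply and passes the coordinates as an explicit tuple. This is a sketch of one possible alternative, not the method from the original post; the Agg backend is used only so it runs without a display.

```python
import string
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': np.random.rand(10), 'y': np.random.rand(10)},
                  index=list(string.ascii_lowercase[:10]))

ax = df.plot('x', 'y', kind='scatter', s=50)

# itertuples(name=None) yields plain (index, x, y) tuples, one per row
for label, x, y in df[['x', 'y']].itertuples(index=True, name=None):
    ax.annotate(label, (float(x), float(y)),
                xytext=(10, -5),
                textcoords='offset points',
                size=18,
                color='darkslategrey')

annotated = [text.get_text() for text in ax.texts]
```

In an interactive session, a final plt.show() displays the annotated scatter plot.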

My pandas-generated subplots are laid out incorrectly

I ran the following code to get two plots next to each other (it is a minimal working example that you can copy):
import pandas as pd
import numpy as np
from matplotlib.pylab import plt
comp1 = np.random.normal(0,1,size=200)
values = pd.Series(comp1)
plt.close("all")
f = plt.figure()
plt.show()
sp1 = f.add_subplot(2,2,1)
values.hist(bins=100, alpha=0.5, color="r", normed=True)
sp2 = f.add_subplot(2,2,2)
values.plot(kind="kde")
Unfortunately, I then get the following image:
This is also an interesting layout, but I wanted the figures to be next to each other. What did I do wrong? How can I correct it?
For clarity, I could also use this:
import pandas as pd
import numpy as np
from matplotlib.pylab import plt
comp1 = np.random.normal(0,1,size=200)
values = pd.Series(comp1)
plt.close("all")
fig, axes = plt.subplots(2,2)
plt.show()
axes[0,0].hist(values, bins=100, alpha=0.5, color="r", normed=True) # Until here, it works. You get a half-finished correct image of what I was going for (though it is 2x2 here)
axes[0,1].plot(values, kind="kde") # This does not work
Unfortunately, in this approach axes[0,1] refers to the subplot, which has a plot method but does not know kind="kde". Please take into consideration that in the first version plot is executed on the pandas object, whereas in the second version plot is executed on the subplot, which does not work with the kind="kde" parameter.
Use the ax= argument to choose which subplot the pandas plot is drawn on:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
comp1 = np.random.normal(0,1,size=200)
values = pd.Series(comp1)
plt.close("all")
f = plt.figure()
sp1 = f.add_subplot(2,2,1)
# density=True replaces normed=True, which newer matplotlib versions no longer accept
values.hist(bins=100, alpha=0.5, color="r", density=True, ax=sp1)
sp2 = f.add_subplot(2,2,2)
values.plot(kind="kde", ax=sp2)
plt.show()
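The same ax= argument also works with axes created by plt.subplots, which covers the second version from the question. The sketch below is an illustration under a few assumptions: plot_side_by_side and its parameters are hypothetical, a 1x2 grid is used so the two plots sit next to each other, and the demo call swaps the KDE for a plain line plot so that it runs without SciPy.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_side_by_side(values, right_kind="kde", bins=20):
    """Histogram on the left, a second pandas plot on the right.

    Hypothetical helper. right_kind is passed to Series.plot; note that
    "kde" needs SciPy to be installed.
    """
    fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))
    values.hist(bins=bins, alpha=0.5, color="r", density=True, ax=left)
    values.plot(kind=right_kind, ax=right)
    return fig, left, right


values = pd.Series(np.random.normal(0, 1, size=200))
# "line" is used here only so the sketch has no SciPy dependency
fig, left, right = plot_side_by_side(values, right_kind="line")
```

Calling plot_side_by_side(values) with the default right_kind="kde" reproduces the density curve from the question. In both cases plot is still executed on the pandas object, which is why the kind argument is understood.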