VSCODE: jupyter adding interactive matplotlib plot %matplotlib widget not working interactively - matplotlib

The following example doesn't work in VS Code. It does work with %matplotlib notebook in a Jupyter notebook in a web browser, though.
# creating 3d plot using matplotlib
# in python
# for creating a responsive plot
# use %matplotlib widget in VSCODE
#%matplotlib widget
%matplotlib notebook
# importing required libraries
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
# creating random dataset
xs = [14, 24, 43, 47, 54, 66, 74, 89, 12,
44, 1, 2, 3, 4, 5, 9, 8, 7, 6, 5]
ys = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 6, 3,
5, 2, 4, 1, 8, 7, 0, 5]
zs = [9, 6, 3, 5, 2, 4, 1, 8, 7, 0, 1, 2,
3, 4, 5, 6, 7, 8, 9, 0]
# creating figure
fig = plt.figure()
ax = Axes3D(fig)
# creating the plot
plot_geeks = ax.scatter(xs, ys, zs, color='green')
# setting title and labels
ax.set_title("3D plot")
ax.set_xlabel('x-axis')
ax.set_ylabel('y-axis')
ax.set_zlabel('z-axis')
# displaying the plot
plt.show()
The expected result is a plot that can be manipulated interactively, e.g. rotated with the mouse pointer.
In VS Code one can click on the </> icon to choose a renderer. I chose the JupyterIPWidget Renderer. The other renderers show the plot but don't allow interactive manipulation.
A warning also appears:
/var/folders/kc/5p61t70n0llbn05934gj4r_w0000gn/T/ipykernel_22590/1606073246.py:23:
MatplotlibDeprecationWarning: Axes3D(fig) adding itself to the figure is
deprecated since 3.4. Pass the keyword argument auto_add_to_figure=False
and use fig.add_axes(ax) to suppress this warning. The default value of
auto_add_to_figure will change to False in mpl3.5 and True values will
no longer work in 3.6. This is consistent with other Axes classes.
ax = Axes3D(fig)

This behavior is expected: %matplotlib notebook is not supported in VS Code. Use %matplotlib widget instead. See https://github.com/microsoft/vscode-jupyter/wiki/Using-%25matplotlib-widget-instead-of-%25matplotlib-notebook,tk,etc
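For reference, here is a minimal sketch of the question's cell adapted along those lines. It assumes the ipympl package (which provides the widget backend) is installed in the kernel's environment. It also creates the 3D axes with fig.add_subplot(projection="3d") instead of Axes3D(fig), which should avoid the deprecation warning quoted in the question. The magic is shown as a comment so the snippet also runs as a plain script; in a notebook it would be the first line of the cell.

```python
# In a VS Code notebook cell, uncomment the magic below
# (assumption: ipympl is installed in the kernel's environment).
# %matplotlib widget
import matplotlib.pyplot as plt

# Same data as in the question
xs = [14, 24, 43, 47, 54, 66, 74, 89, 12,
      44, 1, 2, 3, 4, 5, 9, 8, 7, 6, 5]
ys = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 6, 3,
      5, 2, 4, 1, 8, 7, 0, 5]
zs = [9, 6, 3, 5, 2, 4, 1, 8, 7, 0, 1, 2,
      3, 4, 5, 6, 7, 8, 9, 0]

fig = plt.figure()
# Let the figure create the 3D axes instead of calling Axes3D(fig)
ax = fig.add_subplot(projection="3d")
ax.scatter(xs, ys, zs, color="green")

ax.set_title("3D plot")
ax.set_xlabel("x-axis")
ax.set_ylabel("y-axis")
ax.set_zlabel("z-axis")
# plt.show() can be added here when running outside a notebook.
```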

Related

How can I create a legend for my scatter plot which matches the colours used in the plot?

I've created a scatter plot (actually two similar subplots) using matplotlib.pyplot which I'm using for stylometric text analysis. The code I'm using to make the plot is as follows:
import matplotlib.pyplot as plt
import numpy as np
clusters = 4
two_d_matrix = np.array([[0.00617068, -0.53451777], [-0.01837677, -0.47131886], ...])
my_labels = [0, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
fig, (plot1, plot2) = plt.subplots(1, 2, sharex=False, sharey=False, figsize=(20, 10))
plot1.axhline(0, color='#afafaf')
plot1.axvline(0, color='#afafaf')
for i in range(clusters):
    try:
        plot1.scatter(two_d_matrix[i:, 0], two_d_matrix[i:, 1], s=30, c=my_labels, cmap='viridis')
    except (KeyError, ValueError) as e:
        pass
plot1.legend(my_labels)
plot1.set_title("My First Plot")
plot2.axhline(0, color='#afafaf')
plot2.axvline(0, color='#afafaf')
for i in range(clusters):
    try:
        plot2.scatter(two_d_matrix[i:, 0], two_d_matrix[i:, 1], s=30, c=my_labels, cmap='viridis')
    except (KeyError, ValueError) as e:
        pass
plot2.legend(my_labels)
plot2.set_title("My Second Plot")
plt.show()
Because there are four distinct values in my_labels, four colours appear on the plot; these should correspond to the four clusters I expected to find.
The problem is that the legend only has three values, corresponding to the first three values in my_labels. It also appears that the legend isn't displaying a key for each colour, but for each of the axes and then for one of the colours. This means that the colours appearing in the plot are not matched to what appears in the legend, so the legend is inaccurate. I have no idea why this is happening.
Ideally, the legend should display one colour for each unique value in my_labels, so it should look like this:
How can I get the legend to accurately display all the values it should be showing, i.e. one for each colour which appears in the plot?
Before calling plot1.legend or plot2.legend, you can pass label=None to plot1.axhline and plot1.axvline (and similarly to plot2.axhline and plot2.axvline). This keeps those lines unlabelled, so they don't interfere with the legend entries for the scatter points.
To get a legend entry for every category of scatter points, call plot1.scatter or plot2.scatter once per label, passing that label and selecting only the rows of two_d_matrix whose positions match the positions of that label in my_labels.
You can do it as follows:
import matplotlib.pyplot as plt
import numpy as np
# Generate some (pseudo) random data which is reproducible
generator = np.random.default_rng(seed=121)
matrix = generator.uniform(size=(40, 2))
matrix = np.sort(matrix)
clusters = 4
my_labels = np.array([0, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
fig, ax = plt.subplots(1, 1)
# Select data points wisely
for i in range(clusters):
    pos = np.where(my_labels == i)
    ax.scatter(matrix[pos, 0], matrix[pos, 1], s=30, cmap='viridis', label=i)
ax.axhline(0, color='#afafaf', label=None)
ax.axvline(0, color='#afafaf', label=None)
ax.legend()
ax.set_title("Expected output")
plt.show()
This gives:
Comparison of current output and expected output
Observe how the selection of data points (done inside the for loops in the code below) affects the output:
Code:
import matplotlib.pyplot as plt
import numpy as np
# Generate some (pseudo) random data which is reproducible
generator = np.random.default_rng(seed=121)
matrix = generator.uniform(size=(40, 2))
matrix = np.sort(matrix)
clusters = 4
my_labels = np.array([0, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
fig, ax = plt.subplots(1, 2)
# Question plot
for i in range(clusters):
    ax[0].scatter(matrix[i:, 0], matrix[i:, 1], s=30, cmap='viridis', label=i)
ax[0].axhline(0, color='#afafaf', label=None)
ax[0].axvline(0, color='#afafaf', label=None)
ax[0].legend()
ax[0].set_title("Current output (with label = None)")
# Answer plot
for i in range(clusters):
    pos = np.where(my_labels == i) # <- choose index of data points based on label position in my_labels
    ax[1].scatter(matrix[pos, 0], matrix[pos, 1], s=30, cmap='viridis', label=i)
ax[1].axhline(0, color='#afafaf', label=None)
ax[1].axvline(0, color='#afafaf', label=None)
ax[1].legend()
ax[1].set_title("Expected output")
plt.show()
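As an alternative that the answer above does not use, a single scatter call coloured by label can build its own legend through the legend_elements() method of the returned collection (available in matplotlib 3.1 and later, as far as I know). The sketch below uses made-up random data and a label array with the same composition as in the question; the names points and labels are just for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Reproducible stand-in data (assumption: 40 points, same label counts as the question)
rng = np.random.default_rng(seed=121)
points = rng.uniform(size=(40, 2))
labels = np.array([0, 1] + [2] * 5 + [3] * 33)

fig, ax = plt.subplots()
# One scatter call; the colour of each point comes from its label
sc = ax.scatter(points[:, 0], points[:, 1], s=30, c=labels, cmap="viridis")

# Ask the collection for one legend handle per distinct label value
handles, legend_labels = sc.legend_elements()
ax.legend(handles, legend_labels, title="Cluster")
ax.set_title("Legend built from legend_elements()")
# plt.show() can be added here when running as a script.
```

This keeps the colour mapping and the legend in sync automatically, at the cost of less control over the individual legend labels.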

How to reverse colorbar values in matplotlib?

I am using the cbar.ax.tick_params matplotlib command to make a colorbar for an XY scatterplot. How do I reverse the values (not the color ramp) so that the lowest value is at the top of the bar? This is to represent geological data where the youngest rocks lie on top of the older rocks. Here the age is represented by color.
Here is my code:
plt.scatter(summary["d18O"], summary["eHf"], s=150, c = color, cmap = color_map, edgecolors='black', marker='o')
plt.errorbar(summary["d18O"], summary["eHf"], summary["xerr"], summary["yerr"], ls='none', color='lightgrey', zorder=-1)
cbar=plt.colorbar()
cbar.ax.tick_params(labelsize=14)
cbar.minorticks_on()
cbar.set_label('Age (Ma)', style='italic', fontsize=16)
plt.axvline(x=5.3, color='black', zorder=-1)
plt.axhline(y=0, color='black', zorder=-1)
plt.tick_params(labelsize=14)
ax.set_xticks([4, 5, 6, 7, 8, 9, 10, 11, 12, 13])
ax.set_yticks([-6, -4, -2, 0, 2, 4, 6, 8, 10, 12, 14, 16])
plt.ylabel(r'${\epsilon}$Hf$_{T}$', style='italic', fontsize=18)
plt.xlabel(r'$\delta^{18}$O$_{V-SMOW}$ ‰', style='italic', fontsize=18)
plt.text(11.5, 0.3, 'CHUR', fontsize=18)
plt.text(4.9, 5, 'mantle zircon = 5.3‰', fontsize=16, rotation=90)
plt.show()
As @r-beginners mentioned,
cbar.ax.invert_yaxis()
would solve the problem if cbar is your colorbar object.
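A small self-contained sketch of that call, using made-up data in place of the summary table from the question (the arrays d18o, ehf and age are hypothetical stand-ins):

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in data for the question's summary table
rng = np.random.default_rng(0)
d18o = rng.uniform(4, 13, size=30)
ehf = rng.uniform(-6, 16, size=30)
age = rng.uniform(0, 500, size=30)

fig, ax = plt.subplots()
sc = ax.scatter(d18o, ehf, s=150, c=age, cmap="viridis", edgecolors="black")

cbar = fig.colorbar(sc, ax=ax)
cbar.set_label("Age (Ma)", style="italic", fontsize=16)
# Flip the colorbar axis so the lowest age sits at the top of the bar
cbar.ax.invert_yaxis()
# plt.show() can be added here when running as a script.
```

Note that this flips the bar as a whole: each value keeps its colour, only the orientation changes.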

matplotlib histogram with equal bars width

I use a histogram to display a distribution. Everything works fine if the bins are uniformly spaced. But if the bin intervals differ, the bar widths differ accordingly (as expected). Is there a way to set the bar width independently of the bin size?
This is what I have:
This is what I am trying to draw:
from matplotlib import pyplot as plt
my_bins = [10, 20, 30, 40, 50, 120]
my_data = [5, 5, 6, 8, 9, 15, 25, 27, 33, 45, 46, 48, 49, 111, 113]
fig1 = plt.figure()
ax1 = fig1.add_subplot(121)
ax1.set_xticks(my_bins)
ax1.hist(my_data, my_bins, histtype='bar', rwidth=0.9)
fig1.show()
I cannot mark your question as a duplicate, but I think my answer to this question might be what you are looking for?
I'm not sure how you'll make sense of the result, but you can use numpy.histogram to calculate the height of your bars, then plot those directly against an arbitrary x-scale.
import numpy as np
from matplotlib import pyplot as plt
x = np.random.normal(loc=50, scale=200, size=(2000,))
bins = [0,1,10,20,30,40,50,75,100]
fig = plt.figure()
ax = fig.add_subplot(211)
ax.hist(x, bins=bins, edgecolor='k')
ax = fig.add_subplot(212)
h,e = np.histogram(x, bins=bins)
ax.bar(range(len(bins)-1),h, width=1, edgecolor='k')
EDIT: Here is a version with adjusted x-tick labels so that the correspondence is easier to see.
my_bins = [10, 20, 30, 40, 50, 120]
my_data = [5, 5, 6, 8, 9, 15, 25, 27, 33, 45, 46, 48, 49, 111, 113]
fig = plt.figure()
ax = fig.add_subplot(211)
ax.hist(my_data, bins=my_bins, edgecolor='k')
ax = fig.add_subplot(212)
h,e = np.histogram(my_data, bins=my_bins)
ax.bar(range(len(my_bins)-1),h, width=1, edgecolor='k')
ax.set_xticks(range(len(my_bins)-1))
ax.set_xticklabels(my_bins[:-1])
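If labelling each bar with only the lower bin edge is ambiguous, one possible variation (my own sketch, not part of the original answer) is to label every bar with its full bin range. It uses the same numpy.histogram plus bar approach; the label format "10-20" is just a choice made for illustration.

```python
import numpy as np
from matplotlib import pyplot as plt

my_bins = [10, 20, 30, 40, 50, 120]
my_data = [5, 5, 6, 8, 9, 15, 25, 27, 33, 45, 46, 48, 49, 111, 113]

# Count the values per bin; values below the first edge (here < 10) are ignored
counts, edges = np.histogram(my_data, bins=my_bins)

# One equally wide bar per bin, placed at integer positions
positions = np.arange(len(counts))
range_labels = [f"{lo}-{hi}" for lo, hi in zip(my_bins[:-1], my_bins[1:])]

fig, ax = plt.subplots()
ax.bar(positions, counts, width=0.9, edgecolor="k")
ax.set_xticks(positions)
ax.set_xticklabels(range_labels)
# plt.show() can be added here when running as a script.
```

Keep in mind that with equal-width bars the bar area no longer reflects the bin width, so the plot reads as a bar chart of counts rather than a true histogram.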

Pyplot sorting y-values automatically

I have a frequency analysis of words said in episodes of my favorite show. I'm making a plot.barh(s1e1_y, s1e1_x) but it's sorting by words instead of values.
The output of >>> s1e1_y is
['know', 'go', 'now', 'here', 'gonna', 'can', 'them', 'think', 'come', 'time', 'got', 'elliot', 'talk', 'out', 'night', 'been', 'then', 'need', 'world', "what's"]
and the output of >>> s1e1_x is
[42, 30, 26, 25, 24, 22, 20, 19, 19, 18, 18, 18, 17, 17, 15, 15, 14, 14, 13, 13]
When the plots are actually plotted, the graph's y axis ticks are sorted alphabetically even though the plotting list is unsorted...
s1e1_wordlist = []
s1e1_count = []
for word, count in s1e01:
    if word[:-1] not in excluded_words:
        s1e1_wordlist.append(word[:-1])
        s1e1_count.append(int(count))
s1e1_sorted = sorted(list(sorted(zip(s1e1_count, s1e1_wordlist))),
                     reverse=True)
s1e1_20 = []
for i in range(0,20):
    s1e1_20.append(s1e1_sorted[i])
s1e1_x = []
s1e1_y = []
for count, word in s1e1_20:
    s1e1_x.append(word)
    s1e1_y.append(count)
plot.figure(1, figsize=(20,20))
plot.subplot(341)
plot.title('Season1 : Episode 1')
plot.tick_params(axis='y',labelsize=8)
plot.barh(s1e1_x, s1e1_y)
From matplotlib 2.1 on, you can plot categorical variables. This allows calls such as plt.bar(["apple","cherry","banana"], [1,2,3]). However, in matplotlib 2.1 the output is sorted by category, hence alphabetically. This was considered a bug and was changed in matplotlib 2.2 (see this PR).
In matplotlib 2.2 the bar plot hence preserves the order.
In matplotlib 2.1, you would plot the data as numeric data, as in any version prior to 2.1. This means plotting the numbers against their index and setting the labels accordingly.
w = ['know', 'go', 'now', 'here', 'gonna', 'can', 'them', 'think', 'come',
'time', 'got', 'elliot', 'talk', 'out', 'night', 'been', 'then', 'need',
'world', "what's"]
n = [42, 30, 26, 25, 24, 22, 20, 19, 19, 18, 18, 18, 17, 17, 15, 15, 14, 14, 13, 13]
import matplotlib.pyplot as plt
import numpy as np
plt.barh(range(len(w)),n)
plt.yticks(range(len(w)),w)
plt.show()
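One thing to be aware of with the index-based approach: barh draws position 0 at the bottom, so a list sorted from most to least frequent appears upside down. A possible adjustment (a sketch of my own, using only the first five words from the question) is to invert the y axis:

```python
import matplotlib.pyplot as plt

# First five entries of the question's lists, already sorted by count
words = ["know", "go", "now", "here", "gonna"]
counts = [42, 30, 26, 25, 24]

positions = list(range(len(words)))

fig, ax = plt.subplots()
# Plot against numeric positions so matplotlib cannot reorder the categories
ax.barh(positions, counts)
ax.set_yticks(positions)
ax.set_yticklabels(words)
# Put position 0 (the most frequent word) at the top
ax.invert_yaxis()
# plt.show() can be added here when running as a script.
```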
OK, your example seems to contain a lot of code that isn't relevant to the problem as you've described it. Assuming you don't want the y axis sorted alphabetically, you can zip your two lists into a DataFrame and plot that, as follows:
import pandas as pd
df = pd.DataFrame(list(zip(s1e1_y,s1e1_x))).set_index(1)
df.plot.barh()
This then produces the following

Clarification about flatten function in Theano

In [http://deeplearning.net/tutorial/lenet.html#lenet] it says:
# This will generate a matrix of shape (batch_size, nkerns[1] * 4 * 4),
# or (500, 50 * 4 * 4) = (500, 800) with the default values.
layer2_input = layer1.output.flatten(2)
When I use the flatten function on a NumPy 3D array I get a 1D array, but here it says I get a matrix. How does flatten(2) work in Theano?
A similar example in NumPy produces a 1D array:
a = array([[[ 1,  2,  3],
            [ 4,  5,  6],
            [ 7,  8,  9]],
           [[10, 11, 12],
            [13, 14, 15],
            [16, 17, 18]],
           [[19, 20, 21],
            [22, 23, 24],
            [25, 26, 27]]])
a.flatten(2) = array([ 1, 10, 19,  4, 13, 22,  7, 16, 25,  2, 11, 20,  5, 14, 23,  8, 17,
                      26,  3, 12, 21,  6, 15, 24,  9, 18, 27])
NumPy doesn't support flattening only some dimensions, but Theano does.
So if a is a NumPy array, a.flatten(2) doesn't do what the tutorial code does. It runs without error in the question only because the 2 is passed as the order parameter. The output shown there is in column-major (Fortran) order rather than the default C order, and newer NumPy versions may reject a non-string order argument altogether.
Theano's flatten lets you specify the number of dimensions of the result. The documentation explains how it works.
Parameters:
    x (any TensorVariable (or compatible)) – variable to be flattened
    outdim (int) – the number of dimensions in the returned variable
Return type:
    variable with same dtype as x and outdim dimensions
Returns:
    variable with the same shape as x in the leading outdim-1 dimensions,
    but with all remaining dimensions of x collapsed into the last dimension.
For example, if we flatten a tensor of shape (2, 3, 4, 5) with
flatten(x, outdim=2), then we’ll have the same (2-1=1) leading
dimensions (2,), and the remaining dimensions are collapsed. So the
output in this example would have shape (2, 60).
A simple Theano demonstration:
import numpy
import theano
import theano.tensor as tt
def compile():
    # Symbolic 3D tensor whose trailing dimensions are collapsed by flatten(2)
    x = tt.tensor3()
    return theano.function([x], x.flatten(2))
def main():
    a = numpy.arange(2 * 3 * 4).reshape((2, 3, 4))
    f = compile()
    print(a.shape, f(a).shape)
main()
prints (under Python 3)
(2, 3, 4) (2, 12)
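The same collapse can be reproduced in plain NumPy with reshape. The helper below is a hypothetical illustration (flatten_keep_leading is not a NumPy or Theano function); it mimics the documented behaviour by keeping the leading outdim-1 dimensions and merging the rest.

```python
import numpy as np

def flatten_keep_leading(x, outdim):
    """Hypothetical NumPy analogue of Theano's flatten(x, outdim).

    Keeps the first outdim-1 dimensions of x and collapses all remaining
    dimensions into the last one.
    """
    if not 1 <= outdim <= x.ndim:
        raise ValueError("outdim must be between 1 and x.ndim")
    leading = x.shape[:outdim - 1]
    # -1 lets NumPy work out the size of the collapsed dimension
    return x.reshape(leading + (-1,))

a = np.arange(2 * 3 * 4 * 5).reshape((2, 3, 4, 5))
print(flatten_keep_leading(a, 2).shape)  # (2, 60), as in the documentation example
print(flatten_keep_leading(a, 1).shape)  # (120,), same as a.flatten()
```

For the common two-dimensional case this reduces to a.reshape(a.shape[0], -1).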