Matplotlib and numpy histogram2d axis issue - numpy

I'm struggling to get the axis right:
I've got the x and y values, and want to plot them in a 2d histogram (to examine correlation). Why do I get a histogram with limits from 0-9 on each axis? How do I get it to show the actual value ranges?
This is a minimal example and I would expect to see the red "star" at (3, 3):
import numpy as np
import matplotlib.pyplot as plt
x = (1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 3)
y = (1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 3)
xedges = range(5)
yedges = range(5)
H, xedges, yedges = np.histogram2d(y, x)
im = plt.imshow(H, origin='lower')
plt.show()

I think the problem is twofold:
Firstly, you should use 5 bins in your histogram (the default is 10). Because y is passed first, the first set of returned edges belongs to y:
H, yedges, xedges = np.histogram2d(y, x, bins=5)
Secondly, to set the axis values, you can use the extent parameter of imshow, as in the example in the histogram2d documentation:
im = plt.imshow(H, interpolation=None, origin='lower',
                extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]])
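As a variation on the same idea, here is a minimal sketch that passes explicit bin edges instead of a bin count. It assumes the data are the integers 1 to 5, so that edges at the half-integers put each value in the middle of its own bin; the `edges` array is an illustrative choice and is not part of the original answer.

```python
import numpy as np

x = (1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 3)
y = (1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 3)

# Assumption: integer data in 1..5, so half-integer edges center each value.
edges = np.arange(0.5, 6.0, 1.0)  # 0.5, 1.5, ..., 5.5 -> 5 bins

# The first argument is binned along the first (row) axis of H,
# so the first set of returned edges belongs to y.
H, yedges, xedges = np.histogram2d(y, x, bins=(edges, edges))

# extent is [left, right, bottom, top] in data coordinates.
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]

# To draw it: plt.imshow(H, origin='lower', extent=extent)
print(H[2, 2])  # count in the bin centered on (3, 3)
print(extent)
```

With these edges the bin centered on (3, 3) holds the three coincident points, and the image spans 0.5 to 5.5 on both axes.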

If I understand correctly, you just need to set interpolation='none'
import numpy as np
import matplotlib.pyplot as plt
x = (1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 3)
y = (1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 3)
xedges = range(5)
yedges = range(5)
H, xedges, yedges = np.histogram2d(y, x)
im = plt.imshow(H, origin='lower', interpolation='none')
plt.show()
Does that look right?

Related

Numpy.polyfit Not Returning Polynomial

I am trying to create a python program in which the user inputs a set of data and the program spits out an output in which it creates a graph with a line/polynomial which best fits the data.
This is the code:
from matplotlib import pyplot as plt
import numpy as np
x = []
y = []
x_num = 0
while True:
    sequence = int(input("Input 1 number in the sequence, type 9040321 to stop"))
    if sequence == 9040321:
        poly = np.polyfit(x, y, deg=2, rcond=None, full=False, w=None, cov=False)
        plt.plot(poly)
        plt.scatter(x, y, c="blue", label="data")
        plt.legend()
        plt.show()
        break
    else:
        y.append(sequence)
        x.append(x_num)
        x_num += 1
I ran the program and entered 1, 2, 4, 8, each as a separate input. Matplotlib plotted the points properly; however, with degree 2, the output was the following image:
This is clearly not correct, however I am unsure what the problem is. I think it has something to do with the degree, however when I change the degree to 3, it still does not fit. I am looking for a graph like y=sqrt(x) to go over each of the points and when that is not possible, create the line that fits the best.
Edit: I added a print(poly) feature and for the selected input above, it gives [0.75 0.05 1.05]. I do not know what to make of this.
Approximation by a second degree polynomial
np.polyfit returns the coefficients of a polynomial that approximates the given points, highest power first. To plot the polynomial as a smooth curve with matplotlib, you need to calculate many x,y pairs. Using np.linspace(start, stop, numsteps) for the xs, numpy's vectorization allows calculating all the corresponding ys in one go. E.g. ys = a * xs**2 + b * xs + c.
from matplotlib import pyplot as plt
import numpy as np
x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 2, 4, 8, 16, 32, 64]
plt.scatter(x, y, color='crimson', label='given points')
poly = np.polyfit(x, y, deg=2, rcond=None, full=False, w=None, cov=False)
xs = np.linspace(min(x), max(x), 100)
ys = poly[0] * xs ** 2 + poly[1] * xs + poly[2]
plt.plot(xs, ys, color='dodgerblue', label=f'$({poly[0]:.2f})x^2+({poly[1]:.2f})x + ({poly[2]:.2f})$')
plt.legend()
plt.show()
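The same evaluation can be written with np.polyval, which takes the coefficients in the order np.polyfit returns them. The sketch below uses the asker's four points (x = 0..3, y = 1, 2, 4, 8) and only checks the numbers; plotting xs against ys is left out.

```python
import numpy as np

x = [0, 1, 2, 3]
y = [1, 2, 4, 8]

# Coefficients come back highest power first: [a, b, c] for a*x**2 + b*x + c.
coeffs = np.polyfit(x, y, deg=2)

# Dense x values for a smooth curve.
xs = np.linspace(min(x), max(x), 100)

# Two equivalent ways to evaluate the fitted polynomial.
ys_manual = coeffs[0] * xs**2 + coeffs[1] * xs + coeffs[2]
ys_polyval = np.polyval(coeffs, xs)

print(np.round(coeffs, 2))
```

The rounded coefficients match the [0.75 0.05 1.05] reported in the question's edit, which shows that polyfit worked and that only the plotting step was wrong.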
Higher degree approximating polynomials
Given N points with distinct x values, a polynomial of degree N-1 passes exactly through all of them. Here is an example with 7 points and polynomials of up to degree 6:
from matplotlib import pyplot as plt
import numpy as np
x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 2, 4, 8, 16, 32, 64]
plt.scatter(x, y, color='black', zorder=3, label='given points')
for degree in range(0, len(x)):
    poly = np.polyfit(x, y, deg=degree, rcond=None, full=False, w=None, cov=False)
    xs = np.linspace(min(x) - 0.5, max(x) + 0.5, 100)
    ys = sum(poly_i * xs**i for i, poly_i in enumerate(poly[::-1]))
    plt.plot(xs, ys, label=f'degree {degree}')
plt.legend()
plt.show()
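To see numerically what the plot suggests, one can compute the residual sum of squares at the given points for each degree. This is a small sketch on the same 7 points; the `rss` list is just an illustrative name.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([1, 2, 4, 8, 16, 32, 64], dtype=float)

rss = []  # residual sum of squares, one entry per degree
for degree in range(len(x)):
    coeffs = np.polyfit(x, y, deg=degree)
    residuals = y - np.polyval(coeffs, x)
    rss.append(float(np.sum(residuals**2)))

for degree, value in enumerate(rss):
    print(f"degree {degree}: RSS = {value:.3g}")
```

The error shrinks as the degree grows and is zero up to rounding at degree 6, where the polynomial interpolates all 7 points.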
Another example
x = [0, 1, 2, 3, 4]
y = [1, 1, 6, 5, 5]
import numpy as np
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [1, 2, 4, 8]
coeffs = np.polyfit(x, y, 2)
print(coeffs)
poly = np.poly1d(coeffs)
print(poly)
x_cont = np.linspace(0, 4, 81)
y_cont = poly(x_cont)
plt.scatter(x, y)
plt.plot(x_cont, y_cont)
plt.grid(1)
plt.show()
Executing the code, you have the graph above and this is printed in the terminal:
[ 0.75 -1.45 1.75]
      2
0.75 x - 1.45 x + 1.75
It seems to me that you had false expectations about the output of polyfit.

Cubic spline interpolation produces straight lines

I would like to obtain a smooth curve going through specific points with integer coordinates. Instead of that I get straight line segments between the points. I tried interp1d(x,y,kind='cubic') and also CubicSpline, nothing works. Here is my code:
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import interp1d,CubicSpline
x = np.arange(34)
y = [8,3,0,1,6,2,1,7,6,2,0,2,6,0,1,6,2,2,0,2,7,0,2,8,6,3,6,2,0,1,6,2,7,2]
f = CubicSpline(x, y)
plt.figure(figsize=(10,3))
plt.plot(x, y, 'o', x, f(x))
plt.show()
and here is the result:
Can you tell me how to get smooth curves instead?
Right now you are using the original x-values to draw the curve, so matplotlib simply connects the data points with straight segments. You need a new array with many more intermediate x-values. Numpy's np.linspace() creates such an array between a given minimum and maximum.
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import interp1d, CubicSpline
y = [8, 3, 0, 1, 6, 2, 1, 7, 6, 2, 0, 2, 6, 0, 1, 6, 2, 2, 0, 2, 7, 0, 2, 8, 6, 3, 6, 2, 0, 1, 6, 2, 7, 2]
x = np.arange(len(y))
f = CubicSpline(x, y)
plt.figure(figsize=(10, 3))
xs = np.linspace(x.min(), x.max(), 500)
plt.plot(x, y, 'o', xs, f(xs))
plt.tight_layout()
plt.show()
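A quick numerical check makes the point without a plot. The sketch below uses only the first 8 values of the data (an arbitrary shortening for illustration): evaluated at the original x-values the spline just reproduces the data, while a dense grid gives the in-between values needed for a smooth curve.

```python
import numpy as np
from scipy.interpolate import CubicSpline

y = np.array([8, 3, 0, 1, 6, 2, 1, 7], dtype=float)
x = np.arange(len(y))
f = CubicSpline(x, y)

# At the knots the spline returns the data itself, so plotting f(x)
# against x draws the same polyline as plotting y against x.
at_knots = f(x)

# A dense grid samples the curve between the knots.
xs = np.linspace(x.min(), x.max(), 200)
dense = f(xs)

print(np.allclose(at_knots, y))
print(dense.shape)
```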

How to plot a tuple as x axis and a list on y axis

Suppose I have a df in the following form
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
              col1
(0, 0, 0, 0)     1
(0, 0, 0, 2)     2
(0, 0, 2, 2)     3
(0, 2, 2, 2)     4
I want to plot my index on the x axis and col1 on the y axis.
What I tried
plt.plot(list(df.index), df['col1'])
However, it generates a plot that is not what I am looking for.
If you give a list of 4-tuples as x for plt.plot(), they are interpreted as 4 line plots, one with the first elements from the tuples, one with the second elements, etc.
You can convert the tuples to strings to show them as such:
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({'y': [1, 2, 3, 4]}, index=[(0, 0, 0, 0), (0, 0, 0, 2), (0, 0, 2, 2), (0, 2, 2, 2)])
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(12, 3))
ax1.plot(list(df.index), df['y'])
ax2.plot([str(i) for i in df.index], df['y'])
plt.show()
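The reason for the four lines can be seen from the shape of the x data once it is converted to an array, which is what matplotlib does internally. A minimal sketch, using the same index values:

```python
import numpy as np

index = [(0, 0, 0, 0), (0, 0, 0, 2), (0, 0, 2, 2), (0, 2, 2, 2)]

# A list of 4-tuples becomes a 2D array; plot() treats each column
# of a 2D x as a separate line.
as_array = np.asarray(index)
print(as_array.shape)

# Converting each tuple to a string yields one categorical label per row.
labels = [str(t) for t in index]
print(labels)
```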

plotting 2 dictionaries in matplotlib

I have 2 dictionaries: dict1 = {'Beef':10, 'Poultry': 13, 'Pork': 14, 'Lamb': 11} and dict2 = {'Beef':3, 'Poultry': 1, 'Pork': 17, 'Lamb': 16}
I want to plot a double bar chart using the dictionary keys as the x-axis values, and the associated values on the y-axis. I am using matplotlib for this. Does anyone have any information?
This part of the matplotlib documentation may be what you are looking for. To plot your data, the x and y values need to be extracted from the dicts, for example via dict.keys() and dict.values().
import matplotlib.pyplot as plt
import numpy as np
dict1 = {'Beef':10, 'Poultry': 13, 'Pork': 14, 'Lamb': 11}
dict2 = {'Beef':3, 'Poultry': 1, 'Pork': 17, 'Lamb': 16}
x = list(dict1.keys())
y1 = [dict1[k] for k in x]
y2 = [dict2[k] for k in x]  # look up by key so both series share the same order
N = len(x)
fig, ax = plt.subplots()
ind = np.arange(N) # the x locations for the groups
width = 0.35 # the width of the bars
p1 = ax.bar(ind, y1, width)
p2 = ax.bar(ind + width, y2, width)
ax.set_xticks(ind + width / 2)
ax.set_xticklabels(x)
ax.legend((p1[0], p2[0]), ('dict1', 'dict2'))
plt.show()
Result:
I'd like to propose a more general approach: instead of just two dicts, what happens if we have a list of dictionaries?
In [89]: from random import randint, seed, shuffle
    ...: seed(20201213)
    ...: cats = 'a b c d e f g h i'.split() # categories
    ...: # List Of Dictionaries
    ...: lod = [{k:randint(5, 15) for k in shuffle(cats) or cats[:-2]} for _ in range(5)]
    ...: lod
Out[89]:
[{'d': 14, 'h': 10, 'i': 13, 'f': 13, 'c': 5, 'b': 5, 'a': 14},
{'h': 12, 'd': 5, 'c': 5, 'i': 11, 'b': 14, 'g': 8, 'e': 13},
{'d': 8, 'a': 12, 'f': 7, 'h': 10, 'g': 10, 'c': 11, 'i': 12},
{'g': 11, 'f': 8, 'i': 14, 'h': 11, 'a': 5, 'c': 7, 'b': 8},
{'e': 11, 'h': 13, 'c': 5, 'i': 8, 'd': 12, 'a': 11, 'g': 11}]
As you can see, the keys are not ordered in the same way and the dictionaries do not contain all the possible keys...
Our first step is to find a list of keys (lok), using a set comprehension, followed by sorting the keys (yes, we already know the keys, but here we are looking for a general solution…)
In [90]: lok = sorted(set(k for d in lod for k in d))
The lengths of the two lists are
In [91]: nk, nd = len(lok), len(lod)
At this point we can compute the width of a single bar. Placing the bar groups 1 unit apart (hence x = range(nk)) and leaving 1/3 of a unit between the groups, we have
In [92]: x, w = range(nk), 0.67/nd
We are ready to go with the plot
In [93]: import matplotlib.pyplot as plt
    ...: for n, d in enumerate(lod):
    ...:     plt.bar([ξ+n*w for ξ in x], [d.get(k, 0) for k in lok], w,
    ...:             label='dict %d'%(n+1))
    ...: plt.xticks([ξ+w*nd/2 for ξ in x], lok)
    ...: plt.legend();
Let's write a small function
def plot_lod(lod, ws=0.33, ax=None, legend=True):
    """bar plot from the values in a list of dictionaries.
    lod: list of dictionaries,
    ws: optional, white space between groups of bars as a fraction of unity,
    ax: optional, the Axes object to draw into,
    legend: are we going to draw a legend?
    Return: the Axes used to plot and a list of BarContainer objects."""
    from matplotlib.pyplot import subplot
    from numpy import arange, nan
    if ax is None : ax = subplot()
    lok = sorted({k for d in lod for k in d})
    nk, nd = len(lok), len(lod)
    x, w = arange(nk), (1.0-ws)/nd
    lobars = [
        ax.bar(x+n*w, [d.get(k, nan) for k in lok], w, label='%02d'%(n+1))
        for n, d in enumerate(lod)
    ]
    ax.set_xticks(x+w*nd/2-w/2)
    ax.set_xticklabels(lok)
    if legend : ax.legend()
    return ax, lobars
Using the data of the previous example, we get a slightly different graph…
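The layout arithmetic inside the function can be checked on its own, without drawing anything. The helper below is hypothetical (the name `bar_layout` and its return values are not part of the answer); it reproduces the bar width, the bar centers and the tick positions for a small case.

```python
def bar_layout(n_groups, n_series, ws=0.33):
    """Hypothetical helper mirroring the layout arithmetic of plot_lod.

    Returns the bar width, the bar centers for each series, and the
    tick position that sits in the middle of each group.
    """
    w = (1.0 - ws) / n_series
    # Bars are center-aligned: series n is shifted right by n bar widths.
    centers = [[g + n * w for g in range(n_groups)] for n in range(n_series)]
    # Middle of a group = midpoint between its first and last bar center.
    ticks = [g + w * n_series / 2 - w / 2 for g in range(n_groups)]
    return w, centers, ticks


w, centers, ticks = bar_layout(n_groups=3, n_series=2, ws=0.5)
print(w)        # width of one bar
print(centers)  # one list of centers per series
print(ticks)    # one tick per group
```

With two series and half of each unit left as white space, each bar is 0.25 wide and each tick lands halfway between the two bar centers of its group, which is what the `-w/2` term in the function achieves.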

How to flip y axis in a bar3d() plot?

I use bar3d() to plot a 3D bar chart, and I'd like to flip the y axis. I've tried invert_yaxis(), but it seems to have no effect. I've also tried manually reversing the values in the list with [::-1], but that didn't help either. The 3D bar chart is always displayed in exactly the same way.
Any idea how I can flip the y axis?
Here's an example of how it's not working for me (not even with 3D line plots):
from matplotlib.pyplot import *
from mpl_toolkits.mplot3d.axes3d import Axes3D
fig1 = figure(1)
ax11 = subplot(2, 2, 1, projection='3d')
ax11.plot([1, 2, 3, 4], [1, 2, 3, 4])
ax12 = subplot(2, 2, 2, projection='3d')
ax12.invert_xaxis()
ax12.plot([1, 2, 3, 4], [1, 2, 3, 4])
ax21 = subplot(2, 2, 3)
ax21.plot([1, 2, 3, 4])
ax22 = subplot(2, 2, 4)
ax22.invert_xaxis()
ax22.plot([1, 2, 3, 4])
show()
And the plot looks like this: http://we.tl/cqSsecVy6P
Thanks,
Daniel
If I understand the question correctly, I think the problem is that matplotlib rotates the 3D plot. To remedy this, just set the initial viewing angle using ax.view_init(elev, azim). Taking the matplotlib hist3d demo, we then have:
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x, y = np.random.rand(2, 100) * 4
hist, xedges, yedges = np.histogram2d(x, y, bins=4)
elements = (len(xedges) - 1) * (len(yedges) - 1)
xpos, ypos = np.meshgrid(xedges[:-1]+0.25, yedges[:-1]+0.25, indexing='ij')  # 'ij' matches the order of hist.flatten()
xpos = xpos.flatten()
ypos = ypos.flatten()
zpos = np.zeros(elements)
dx = 0.5 * np.ones_like(zpos)
dy = dx.copy()
dz = hist.flatten()
ax.bar3d(xpos, ypos, zpos, dx, dy, dz, color='b', zsort='average')
ax.view_init(ax.elev, ax.azim+90)
plt.show()
Here I have rotated the view by 90 degrees, which flips one of the axes but not the other.
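A side note on the bar positions in code like the above: np.histogram2d returns hist[i, j] with the x bins along the first axis, so the position grids have to be flattened in the same order as the counts. The sketch below uses four hand-picked points (an assumption made so the result is easy to check) and passes indexing='ij' to np.meshgrid to keep positions and heights aligned.

```python
import numpy as np

# Hand-picked points: two of them fall in x bin 0, y bin 1.
x = np.array([0.5, 0.5, 0.5, 3.5])
y = np.array([0.5, 1.5, 1.5, 3.5])

hist, xedges, yedges = np.histogram2d(x, y, bins=4, range=[[0, 4], [0, 4]])

# indexing='ij' makes xpos[i, j] = xedges[i] and ypos[i, j] = yedges[j],
# the same layout as hist[i, j].
xpos, ypos = np.meshgrid(xedges[:-1], yedges[:-1], indexing='ij')
xpos, ypos, dz = xpos.ravel(), ypos.ravel(), hist.ravel()

# Lower-left corner of the tallest bar.
k = int(np.argmax(dz))
print(xpos[k], ypos[k], dz[k])
```

The tallest bar sits at the bin whose lower-left corner is (0, 1), as expected from the data; with the default 'xy' indexing the two coordinates would come out swapped.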