histogram2d heatmap manipulation - numpy

I created a heatmap from a scatter plot of CSV values using code I found in a different Stack Overflow thread: Generate a heatmap in MatPlotLib using a scatter data set
This works, but I'd like to edit the colours, smooth between bins, etc. I've read https://matplotlib.org/examples/color/colormaps_reference.html but my beginner level is preventing swift progress. Is my current code amenable to easy manipulation for interpolation between bins (smoothing), or at least a colour change, or do I need to create my heatmap in a different way to gain more control? (The heatmap will represent how often a space is used over time, based on the x, y values of a tracked item.)
Thanks, any help much appreciated.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
import csv
with open('myfile.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    y = []
    x = []
    for row in readCSV:
        x.append(float(row[0]))
        y.append(float(row[1]))
print(x, y)
heatmap, xedges, yedges = np.histogram2d(x,y,bins=20)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.clf()
plt.imshow(heatmap.T, extent=extent)
plt.show()
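A minimal sketch of how the same histogram2d output could be recoloured and smoothed, using only keyword arguments of imshow. The random data stands in for the CSV columns, and the choices of "hot" and "bicubic" are illustrative assumptions rather than recommendations; any colormap name from the reference page and any interpolation mode that imshow supports could be substituted.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic tracking data stands in for the x, y columns read from the CSV.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 5000)
y = rng.normal(0.0, 1.0, 5000)

heatmap, xedges, yedges = np.histogram2d(x, y, bins=20)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]

fig, ax = plt.subplots()
im = ax.imshow(
    heatmap.T,
    extent=extent,
    origin="lower",            # draw the first y bin at the bottom
    cmap="hot",                # colour change: any registered colormap name
    interpolation="bicubic",   # smoothing: blend between neighbouring bins
    aspect="auto",
)
fig.colorbar(im, ax=ax, label="counts per bin")
```

Call plt.show() afterwards to display the figure. The interpolation only affects how the bins are drawn; the counts in heatmap are unchanged.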

Related

Data visualization using Matplotlib

With this code I can generate 20 data points along each curve, but I want to mark 25 data points on the line as downward-pointing triangles without changing arr_x=np.linspace(0.0,5.0,20) to arr_x=np.linspace(0.0,5.0,25).
Is it possible to mark additional data points on the y-axis without changing the x-axis?
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
def multi_curve_plot():
    # Write your functionality below
    fig = plt.figure(figsize=(13, 4))
    ax = fig.add_subplot(111)
    arr_x = np.linspace(0.0, 5.0, 20)
    arr_y1 = np.array(arr_x)
    arr_y2 = np.array(arr_x**2)
    arr_y3 = np.array(arr_x**3)
    ax.set(title="Linear, Quadratic, & Cubic Equations", xlabel="arr_X",
           ylabel="f(arr_X)")
    ax.plot(arr_x, arr_y1, label="y = arr_x", color="green", marker="v")
    ax.plot(arr_x, arr_y2, label="y = arr_x**2", color="blue", marker="s")
    ax.plot(arr_x, arr_y3, label="y = arr_x**3", color="red", marker="o")
    plt.legend()
    return fig  # the unreachable "return None" that followed has been dropped
multi_curve_plot()
I tried changing arr_x=np.linspace(0.0,5.0,20) to arr_x=np.linspace(0.0,5.0,25). But I want to show 25 data points on y axis without changing x-axis attributes.
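One possible approach, sketched under the assumption that the markers may be drawn as a separate artist: keep the 20-point arr_x for the line itself and evaluate the same function on a second, 25-point grid that is used only for the triangles. The name marker_x is hypothetical and not part of the original code.

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(13, 4))

arr_x = np.linspace(0.0, 5.0, 20)      # unchanged grid used for the line
marker_x = np.linspace(0.0, 5.0, 25)   # hypothetical grid used only for markers

# Line drawn from the original 20 points, without markers.
line, = ax.plot(arr_x, arr_x, color="green", label="y = arr_x")

# 25 downward triangles placed on the same function, with no connecting line.
markers, = ax.plot(marker_x, marker_x, linestyle="none", marker="v", color="green")

ax.set(title="Linear Equation", xlabel="arr_X", ylabel="f(arr_X)")
ax.legend()
```

For the quadratic and cubic curves, the marker positions would be marker_x**2 and marker_x**3. The markers then sit on the true function rather than on the straight segments between the 20 line points, which only coincide exactly for the linear curve.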

changing the size of subplots with matplotlib

I am trying to plot multiple RGB images with matplotlib.
The code I am using is:
import numpy as np
import matplotlib.pyplot as plt
for i in range(0, images):
    test = np.random.rand(1080, 720, 3)
    plt.subplot(images, 2, i+1)
    plt.imshow(test, interpolation='none')
The subplots appear tiny, though, like thumbnails.
How can I make them bigger?
I have seen solutions using the
fig, ax = plt.subplots()
syntax before, but not with plt.subplot.
plt.subplots creates a whole subplot grid at once, while plt.subplot adds one subplot at a time. So the difference is whether you want to set up your plot right away or fill it over time. Since it seems that you know beforehand how many images to plot, I would also recommend going with subplots.
Also notice that the way you use plt.subplot reserves a grid of images-by-2 cells but fills only half of them, which is another reason the subplots are so small.
import numpy as np
import matplotlib.pyplot as plt
images = 4
fig, axes = plt.subplots(images, 1,  # Puts subplots in the axes variable
                         figsize=(4, 10),  # Use figsize to set the size of the whole plot
                         dpi=200,  # Further refine size with dpi setting
                         tight_layout=True)  # Makes enough room between plots for labels
for i, ax in enumerate(axes):
    y = np.random.randn(512, 512)
    ax.imshow(y)
    ax.set_title(str(i), fontweight='bold')
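If staying with plt.subplot is preferred, a rough equivalent is to create the figure first with an explicit figsize and use a single-column grid so no cells are left empty. This is only a sketch: the figure size of 6 by 4 inches per image and the reduced image dimensions are assumptions chosen to keep the example quick.

```python
import numpy as np
import matplotlib.pyplot as plt

images = 4  # number of images, assumed to be known in advance

# Create the figure explicitly so its size (in inches) can be set up front.
fig = plt.figure(figsize=(6, 4 * images))

for i in range(images):
    # Smaller than the 1080x720 images in the question, to keep this fast.
    test = np.random.rand(108, 72, 3)
    plt.subplot(images, 1, i + 1)   # one column: every grid cell is used
    plt.imshow(test, interpolation="none")

fig.tight_layout()
```

Call plt.show() to display the result.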

Customize the axis label in seaborn jointplot

I seem to be stuck on a relatively simple problem and couldn't fix it after searching for the last hour and a lot of experimenting.
I have two numpy arrays x and y and I am using seaborn's jointplot to plot them:
sns.jointplot(x, y)
Now I want to label the x-axis and y-axis as "X-axis label" and "Y-axis label" respectively. If I use plt.xlabel, the labels go to the marginal distribution. How can I make them appear on the joint axes?
sns.jointplot returns a JointGrid object, which gives you access to the matplotlib axes and you can then manipulate from there.
import seaborn as sns
import numpy as np
# example data
X = np.random.randn(1000,)
Y = 0.2 * np.random.randn(1000) + 0.5
h = sns.jointplot(x=X, y=Y)  # x and y must be passed by keyword in recent seaborn releases
# JointGrid has a convenience function
h.set_axis_labels('x', 'y', fontsize=16)
# or set labels via the axes objects
h.ax_joint.set_xlabel('new x label', fontweight='bold')
# also possible to manipulate the histogram plots this way, e.g.
h.ax_marg_y.grid(True) # with ugly consequences...
# labels appear outside of plot area, so auto-adjust
h.figure.tight_layout()
(The problem with your attempt is that functions such as plt.xlabel("text") operate on the current axis, which is not the central one in sns.jointplot; but the object-oriented interface is more specific as to what it will operate on).
Note that the last command uses the figure attribute of the JointGrid. The initial version of this answer used the simpler - but not object-oriented - approach via the matplotlib.pyplot interface.
To use the pyplot interface:
import matplotlib.pyplot as plt
plt.tight_layout()
Alternatively, you can specify the axes labels in a pandas DataFrame in the call to jointplot.
import pandas as pd
import seaborn as sns
x = ...
y = ...
data = pd.DataFrame({
    'X-axis label': x,
    'Y-axis label': y,
})
sns.jointplot(x='X-axis label', y='Y-axis label', data=data)

Joining the points in a scatter plot

I’ve a scatter plot which almost looks like a circle. I would like to join the outer points with a line to show that almost circle like shape. Is there a way to do that in matplotlib?
You can use ConvexHull from scipy.spatial to find the outer points of your scatter plot and then connect these points using a PolyCollection from matplotlib.collections:
from matplotlib import pyplot as plt
import numpy as np
from scipy.spatial import ConvexHull
from matplotlib.collections import PolyCollection
fig, ax = plt.subplots()
length = 1000
#using some normally distributed data as example:
x = np.random.normal(0, 1, length)
y = np.random.normal(0, 1, length)
points = np.concatenate([x,y]).reshape((2,length)).T
hull = ConvexHull(points)
ax.scatter(x,y)
ax.add_collection(PolyCollection(
    [points[hull.vertices,:]],
    edgecolors='r',
    facecolors='w',
    linewidths=2,
    zorder=-1,
))
plt.show()
The result looks like this:
EDIT
Actually, you can skip the PolyCollection and just do a simple line plot using the hull vertices. You only have to close the line by appending the first vertex to the list of vertices (making that list one element longer):
circular_hull_verts = np.append(hull.vertices,hull.vertices[0])
ax.plot(
    x[circular_hull_verts], y[circular_hull_verts], 'r-', lw=2, zorder=-1,
)
EDIT 2:
I noticed that there is an example in the scipy documentation that looks quite similar to mine.
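For convenience, here is a self-contained sketch of the line-plot variant described in the first edit, using random normally distributed points as stand-in data. Note that a convex hull only follows the outermost points, so it would not trace any inward dents in the shape.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import ConvexHull

# Stand-in data: 1000 normally distributed points, one row per point.
rng = np.random.default_rng(1)
points = rng.normal(0.0, 1.0, size=(1000, 2))

hull = ConvexHull(points)

# Repeat the first vertex at the end so the outline is a closed loop.
closed = np.append(hull.vertices, hull.vertices[0])

fig, ax = plt.subplots()
ax.scatter(points[:, 0], points[:, 1], s=5)
outline, = ax.plot(points[closed, 0], points[closed, 1], "r-", lw=2)
```

Call plt.show() to display the scatter plot with its outline.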

Tick labels displaying outside axis limits

Is there a way to automatically not display tick mark labels if they would protrude past the axis itself? For example, consider the following code
#!/usr/bin/python
import pylab as P, numpy as N, math as M
xvals=N.arange(-10,10,0.1)
yvals=[ M.sin(x) for x in xvals ]
P.plot( xvals, yvals )
P.show()
See how the -10 and 10 labels on the x-axis poke out to the left and right of the plot? And similar for the -1.0 and 1.0 labels on the y-axis. Can I automatically suppress plotting these but retain the ones that do not go outside the plot limits?
I think you could just format the axis ticks yourself and then prune the ones that are hanging over. The recommended way to deal with setting up the axis is to use the ticker API. So for example:
from matplotlib.ticker import MaxNLocator
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111)
xvals=np.arange(-10,10,0.1)
yvals = np.sin(xvals)  # np.sin is vectorized, so no Python loop is needed
ax.plot( xvals, yvals )
ax.xaxis.set_major_locator(MaxNLocator(prune='both'))
plt.show()
Here we are creating a figure and axes, plotting the data, and then setting the x-axis major tick locator. The locator MaxNLocator is given the argument prune='both', which removes the lowest and highest ticks and is described in the matplotlib.ticker documentation.
This is not exactly what you were asking for, but maybe it will solve your problem.
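As a small extension, the same locator could be applied to the y-axis as well, since the question mentions the -1.0 and 1.0 labels too. The sketch below also compares a pruned and an unpruned locator on a fixed range to show what prune='both' does; the range of -10 to 10 is only an illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

fig, ax = plt.subplots()
xvals = np.arange(-10, 10, 0.1)
ax.plot(xvals, np.sin(xvals))

# Prune the outermost tick on both ends of each axis.
ax.xaxis.set_major_locator(MaxNLocator(prune="both"))
ax.yaxis.set_major_locator(MaxNLocator(prune="both"))

# Compare the tick positions a locator proposes with and without pruning.
unpruned = MaxNLocator().tick_values(-10, 10)
pruned = MaxNLocator(prune="both").tick_values(-10, 10)
print("unpruned:", unpruned)
print("pruned:  ", pruned)
```

The pruned array is the unpruned one with its first and last entries removed. Pruning is unconditional: it drops the end ticks whether or not their labels would actually overhang the axes.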