Saving an imshow-like image while preserving resolution - numpy

I have an (n, m) array that I've been visualizing with matplotlib.pyplot.imshow. I'd like to save this data in some type of raster graphics file (e.g. a png) so that:
The colors are the ones shown with imshow
Each element of the underlying array is exactly one pixel in the saved image -- meaning that if the underlying array is (n, m) elements, the image is NxM pixels. (I'm not interested in interpolation='nearest' in imshow.)
There is nothing in the saved image except for the pixels corresponding to the data in the array. (I.e. there's no white space around the edges, axes, etc.)
How can I do this?
I've seen some code that can kind of do this by using interpolation='nearest' and forcing matplotlib to (grudgingly) turn off axes, whitespace, etc. However, there must be some way to do this more directly -- maybe with PIL? After all, I have the underlying data. If I can get an RGB value for each element of the underlying array, then I can save it with PIL. Is there some way to extract the RGB data from imshow? I can write my own code to map the array values to RGB values, but I don't want to reinvent the wheel, since that functionality already exists in matplotlib.

As you already guessed, there is no need to create a figure. You basically need three steps: normalize your data, apply the colormap, and save the image. matplotlib provides all the necessary functionality:
import numpy as np
import matplotlib.pyplot as plt
# some data (512x512); scipy.misc.lena() has been removed from SciPy,
# so a synthetic array stands in for the original test image
data = np.random.rand(512, 512)
# a colormap and a normalization instance
cmap = plt.cm.jet
norm = plt.Normalize(vmin=data.min(), vmax=data.max())
# map the normalized data to colors
# image is now RGBA (512x512x4)
image = cmap(norm(data))
# save the image
plt.imsave('test.png', image)
While the code above explains the single steps, you can also let imsave do all three steps (similar to imshow):
plt.imsave('test.png', data, cmap=cmap)
Result (test.png):
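Since the question mentions PIL, the colormapped array can also be handed to Pillow directly. This is a minimal sketch rather than part of the answer above: the random array, the file name pil_test.png, and the temporary directory are stand-ins, and bytes=True asks the colormap for uint8 RGBA values instead of floats.

```python
import os
import tempfile

import numpy as np
import matplotlib
matplotlib.use("Agg")  # no display needed
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
from PIL import Image

# Stand-in data: n=30 rows, m=40 columns
data = np.random.default_rng(0).random((30, 40))

cmap = plt.get_cmap("jet")
norm = Normalize(vmin=data.min(), vmax=data.max())

# bytes=True returns uint8 RGBA values, shape (30, 40, 4)
rgba = cmap(norm(data), bytes=True)

# Hypothetical output location
out_path = os.path.join(tempfile.mkdtemp(), "pil_test.png")
Image.fromarray(rgba).save(out_path)

# PIL reports size as (width, height), i.e. (m, n)
with Image.open(out_path) as im:
    saved_size = im.size
```

Reading the file back gives one pixel per array element, with no axes or padding, because no figure is involved at any point.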

Related

Plotting xarray.DataArray and Geopandas together - aspect ratio errors

I am trying to create two images side by side: one satellite image alone, and next to it, the same satellite image with outlines of agricultural fields. My raster data "raster_clip" is loaded into rioxarray (original satellite image from NAIP, converted from .sid to .tif), and my vector data "ag_clip" is in geopandas. My code is as follows:
fig, (ax1, ax2) = plt.subplots(ncols = 2, figsize=(14,8))
raster_clip.plot.imshow(ax=ax1)
raster_clip.plot.imshow(ax=ax2)
ag_clip.boundary.plot(ax=ax1, color="yellow")
I can't seem to figure out how to get the y axes in each plot to be the same. When the vector data is excluded, then the two plots end up the same shape and size.
I have tried the following:
Setting sharey=True in the subplots method. This doesn't affect the shape of the resulting images; it just removes the tick labels on the second image.
Setting aspect='equal' in the imshow method. This leads to the error below, which doesn't make sense because the 'aspect' kwarg is listed in the documentation for xarray.plot.imshow.
plt.imshow's 'aspect' kwarg is not available in xarray
Removing the "figsize" variable, doesn't affect the ratio of the two plots.
Not entirely related to your question, but I've used cartopy before to overlay a GeoDataFrame on a DataArray:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
# ds is the xarray.DataArray, gdf the GeoDataFrame
plt.figure(figsize=(16, 8))
ax = plt.subplot(projection=ccrs.PlateCarree())
ds.plot(ax=ax)
gdf.plot(ax=ax)
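One possible explanation for the mismatched panels, which this thread does not confirm, is that the vector overlay leaves its axes with an equal aspect while the image-only axes keeps the automatic one. The sketch below reproduces that situation with plain matplotlib and then copies the aspect and limits from one axes to the other. The random image and the yellow rectangle are stand-ins for raster_clip and ag_clip.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # no display needed
import matplotlib.pyplot as plt

# Placeholder raster, 60 rows by 100 columns
image = np.random.default_rng(0).random((60, 100))

fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(14, 8))
for ax in (ax1, ax2):
    ax.imshow(image, extent=(0, 100, 0, 60), aspect="auto")

# Stand-in for the vector overlay; assumed to leave ax1 with an equal aspect
ax1.plot([10, 90, 90, 10, 10], [10, 10, 50, 50, 10], color="yellow")
ax1.set_aspect("equal")

# Copy aspect and data limits so both panels are laid out the same way
ax2.set_aspect(ax1.get_aspect())
ax2.set_xlim(ax1.get_xlim())
ax2.set_ylim(ax1.get_ylim())

fig.canvas.draw()
```

If this assumption holds for the real data, calling set_aspect with the same value on both axes after all plotting calls should be enough.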

Adding a Rectangle Patch and Text Patch to 3D Collection in Matplotlib

Problem Statement
I'm attempting to add two patches -- a rectangle patch and a text patch -- to the same space within a 3D plot. The ultimate goal is to annotate the rectangle patch with a corresponding value (about 20 rectangles across 4 planes -- see Figure 3). The following code does not get all the way there, but does demonstrate a rendering issue where sometimes the text patch is completely visible and sometimes it isn't -- interestingly, if the string doesn't extend outside the rectangle patch, it never seems to become visible at all. The only difference between Figures 1 and 2 is the rotation of the plot viewer image. I've left the cmap code in the example below because it's a requirement of the project (and just in case it affects the outcome).
Things I've Tried
Reversing the order that the patches are drawn.
Applying zorder values -- I think art3d.pathpatch_2d_to_3d is overriding that.
Creating a patch collection -- I can't seem to find a way to add the rectangle patch and the text patch to the same 3D collection.
Conclusion
I suspect that setting zorder to each patch before adding them to a 3D collection may be the solution, but I can't seem to find a way to get to that outcome. Similar questions suggest this, but I haven't been able to apply their answers to this problem specifically.
Environment
macOS: Big Sur 11.2.3
Python 3.8
Matplotlib 3.3.4
Figure 1
Figure 2
Figure 3
The Code
Generates Figures 1 and 2 (not 3).
#! /usr/bin/env python3
# -*- coding: utf-8 -*-
from matplotlib.patches import Rectangle, PathPatch
from matplotlib.text import TextPath
from matplotlib.transforms import Affine2D
import mpl_toolkits.mplot3d.art3d as art3d
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
plt.style.use('dark_background')
fig = plt.figure()
ax = fig.gca(projection='3d')
cmap = plt.cm.bwr
norm = Normalize(vmin=50, vmax=80)
base_color = cmap(norm(50))
# Draw box
box = Rectangle((25, 25), width=50, height=50, color=cmap(norm(62)), ec='black', alpha=1)
ax.add_patch(box)
art3d.pathpatch_2d_to_3d(box, z=1, zdir="z")
# Draw text
text_path = TextPath((60, 50), "xxxx", size=10)
trans = Affine2D().rotate(0).translate(0, 1)
p1 = PathPatch(trans.transform_path(text_path))
ax.add_patch(p1)
art3d.pathpatch_2d_to_3d(p1, z=1, zdir="z")
ax.set_xlabel('x')
ax.set_xlim(0, 100)
ax.set_xticklabels([])
ax.xaxis.set_pane_color(base_color)
ax.set_ylabel('y')
ax.set_ylim(0, 100)
ax.set_yticklabels([])
ax.yaxis.set_pane_color(base_color)
ax.set_zlabel('z')
ax.set_zlim(1, 4)
ax.set_zticks([1, 2, 3, 4])
ax.zaxis.set_pane_color(base_color)
ax.set_zticklabels([])
plt.show()
This is a well-known problem with matplotlib 3D plotting: objects are drawn in a particular order, and those plotted last appear on "top" of the others, regardless of which should be in front in a "true" 3D plot.
See the FAQ here: https://matplotlib.org/mpl_toolkits/mplot3d/faq.html#my-3d-plot-doesn-t-look-right-at-certain-viewing-angles
My 3D plot doesn’t look right at certain viewing angles
This is probably the most commonly reported issue with mplot3d. The problem is that – from some viewing angles – a 3D object would appear in front of another object, even though it is physically behind it. This can result in plots that do not look “physically correct.”
Unfortunately, while some work is being done to reduce the occurrence of this artifact, it is currently an intractable problem, and can not be fully solved until matplotlib supports 3D graphics rendering at its core.
The problem occurs due to the reduction of 3D data down to 2D + z-order scalar. A single value represents the 3rd dimension for all parts of 3D objects in a collection. Therefore, when the bounding boxes of two collections intersect, it becomes possible for this artifact to occur. Furthermore, the intersection of two 3D objects (such as polygons or patches) can not be rendered properly in matplotlib’s 2D rendering engine.
This problem will likely not be solved until OpenGL support is added to all of the backends (patches are greatly welcomed). Until then, if you need complex 3D scenes, we recommend using MayaVi.
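One workaround to try, not taken from the FAQ, is to draw the label with the 3D axes' own text method instead of converting a TextPath patch, and to lift it slightly above the rectangle's plane. This is only a sketch: the 0.05 offset, the colours, and the label position are arbitrary choices, and whether the label stays visible from every viewing angle is not verified here.

```python
import matplotlib
matplotlib.use("Agg")  # no display needed
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import mpl_toolkits.mplot3d.art3d as art3d

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# Rectangle placed in the plane z = 1, as in the question
box = Rectangle((25, 25), width=50, height=50, fc="lightgray", ec="black")
ax.add_patch(box)
art3d.pathpatch_2d_to_3d(box, z=1, zdir="z")

# Label drawn as a 3D text artist, raised a little above the rectangle
label = ax.text(50, 50, 1.05, "xxxx", zdir="x",
                ha="center", va="center", color="red")

ax.set_xlim(0, 100)
ax.set_ylim(0, 100)
ax.set_zlim(1, 4)
fig.canvas.draw()
```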

Slicing the channels of image and storing the channels into numpy array(same size as image). Plotting the numpy array not giving the original image

I separated the 3 channels of a colour image. I created a new NumPy array of the same size as the image, and stored the 3 channels of the image in 3 slices of the 3D NumPy array. After plotting the NumPy array, the plotted image is not the same as the original image. Why is this happening?
Both the img and new_img arrays have the same elements, but the displayed image is different.
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
img=mpimg.imread('/storage/emulated/0/1sumint/kali5.jpg')
new_img=np.empty(img.shape)
new_img[:,:,0]=img[:,:,0]
new_img[:,:,1]=img[:,:,1]
new_img[:,:,2]=img[:,:,2]
plt.imshow(new_img)
plt.show()
I expected the same image as the original.
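The problem is that your new image is created with NumPy's default data type, float64, on this line:
new_img=np.empty(img.shape)
unless you specify a different dtype. imshow treats float RGB arrays as values in the range 0 to 1 and clips anything outside it, so the copied 0-255 values no longer render as the original colours.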
You can either (best) copy the original image's dtype like this:
new_img = np.empty(img.shape, dtype=img.dtype)
or use something like this:
new_img = np.zeros_like(img)
or (worst) specify one you happen to know matches your data, like this:
new_img = np.empty(img.shape, dtype=np.uint8)
I presume you have some reason for copying one channel at a time, but if not, you can avoid all the foregoing issues and just do:
new_img = np.copy(img)
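The dtype difference can be checked without an image file. The following is a small sketch using a random uint8 array as a stand-in for the JPEG loaded in the question:

```python
import numpy as np

# Stand-in for an 8-bit RGB image (4 x 5 pixels)
img = np.random.default_rng(0).integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# np.empty without dtype gives float64, as in the question
float_copy = np.empty(img.shape)
# Passing the source dtype keeps the array as uint8
typed_copy = np.empty(img.shape, dtype=img.dtype)

for channel in range(3):
    float_copy[:, :, channel] = img[:, :, channel]
    typed_copy[:, :, channel] = img[:, :, channel]

# Same numbers in both copies, but different dtypes
same_values = np.array_equal(float_copy, img)
print(float_copy.dtype, typed_copy.dtype, same_values)
```

Both copies hold identical values; only the dtype differs, and that is what changes how imshow interprets them.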

pandas.plot and pyplot.savefig create different sized PNGs for same figsize

When I call the same function that uses pandas.plot with the same figsize, I get different sized PNG files. The width is the same but the height in pixels changes. I suspect that the length of the x-axis labels changes the height. I have not yet tried directly calling the matplotlib functions.
I have also tried plt.rcParams['figure.figsize'] = (7,4). The problem does not appear to be in how figsize is set. My print_fig_info always produces the desired values.
import matplotlib.pyplot as plt
c = 0  # counter used to make output file names unique
# Primitive way that confirmed that the figure size does not change
def print_fig_info(label=""):
    print(label, str(plt.gcf().get_size_inches()))
def my_plot(df):
    global c
    print_fig_info("Before plot")
    df.plot(kind='bar', figsize=(7,4))
    print_fig_info("After plot")
    # want to make output files unique
    c += 1
    plt.savefig("output"+str(c), bbox_inches='tight', dpi='figure')
In your call to savefig you explicitly ask matplotlib, via bbox_inches='tight', to crop the saved figure to the smallest box that still fits all of its elements.
In other words, bbox_inches='tight' is designed specifically to resize the output to the minimum bounding box, so longer x-axis labels produce a taller image. matplotlib is doing exactly what it was asked to do.
Solution: Don't use bbox_inches='tight'.
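The effect can be seen by saving the same figure twice and comparing pixel sizes. This is a sketch under a few assumptions: it uses a plain matplotlib bar chart instead of pandas, fixes the figure dpi at 100, and saved_pixel_size is a hypothetical helper introduced only for this illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # no display needed
import matplotlib.pyplot as plt
from PIL import Image


def saved_pixel_size(fig, **savefig_kwargs):
    """Save fig as PNG in memory and return its (width, height) in pixels."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi="figure", **savefig_kwargs)
    buf.seek(0)
    with Image.open(buf) as im:
        return im.size


fig, ax = plt.subplots(figsize=(7, 4), dpi=100)
ax.bar(["a fairly long category label", "short"], [3, 5])
ax.tick_params(axis="x", labelrotation=90)

# Without bbox_inches the size is figsize * dpi
fixed_size = saved_pixel_size(fig)
# With bbox_inches='tight' the size follows the content
tight_size = saved_pixel_size(fig, bbox_inches="tight")

print("default:", fixed_size, "tight:", tight_size)
```

The default save is figsize times dpi, so 700 by 400 pixels here, while the tight size depends on how far the tick labels extend.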

Figures with lots of data points in matplotlib

I generated the attached image using matplotlib (png format). I would like to use eps or pdf, but I find that with all the data points, the figure is really slow to render on the screen. Other than just plotting less of the data, is there any way to optimize it so that it loads faster?
I think you have three options:
As you mentioned yourself, you can plot fewer points. For the plot you showed in your question I think it would be fine to only plot every other point.
As @tcaswell stated in his comment, you can use a line instead of points, which will be rendered more efficiently.
You could rasterize the blue dots. Matplotlib allows you to selectively rasterize single artists, so if you pass rasterized=True to the plotting command you will get a bitmapped version of the points in the output file. This will be way faster to load at the price of limited zooming due to the resolution of the bitmap. (Note that the axes and all the other elements of the plot will remain as vector graphics and font elements).
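The rasterizing option can be sketched as follows. The random data, the marker size, and the dpi value are assumptions made for illustration; the PDF is written to an in-memory buffer rather than to a file.

```python
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")  # no display needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(size=20000)
y = rng.uniform(size=20000)

fig, ax = plt.subplots()
# Only this artist is rasterized; axes, ticks and labels stay vector
points = ax.scatter(x, y, s=4, rasterized=True)

# dpi sets the resolution of the embedded bitmap
pdf_buffer = io.BytesIO()
fig.savefig(pdf_buffer, format="pdf", dpi=300)
```

A higher dpi gives a sharper bitmap at the cost of a larger file, so it is worth choosing it for the intended zoom level.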
First, if you want to show a "trend" in your plot, and the x, y arrays you are plotting are "huge", you could apply random sub-sampling and plot only a fraction of your data:
import numpy as np
import matplotlib.pyplot as plt
fraction = 0.50
x_resampled = []
y_resampled = []
# x and y are the full data arrays from the question
for k in range(len(x)):
    if np.random.rand() < fraction:
        x_resampled.append(x[k])
        y_resampled.append(y[k])
plt.scatter(x_resampled, y_resampled, s=6)
plt.show()
Second, have you considered using log-scale in the x-axis to increase visibility?
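For large arrays, the random sub-sampling step can also be written with a boolean mask instead of a Python loop. This is a sketch with random stand-in data, where x and y are assumed to be NumPy arrays of equal length:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the real data arrays
x = rng.normal(size=10000)
y = rng.normal(size=10000)

fraction = 0.50

# Keep each point independently with probability `fraction`
keep = rng.random(x.size) < fraction
x_resampled = x[keep]
y_resampled = y[keep]

print(len(x_resampled), "of", len(x), "points kept")
```

The same mask is applied to both arrays, so the x and y values stay paired.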
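The snippet below writes the plain vector version (rasterized=False, norm.pdf) as a baseline; switching the flag to True rasterizes only the scatter points, while the axes remain in vector format: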
import numpy as np
import matplotlib.pyplot as plt
x = np.random.uniform(size=400000)
y = np.random.uniform(size=400000)
plt.scatter(x, y, marker='x', rasterized=False)
plt.savefig("norm.pdf", format='pdf')