PIL to numpy and PIL to tensor is different - numpy

I have an image, and the sum of the torch tensor differs from the sum of the numpy array. Why?
How can I get torch_img.sum() == numpy_img_float.sum()?
import numpy as np
from PIL import Image
from torchvision import transforms as T
# Read image with PIL
img = Image.open(img_path).resize((224,224))
torch_img = T.ToTensor()(img)
numpy_img = np.asarray(img)
numpy_img_float = np.asarray(img).astype(np.float32)
print(torch_img.sum(), numpy_img.sum(), numpy_img_float.sum())
->56914.496, 14513196, 14513196.0
Does anyone have any idea why?

Notice how torch_img is in the [0,1] range while numpy_img and numpy_img_float are both in the [0, 255] range. Looking at the documentation for torchvision.transforms.ToTensor, if the provided input is a PIL image, then the values will be mapped to [0, 1]. In contrast, numpy.array will have the values remain in the [0, 255] range.
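Apart from that factor of 255 (56914.496 × 255 ≈ 14513196.5, which matches the NumPy sums), the small remaining variations are caused by float32 rounding. To make the sums comparable, either multiply torch_img by 255 or divide numpy_img_float by 255.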
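The scaling described above can be reproduced with NumPy alone. This is only a sketch: the random array is a hypothetical stand-in for the resized image, and the division by 255 imitates what ToTensor does to the values of a uint8 PIL image (it does not reproduce the channel reordering).

```python
import numpy as np

# Hypothetical stand-in for a resized RGB image: uint8 values in [0, 255].
rng = np.random.default_rng(0)
numpy_img = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

# Imitate the value scaling applied to a uint8 image:
# convert to float32 and divide by 255, giving values in [0, 1].
scaled = numpy_img.astype(np.float32) / 255.0

# Accumulate in float64 so summation error does not hide the relationship.
sum_raw = numpy_img.sum(dtype=np.float64)
sum_scaled = scaled.sum(dtype=np.float64)

# The two sums differ by the factor 255, up to rounding.
print(sum_raw, sum_scaled * 255)
```

Multiplying the scaled sum by 255 recovers the raw sum up to rounding, which is the same relationship as between torch_img.sum() and numpy_img_float.sum() in the question.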

Related

Applying gaussian blur to images in a loop

I have a simple ndarray with shape as:
import matplotlib.pyplot as plt
%matplotlib inline
plt.imshow(trainImg[0]) #can display a sample image
print(trainImg.shape) # (4750, 128, 128, 3), the shape of the dataset
I intend to apply Gaussian blur to all the images. The for loop I went with:
trainImg_New = np.empty((4750, 128, 128,3))
for idx, img in enumerate(trainImg):
    trainImg_New[idx] = cv2.GaussianBlur(img, (5, 5), 0)
I tried to display a sample blurred image as:
plt.imshow(trainImg_New[0]) #view a sample blurred image
but I get an error:
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
It just displays a blank image.
TL;DR:
The error is most likely caused by trainImg_New having a float datatype while its values are larger than 1. So, as @Frightera mentioned, try using np.uint8 to convert the images' datatype.
I tested the snippets as below:
import numpy as np
import matplotlib.pyplot as plt
import cv2
trainImg_New = np.random.rand(4750, 128, 128,3) # all value is in range [0, 1]
save = np.empty((4750, 128, 128,3))
for idx, img in enumerate(trainImg_New):
    save[idx] = cv2.GaussianBlur(img, (5, 5), 0)
plt.imshow(np.float32(save[0]+255)) # Reported error as question
plt.imshow(np.float32(save[0]+10)) # Reported error as question
plt.imshow(np.uint8(save[0]+10)) # Good to go
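First of all, cv2.GaussianBlur does not change the range of the array's values, and the values of the original image arrays are legitimate. So I believe the only reason is that the datatype of trainImg_New[0] does not match its range.
The snippets above show that the datatype of the array determines which value range imshow accepts: [0, 1] for floats and [0, 255] for integers. In the question, np.empty creates a float64 array by default, so if trainImg holds values in [0, 255], the blurred values are stored as floats well above 1.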
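The dtype mismatch can be seen without OpenCV or matplotlib. This is a sketch under the assumption that trainImg is a uint8 array; train_img below is a small hypothetical stand-in for it, and plain assignment stands in for storing the blurred result.

```python
import numpy as np

# Hypothetical stand-in for trainImg: a small batch of uint8 RGB images.
rng = np.random.default_rng(0)
train_img = rng.integers(0, 256, size=(4, 128, 128, 3), dtype=np.uint8)

# np.empty defaults to float64, so this buffer silently stores floats.
as_float = np.empty(train_img.shape)
as_float[:] = train_img
print(as_float.dtype, as_float.max())  # float64, with values well above 1

# Option 1: allocate the output buffer with the same dtype as the source.
same_dtype = np.empty_like(train_img)
same_dtype[:] = train_img

# Option 2: keep floats, but rescale them into the [0, 1] range.
unit_range = as_float / 255.0
```

Either option gives imshow data in a range that matches its dtype, so nothing needs to be clipped.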
I suggest you use tfa.image.gaussian_filter2d from the tensorflow_addons package. I think you'll be able to pass all your images at once.
import tensorflow as tf
from skimage import data
import tensorflow_addons as tfa
import matplotlib.pyplot as plt
image = data.astronaut()
plt.imshow(image)
plt.show()
blurred = tfa.image.gaussian_filter2d(image,
                                      filter_shape=(25, 25),
                                      sigma=3.)
plt.imshow(blurred)
plt.show()

Cutting and resizing a numpy array to a new shape based on ROI

I have a numpy array and I need to cut a partition of it based on an ROI like (x1,y1)(x2,y2). The background color of the numpy array is zero.
I need to crop that part from the first numpy array and then resize the cropped array to (640,480) pixel.
I am new to numpy and I don't have any clue how to do this.
#numpy1: the first numpy array
roi=[(1,2),(3,4)]
It sounds like you want to do some image processing, so I suggest you have a look at the OpenCV library. In its Python implementation, images are basically NumPy arrays, so cropping and resizing become quite easy:
import cv2
import numpy as np
# OpenCV images are NumPy arrays
img = cv2.imread('path/to/your/image.png') # Just use your NumPy array
# instead of loading some image
# Set up ROI [(x1, y1), (x2, y2)]
roi = [(40, 40), (120, 150)]
# ROI cutout of image
cutout = img[roi[0][1]:roi[1][1], roi[0][0]:roi[1][0], :]
# Generate new image from cutout with desired size
new_img = cv2.resize(cutout, (640, 480))
# Just some output for visualization
img = cv2.rectangle(img, roi[0], roi[1], (0, 255, 0), 2)
cv2.imshow('Original image with marked ROI', img)
cv2.imshow('Resized cutout of image', new_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.8.5
NumPy: 1.19.1
OpenCV: 4.4.0
----------------------------------------
You can crop an array like
array = array[start_x:stop_x, start_y:stop_y]
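or in your case, where roi holds two corners [(x1, y1), (x2, y2)] and the first axis is x (for image arrays, where the first axis is the row index y, swap the two slices),
array = array[roi[0][0]:roi[1][0], roi[0][1]:roi[1][1]]
or, if you store the ROI as one (start, stop) pair per axis instead, one of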
array = array[slice(*roi[0]), slice(*roi[1])]
array = array[tuple(slice(*r) for r in roi)]
depending on the amount of abstraction and over-engineering that you need.
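A small check of the axis order, assuming the array follows the usual image layout (rows are y, columns are x) and the ROI is stored as two corners as in the question. The array and ROI values here are hypothetical.

```python
import numpy as np

# Hypothetical 2-D "image": 200 rows (y) by 300 columns (x).
array = np.arange(200 * 300).reshape(200, 300)

# ROI given as two corners [(x1, y1), (x2, y2)], as in the question.
roi = [(40, 50), (120, 150)]
(x1, y1), (x2, y2) = roi

# Rows are indexed by y and columns by x, so y comes first in the slice.
crop = array[y1:y2, x1:x2]
print(crop.shape)  # (100, 80): y2 - y1 rows, x2 - x1 columns
```

Unpacking the corners into named variables first makes it harder to mix up the two axes than indexing into roi directly.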
I recommend using slicing and skimage. skimage.transform.resize is what you need.
import matplotlib.pyplot as plt
from skimage import data
from skimage.transform import resize
image = data.camera()
crop = image[10:100, 10:100]
crop = resize(crop, (640, 480))
plt.imshow(crop)
For more about slicing, please see here.
For details on skimage, see here.

Wrote a code in python to edit an image's background and the output i am getting is totally off

I edited it to view the foreground image on a white background but now, none of the images are visible.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('91_photo.jpg')
mask = np.zeros(img.shape[:2],np.uint8)
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)
rect = (10,10,360,480)
cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)
mask2 = np.where((mask==2)|(mask==0),0,255).astype('uint8')
img = img*mask2[:,:,np.newaxis]
plt.imshow(img),plt.colorbar(),plt.show()
Expecting the result to be a visible image on a white background
This is what i'm getting
There are a number of small issues with your code that are adding up to that weird result.
OpenCV uses BGR ordering of the channels of an image, whereas matplotlib uses RGB. That means if you read an image with OpenCV but want to display it with matplotlib, you need to convert the image from BGR to RGB before displaying (that's the reason the colors are weird). Also, not that important, but color images are not displayed with a colormap, so showing the colorbar does not do anything for you.
In numpy, it's best to keep masks boolean whenever you can, because you can use them to index your arrays. Your current code converts a boolean mask to a uint8 image with 0 and 255 values and then multiplies that with your image. That means your image will be set to zero wherever the mask is zero, and wherever the mask is 255 the uint8 products overflow and wrap around. Instead, keep the mask boolean and use it to index your array. That way anywhere the mask is True you can just set the value in your image to something specific (like 255 for white).
This should fix you up:
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('91_photo.jpg')
mask = np.zeros(img.shape[:2], np.uint8)
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)
rect = (10, 10, 360, 480)
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)
mask2 = (mask==2) | (mask==0)
img[mask2] = 255
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()
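The overflow and the boolean indexing described above can be checked on a tiny array. This is a sketch with hypothetical pixel values; a single-channel 2x2 array stands in for the image.

```python
import numpy as np

# Hypothetical 2x2 single-channel uint8 "image" and a 0/255 mask.
img = np.array([[200, 100], [50, 3]], dtype=np.uint8)
mask255 = np.array([[255, 0], [255, 0]], dtype=np.uint8)

# uint8 * uint8 stays uint8, so the products wrap around modulo 256.
wrapped = img * mask255
print(wrapped)  # kept pixels no longer hold their original values

# Boolean indexing instead: paint the background white, keep the rest as is.
background = mask255 == 0
result = img.copy()
result[background] = 255
print(result)
```

With the multiplication, 200 becomes 56 and 50 becomes 206; with the boolean mask, the kept pixels are untouched and only the background changes.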

Using perceptually uniform colormaps in Mayavi volumetric visualization

AFAIK Mayavi does not come with any perceptually uniform colormaps. I tried naively to just pass it one of Matplotlib's colormaps but it failed:
from mayavi import mlab
import multiprocessing
import matplotlib.pyplot as plt
plasma = plt.get_cmap('plasma')
...
mlab.pipeline.volume(..., colormap=plasma)
TraitError: Cannot set the undefined 'colormap' attribute of a 'VolumeFactory' object.
Edit: I found a guide to convert Matplotlib colormaps to Mayavi colormaps. Unfortunately, it doesn't work here, since I am trying to render a volume with a perceptually uniform colormap.
from matplotlib.cm import get_cmap
import numpy as np
from mayavi import mlab
values = np.linspace(0., 1., 256)
lut_dict = {}
lut_dict['plasma'] = get_cmap('plasma')(values.copy())
x, y, z = np.ogrid[-10:10:20j, -10:10:20j, -10:10:20j]
s = np.sin(x*y*z)/(x*y*z)
mlab.pipeline.volume(mlab.pipeline.scalar_field(s), vmin=0, vmax=0.8, colormap=lut_dict['plasma']) # still getting the same error
mlab.axes()
mlab.show()
...
Instead of setting it as the colormap argument, if you set it as the ColorTransferFunction of the volume, it works as expected.
import numpy as np
from mayavi import mlab
from tvtk.util import ctf
from matplotlib.pyplot import cm
values = np.linspace(0., 1., 256)
x, y, z = np.ogrid[-10:10:20j, -10:10:20j, -10:10:20j]
s = np.sin(x*y*z)/(x*y*z)
volume = mlab.pipeline.volume(mlab.pipeline.scalar_field(s), vmin=0, vmax=0.8)
# save the existing colormap
c = ctf.save_ctfs(volume._volume_property)
# change it with the colors of the new colormap
# in this case 'plasma'
c['rgb']=cm.get_cmap('plasma')(values.copy())
# load the color transfer function to the volume
ctf.load_ctfs(c, volume._volume_property)
# signal for update
volume.update_ctf = True
mlab.show()
While the previous answer by like444 helped me partially with a similar problem, it leads to incorrect translation between colormaps. This is because the format in which matplotlib and tvtk store color information is slightly different: Matplotlib uses RGBA, while ColorTransferFunction uses VRGB, where V is the value in the shown data that this part of the colormap is assigned to. So by doing a 1-to-1 copy, green becomes red, blue becomes green and alpha becomes blue. The following code snippet fixes that:
def cmap_to_ctf(cmap_name):
    values = list(np.linspace(0, 1, 256))
    cmap = cm.get_cmap(cmap_name)(values)
    transfer_function = ctf.ColorTransferFunction()
    for i, v in enumerate(values):
        transfer_function.add_rgb_point(v, cmap[i, 0], cmap[i, 1], cmap[i, 2])
    return transfer_function
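The column mismatch described above can be illustrated with plain NumPy. This is only a sketch: rgba is a hypothetical table standing in for the output of a Matplotlib colormap, and vrgb shows the value-first row layout the answer describes, not an actual tvtk object.

```python
import numpy as np

# Hypothetical RGBA table standing in for a Matplotlib colormap's output:
# one row per sample, columns are R, G, B, A.
values = np.linspace(0.0, 1.0, 5)
rgba = np.column_stack([values, values ** 2, 1.0 - values, np.ones_like(values)])

# Reorder into V, R, G, B rows: the data value first, then the colour,
# with the alpha column dropped.
vrgb = np.column_stack([values, rgba[:, :3]])
print(vrgb.shape)  # (5, 4)
```

Both tables have four columns, which is why a direct copy runs without an error even though every column ends up with the wrong meaning.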

Why doesn't the shape of my numpy array change?

I have made a numpy array out of data from an image. I want to convert the numpy array into a one-dimensional one.
import numpy as np
import matplotlib.image as img
if __name__ == '__main__':
    my_image = img.imread("zebra.jpg")[:,:,0]
    height, width = my_image.shape  # shape is (rows, columns)
    my_image = np.array(my_image)
    img_buffer = my_image.copy()
    img_buffer = img_buffer.reshape(width * height)
    print(img_buffer.shape)
The 128x128 image is here.
However, this program prints out (128, 128). I want img_buffer to be a one-dimensional array though. How do I reshape this array? Why won't numpy actually reshape the array into a one-dimensional array?
.reshape returns a new array, rather than reshaping in place.
By the way, you appear to be trying to get a bytestring of the image, so you probably want my_image.tobytes() instead (tostring() is a deprecated name for the same method).
reshape doesn't work in place. Your code isn't working because you aren't assigning the value returned by reshape back to img_buffer.
If you want to flatten the array to one dimension, ravel or flatten might be easier options.
>>> img_buffer = img_buffer.ravel()
>>> img_buffer.shape
(16384,)
Otherwise, you'd want to do:
>>> img_buffer = img_buffer.reshape(np.prod(img_buffer.shape))
>>> img_buffer.shape
(16384,)
Or, more succinctly:
>>> img_buffer = img_buffer.reshape(-1)
>>> img_buffer.shape
(16384,)
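The difference between calling reshape and keeping its result can be checked directly. This is a sketch in which a zero-filled array stands in for the 128x128 image.

```python
import numpy as np

# Hypothetical stand-in for the 128x128 single-channel image.
my_image = np.zeros((128, 128), dtype=np.uint8)

my_image.reshape(-1)          # result discarded: my_image is unchanged
print(my_image.shape)         # (128, 128)

flat = my_image.reshape(-1)   # keep the returned array
print(flat.shape)             # (16384,)

# ravel returns a view when it can; flatten always returns a copy.
shares_ravel = np.shares_memory(my_image, my_image.ravel())
shares_flatten = np.shares_memory(my_image, my_image.flatten())
print(shares_ravel, shares_flatten)  # True False
```

If the flattened array will be modified independently of the original, flatten (or an explicit copy) is the safer choice.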