My picture becomes distorted after using tf.image.resize_images - tensorflow

My original picture (size: 128*128) looks like this:
After using this function:
image = tf.image.resize_images(original_image, (128, 128))
and finally plt.imshow() to show my hand picture, the result looks terrible.

The problem comes from tensorflow's resize_images function returning floats.
To properly resize and view the image you would need something like:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    image = tf.image.resize_images(original_image, (128, 128))
    # Cast image to np.uint8 so it can be properly viewed
    # eval() tensor to get numpy array.
    image = tf.cast(image, np.uint8).eval()
    plt.imshow(image)

The colours are inverted, i.e. each pixel's colour [r, g, b] is being displayed as [255 - r, 255 - g, 255 - b].
This could have something to do with the data type of the image you obtain after resizing. Try the following after resizing the image:
image = image.astype(np.uint8)

I will be referring to the tensorflow library as tf.
tf.image.resize resizes the images correctly, but the problem appears when we use plt.imshow on the result.
If plt.imshow sees float values, be it 0.5 or 221.3, it clips them into the range [0, 1]:
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
This was the problem in my case.
Original image pixel: [ 91 105 166]
After resizing: tf.Tensor([ 91.01 105.01 166.01], shape=(3,), dtype=float32)
You can see that the resizing is correct, but the clipping is what hurts the picture.
To use the function properly:
img_resize = tf.image.resize(random_img, [250, 250])
img_resize = tf.cast(img_resize, 'int64')
plt.imshow(img_resize)
This should take care of the issue.
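For reference, here is the same fix written in eager / TF2 style. This is only a sketch assuming original_image holds the 128x128 RGB array from the question; casting to tf.uint8 works just as well as 'int64':
import tensorflow as tf
import matplotlib.pyplot as plt

resized = tf.image.resize(original_image, (128, 128))  # float32 result
resized = tf.cast(resized, tf.uint8).numpy()           # back to displayable integers in [0, 255]
plt.imshow(resized)
plt.show()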

Related

Why does image_dataset_from_directory return a different array than loading images normally?

I noticed that the output from TensorFlow's image_dataset_from_directory is different from loading images directly (either with PIL, Keras' load_img, etc.). I set up an experiment: I have a single RGB image with dimensions 2400x1800x3, and tried comparing the resulting numpy arrays from the different methods:
import numpy as np
from PIL import Image
from tensorflow.keras.utils import image_dataset_from_directory, load_img, img_to_array

train_set = image_dataset_from_directory(
    '../data/',
    image_size=(2400, 1800),  # I'm using the original image size
    label_mode=None,
    batch_size=1
)

for batch in train_set:
    img_from_dataset = np.squeeze(batch.numpy())  # remove batch dimension

img_from_keras = img_to_array(load_img(img_path))
img_from_pil = img_to_array(Image.open(img_path))

print(np.all(img_from_dataset == img_from_keras))  # False
print(np.all(img_from_dataset == img_from_pil))    # False
print(np.all(img_from_keras == img_from_pil))      # True
So, even though all methods return the same shape numpy array, the values from image_dataset_from_directory are different. Why is this? And what can/should I do about it?
This is a particular problem during prediction time where I'm taking a single image (i.e. not using image_dataset_from_directory to load the image).
This is strange and I have not figured out exactly why, but if you print out pixel values from img_from_dataset, img_from_keras and img_from_pil, you find that the pixel values for img_from_dataset are sometimes lower by 1; it looks like some kind of rounding is going on. All three are supposed to return float32, so I can't see why they should be different. I also tried ImageDataGenerator().flow_from_directory and it matches the data for img_from_keras and img_from_pil. Note that image_dataset_from_directory returns a tf.data.Dataset object that yields float32 tensors of shape (batch_size, image_size[0], image_size[1], num_channels).
I used the following code to detect the pixel value difference, using a 224 x 224 x 3 image:
match = True
for i in range(224):
    for j in range(224):
        for k in range(3):
            if img_from_dataset[i, j, k] != img_from_keras[i, j, k]:
                match = False
                print(img_from_dataset[i, j, k], img_from_keras[i, j, k], i, j, k)
                break
        if match == False:
            break
    if match == False:
        break
print(match)
An example output of the code is
86.0 87.0 0 0 2
False
If you ever figure out why the difference, let me know. I expect one will have to go through the detailed code. I took a quick look: even though you specified the image size as being the same as the original image, image_dataset_from_directory still resizes the image using tf.image.resize with interpolation='bilinear'. Maybe load_img(img_path) and PIL's Image.open use a different interpolation (or none at all).
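If you want to test that hypothesis, a rough sketch (reusing img_path and img_from_dataset from the question) is to run the Keras-loaded array through tf.image.resize with bilinear interpolation yourself and see how close it gets to the dataset output:
import numpy as np
import tensorflow as tf
from tensorflow.keras.utils import load_img, img_to_array

img_from_keras = img_to_array(load_img(img_path))
img_bilinear = tf.image.resize(img_from_keras, (2400, 1800), method='bilinear').numpy()
print(np.abs(img_bilinear - img_from_dataset).max())    # difference after an explicit bilinear resize
print(np.abs(img_from_keras - img_from_dataset).max())  # difference against the raw load, for comparison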

How to remove black canvas from image in TensorFlow

I'm currently working with the tensorflow dataset 'tf_flowers', and noticed that a lot of images consist mostly of black canvas, like this:
(example images: flower1, flower2)
Is there any easy way to remove or filter it out? Preferably it should work on batches and compile into a graph with @tf.function, as I plan to use it also for bigger datasets with dataset.map(...).
The black pixels are just padding. This is a simple operation that allows you to have network inputs of the same size (i.e. you can have batches containing images of size 223x221 because smaller images are padded with black pixels).
An alternative to padding, which removes the need to add black pixels to the image, is to preprocess the images by:
removing padding via cropping operation
resizing the cropped images to the same size (e.g. 223x221)
You can do all of these operations in plain Python, thanks to TensorFlow's map function. First, define your Python function:
def py_preprocess_image(numpy_image):
    input_size = numpy_image.shape[:2]  # e.g. (223, 221)
    image_proc = crop_by_removing_padding(numpy_image)
    image_proc = resize(image_proc, new_size=input_size)
    return image_proc
Then, given your tensorflow dataset train_data, map the above python function on each input:
# train_data is your tensorflow dataset
train_data = train_data.map(
lambda x: tf.py_func(preprocess_image,
inp = [x], Tout=[tf.float32]),
num_parallel_calls=num_threads
)
Now, you only need to define crop_by_removing_padding and resize, which operate on ordinary numpy arrays and can thus be written in pure python code. For example:
import numpy as np
import cv2

def crop_by_removing_padding(img):
    # keep only the row/column extent of the non-black region
    # (the [:2] drops the channel index that argwhere adds for RGB images)
    xmax, ymax = np.max(np.argwhere(img), axis=0)[:2]
    img_crop = img[:xmax + 1, :ymax + 1]
    return img_crop

def resize(img, new_size):
    img_rs = cv2.resize(img, (new_size[1], new_size[0]), interpolation=cv2.INTER_CUBIC)
    return img_rs
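If you would rather avoid tf.py_func and keep everything inside the graph (so it composes with @tf.function and dataset.map), a possible pure-TensorFlow sketch, assuming the padding pixels are exactly zero, looks like this:
import tensorflow as tf

@tf.function
def crop_black_padding(image, out_size=(223, 221)):
    # image: (H, W, C) tensor where padded pixels are exactly zero
    mask = tf.reduce_any(image > 0, axis=-1)            # (H, W), True for non-black pixels
    rows = tf.where(tf.reduce_any(mask, axis=1))[:, 0]  # indices of rows with content
    cols = tf.where(tf.reduce_any(mask, axis=0))[:, 0]  # indices of columns with content
    cropped = image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1, :]
    return tf.image.resize(cropped, out_size)

# Example with a supervised dataset yielding (image, label) pairs:
# train_data = train_data.map(lambda img, lbl: (crop_black_padding(img), lbl),
#                             num_parallel_calls=tf.data.AUTOTUNE)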

After applying torchvision.transforms on mnsit dataset, how to view it using cv2_imshow?

I am trying to implement a simple GAN in Google Colaboratory. After using transforms to normalize the images, I want to view them at the output end, displaying the fake image generated by the generator and a real image from the dataset side by side once every batch iteration, like a video.
transform = transforms.Compose(
    [
        # Convert a PIL Image or numpy.ndarray to tensor. This transform does not support torchscript.
        # Converts a PIL Image or numpy.ndarray (H x W x C) in the range
        # [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
        transforms.ToTensor(),
        # Normalize a tensor image with mean and standard deviation.
        transforms.Normalize((0.5,), (0.5,))
    ])
dataset = datasets.MNIST(root="dataset/", transform=transform, download=True)
loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
After applying transforms on the dataset it is not in the range of [0,255] anymore. How do we denormalize it and use cv2_imshow to show that series of real and fake images frame by frame in the same place?
The above image shows the output I get; there are two problems here.
Problem 1: The image normalization rendered the image indistinguishable; it is just all black.
Problem 2: The images are not coming frame by frame in the same place like a video; instead, each one gets printed on a new line.
What approach do I take to solve these issues?
Problem 1
Assuming torch_image is a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]:
numpy_image = torch_image.permute(1, 2, 0).numpy() * 255
You can then display numpy_image with cv2.
Problem 2
If you want to refresh the printed images instead of printing new ones, you might try the solution provided here:
https://stackoverflow.com/a/52866695/12463260
I found that I didn't denormalize.
def denormalize(x):
    # Denormalizing: undo Normalize((0.5,), (0.5,)) and scale back to [0, 255]
    pixels = ((x * .5) + .5) * 255
    return pixels
The above function converts the tensor back to the range [0, 255].
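For problem 1, here is a hedged Colab sketch of how the pieces could fit together, assuming the loader and denormalize defined above (torchvision.utils.make_grid is only used here to tile the batch into one image):
from google.colab.patches import cv2_imshow
import torchvision

real, _ = next(iter(loader))              # one batch of MNIST images, values in [-1, 1]
grid = torchvision.utils.make_grid(real)  # (3, H, W) grid of the whole batch
img = denormalize(grid).permute(1, 2, 0).numpy().astype('uint8')  # H x W x C in [0, 255]
cv2_imshow(img)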
I didn't find any solution for problem 2 yet.

Convert an image format from 32FC1 to 16UC1

I need to encode an image in 16UC1 format, but I receive the error:
cv_bridge.core.CvBridgeError:encoding specified as 16UC1, but image has incompatible type 32FC1
I tried to use the skimage function img_as_uint, but since my image values are not between -1 and 1 it doesn't work. I also tried to "normalize" my values by dividing all of them by the value obtained from np.amax, but the skimage function then only returns a blank image.
Is there a way of achieving this conversion?
This is the original 32FC1 image
With numpy you should be able to:
import numpy as np
img = np.random.normal(0, 1, (300, 300, 3)).astype(np.float32) # simulated image
uimg = img.astype(np.uint16)
You probably will first want to do some kind of normalization if it isn't already in an unsigned range. Probably something like:
img_normalized = (img - img.min()) / (img.max() - img.min()) * 65535  # scale to the full uint16 range (256**2 would overflow at the maximum)
But your normalization strategy will depend on what you want to accomplish.
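Since the error in the question comes from cv_bridge, here is a hedged sketch of that side as well, assuming img is the 32FC1 array and a standard cv_bridge installation: normalize, cast to uint16, then encode as 16UC1.
import numpy as np
from cv_bridge import CvBridge

bridge = CvBridge()
img16 = ((img - img.min()) / (img.max() - img.min()) * 65535).astype(np.uint16)
msg = bridge.cv2_to_imgmsg(img16, encoding='16UC1')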
Thanks for sharing an image. I can visualize it as follows:
import numpy as np
import matplotlib.pyplot as plt
arr = np.load('32FC1_image.npz')
img = arr['arr_0']
# Squeeze out the extra dimensions that stop matplotlib from recognizing this as
# an image; those extra dimensions may also be causing your problems.
img = np.squeeze(img)
img_normalized = (img - img.min()) / (img.max() - img.min()) * 65535  # full uint16 range
img_normalized = img_normalized.astype(np.uint16)
plt.imshow(img_normalized)
Try using the normalized 16 bit image.

how to convert flattened array of RGB image(1-D) back to original image

I have a flattened 1D array of shape (1*3072) created from an RGB image of dimensions (32*32*3). I want to recover the original RGB image of dimensions (32*32*3) and plot it.
I have tried the solution suggested in how to convert a 1-dimensional image array to PIL image in Python
But it's not working for me, as it seems to be for a greyscale image:
from PIL import Image
from numpy import array
import matplotlib.pyplot as plt
img = Image.open("sampleImage.jpg")
arr = array(img)
arr = arr.flatten()
print(arr.shape)
#tried with 'L' & 'RGB' both
img2 = Image.fromarray(arr.reshape(200,300), 'RGB')
plt.imshow(img2, interpolation='nearest')
plt.show()
"Getting below error which expected because it is not able covert RGB"
ValueError: cannot reshape array of size 180000 into shape (200,300)
In order to interpret an array as an RGB image, it needs to have 3 channels. A channel is the 3rd dimension in the numpy array. So change your code to this:
img2 = Image.fromarray(arr.reshape(200,300,3), 'RGB')
I should mention that you talk about your flattened array being 1x3072, yet the example code seems to assume 200x300x3, which would be 1x180,000 when flattened. Which of these two is the truth, I couldn't tell you.
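For the 32x32x3 case in the question itself, a minimal sketch (assuming the 1x3072 array was flattened in H x W x C order; adjust the reshape if it was stored channel-first, e.g. CIFAR-style):
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

flat = np.random.randint(0, 256, 3072, dtype=np.uint8)  # stand-in for the real 1x3072 array
img = Image.fromarray(flat.reshape(32, 32, 3), 'RGB')
plt.imshow(img)
plt.show()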