How do I draw a resized image in TensorFlow?

It seems like images in TensorFlow get transformed into a different kind of image coordinate system after any transformation (e.g. resize) is applied. Drawing the image with the code below produces a distorted result:
%matplotlib inline
import tensorflow as tf
from matplotlib import pyplot as plt
with tf.device("/cpu:0"):
    file_contents = tf.read_file('any_image.png')
    image = tf.image.decode_png(file_contents)
    image.set_shape([375, 1242, 3])
    image = tf.image.resize_images(image, 448, 448)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    image_val = sess.run([image])
    plt.figure(figsize=(16, 8))
    plt.imshow(image_val[0], interpolation='nearest')
    plt.show()
    plt.close()
If I remove the resize operation it draws the regular image. How do I get matplotlib to draw the resized image correctly, or tell TensorFlow to convert it back to a displayable RGB image?

It turns out there is no coordinate transformation at all; resizing merely converts the image from unsigned integer to float, which matplotlib misinterprets. Converting back to unsigned integer fixed the problem (numpy must be imported):
import numpy as np
plt.imshow(image_val[0].astype(np.uint8), interpolation='nearest')
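Alternatively, matplotlib can display float images directly as long as the values are scaled into [0, 1], so rescaling also works (a small sketch using the image_val fetched above):
# imshow treats float RGB arrays as values in [0, 1]
plt.imshow(image_val[0] / 255.0, interpolation='nearest')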

Related

Preprocessing layers with seed not producing the same data augmentation for images and masks

I'm trying to create a simple preprocessing augmentation layer, following this TensorFlow tutorial. I created this 'simple' example that shows the problem I'm having.
Even though I'm initializing the augmentation class with a seed, the operations applied to the images and the corresponding masks are not always equal.
What am I doing wrong?
Note: tf v2.10.0
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import skimage
import rasterio as rio
def normalize(array: np.ndarray):
    """Normalise the image to give a meaningful output."""
    array_min, array_max = array.min(), array.max()
    return (array - array_min) / (array_max - array_min)

# field
im = rio.open('penguins.tif')
fields = np.zeros((1, im.shape[0], im.shape[1], 3))
fields[0, :, :, 0] = normalize(im.read(1))
fields[0, :, :, 1] = normalize(im.read(2))
fields[0, :, :, 2] = normalize(im.read(3))

# mask is a simple contour
masks = skimage.color.rgb2gray(skimage.filters.sobel(fields[0]))
masks = np.expand_dims(masks, [0, 3])
In this case, the dataset is only one image, we can use this function to visualize the field and the mask.
def show(field: np.ndarray, mask: np.ndarray):
    """Show the field and corresponding mask."""
    fig = plt.figure(figsize=(8, 6))
    ax1 = fig.add_subplot(121)
    ax2 = fig.add_subplot(122)
    ax1.imshow(field[:, :, :3])
    ax2.imshow(mask, cmap='binary')
    plt.tight_layout()
    plt.show()
show(fields[0], masks[0])
Now I use the example from the tutorial, which randomly flips (horizontally) both the image and the mask.
class Augment(tf.keras.layers.Layer):
    def __init__(self, seed=42):
        super().__init__()
        # both use the same seed, so they'll make the same random changes
        self.augment_inputs = tf.keras.layers.RandomFlip(mode="horizontal", seed=seed)
        self.augment_labels = tf.keras.layers.RandomFlip(mode="horizontal", seed=seed)

    def call(self, inputs, labels):
        inputs = self.augment_inputs(inputs)
        labels = self.augment_labels(labels)
        return inputs, labels
Now if I run the following multiple times, I will eventually get opposite flip on the field and mask.
# Create a tf.data.Dataset
ds = tf.data.Dataset.from_tensor_slices((fields, masks))
ds2 = ds.map(Augment())
for f, m in ds2.take(1):
    show(f, m)
I would expect the image and its mask to be flipped the same way, since I set the seed in the Augment class as suggested in the TensorFlow tutorial.
Augmentation can instead be applied to the image and mask concatenated along the channel axis into a single array, after which the image and label are recovered by slicing, as shown below:
class Augment(tf.keras.layers.Layer):
    def __init__(self):
        super().__init__()
        # a single layer augments the concatenated tensor, so image and mask stay aligned
        self.augment_inputs = tf.keras.layers.RandomRotation(0.3)

    def call(self, inputs, labels):
        output = self.augment_inputs(tf.concat([inputs, labels], -1))
        inputs = output[:, :, 0:3]  # the three field channels
        labels = output[:, :, 3:]   # the single mask channel
        return inputs, labels
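An alternative not shown in the thread is to avoid the stateful layers entirely and use TensorFlow's stateless random image ops, which are deterministic for a given seed, so reusing one seed per element guarantees the field and mask get the identical flip (a sketch; tf.image.stateless_random_flip_left_right requires TF 2.4+):
def flip_pair(field, mask):
    # draw a fresh seed per element, then reuse it for both tensors
    seed = tf.random.uniform([2], maxval=2**30, dtype=tf.int32)
    field = tf.image.stateless_random_flip_left_right(field, seed)
    mask = tf.image.stateless_random_flip_left_right(mask, seed)
    return field, mask

ds2 = ds.map(flip_pair)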

Rotating image and its key points label in tensorflow2.0

I am trying to add rotation to my dataset of images whose labels are facial keypoints. tf.contrib has been removed in TensorFlow 2.0, and outside libraries like PIL don't work because I am using tf.data.Dataset.
The rotation angle needs to be random, and the same rotation must be applied to both an image and its keypoint labels. Is there a way to do this in TensorFlow 2.0?
Below is the function I used:
import numpy as np
import tensorflow as tf
from PIL import Image

def preprocess_data(image, angle):
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [input_size, input_size])
    image = tf.image.rgb_to_grayscale(image)
    # drop to PIL to do the rotation, then back to a tensor
    image = Image.fromarray(np.array(tf.squeeze(image)))
    rotated = Image.Image.rotate(image, angle)
    image = tf.convert_to_tensor(np.array(rotated))
    image = tf.expand_dims(image, -1)
    return image

def load_and_preprocess_data(path):
    image = tf.io.read_file(path)
    rotation = tf.random.uniform([1, 1], minval=-60, maxval=60, seed=0)
    return preprocess_data(image, rotation)
Here I used PIL, but it fails when I try to map load_and_preprocess_data over a tf.data.Dataset of image paths.
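For reference, a graph-compatible sketch of my own (not from this thread; it assumes the tensorflow_addons package is available): rotate the pixels with tfa.image.rotate and rotate the (x, y) keypoints about the image centre with the matching matrix:
import numpy as np
import tensorflow as tf
import tensorflow_addons as tfa

def rotate_with_keypoints(image, keypoints, angle_deg):
    # keypoints: float32 tensor of shape [num_points, 2] holding (x, y) pixel coordinates
    radians = angle_deg * np.pi / 180.0
    image = tfa.image.rotate(image, radians)
    h = tf.cast(tf.shape(image)[0], tf.float32)
    w = tf.cast(tf.shape(image)[1], tf.float32)
    center = tf.stack([(w - 1) / 2.0, (h - 1) / 2.0])
    # tfa.image.rotate spins the image counterclockwise about its centre,
    # so the keypoints are moved with the corresponding 2x2 rotation matrix
    cos_a, sin_a = tf.cos(radians), tf.sin(radians)
    rot = tf.stack([tf.stack([cos_a, -sin_a]), tf.stack([sin_a, cos_a])])
    return image, tf.matmul(keypoints - center, rot) + center
The random angle can be drawn with tf.random.uniform inside the mapped function, so the whole pipeline stays inside the graph.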

Transform 3D Tensor to 4D

I am using the VGG16 model, which expects a 4D tensor as input. When I call model.fit(xtrain, ytrain, ...), my xtrain is a list of 3D tensors [size, size, features], so in this case [224, 224, 3].
What I want is 4D tensors of shape [len(images), size, size, features].
How could I modify my code to get there?
I tried tf.expand_dims and tf.concat, but it didn't work.
# Transforming my image to a 3D Tensor
image = tf.io.read_file(image)
image = tf.image.decode_jpeg(image, channels=3)
image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
image = image / 255.0
Error msg after model.fit:
Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (224, 224, 3)
It looks like you are reading in only a single image and passing that. If that's the case, you can add a dimension of size 1 to the first axis of the image. There are lots of ways to do that.
Using reshape:
image = image.reshape(1, 224, 224, 3)
Using some fancy numpy slicing notation to add an axis (personal favorite):
image = image[None, ...]
Using numpy.expand_dims() as explained in Abhijit's answer.
I imagine you want to be reading in a bunch of images, though. Possibly an issue with your input process? Can you wrap your read in a loop and read multiple files? Something like:
images = []
for file in image_files:
    image = tf.io.read_file(file)
    # same decode/resize/normalize steps as in the question
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
    image = image / 255.0
    images.append(image)
images = np.asarray(images)
numpy.expand_dims(image, axis=0)
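If you keep the images as TensorFlow tensors instead, tf.stack builds the same 4D batch natively (a sketch reusing the images list from the loop above):
# stack N tensors of shape [224, 224, 3] into one [N, 224, 224, 3] tensor
images = tf.stack(images)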

How to multiply input images with mask in tensorflow?

I want to multiply every input image with a mask of the same size as the input image. How would I do that in tensorflow?
My image reading function looks like this so far:
img_contents = tf.read_file(input_queue[0])
label_contents = tf.read_file(input_queue[1])
img = tf.image.decode_png(img_contents, channels=3)
label = tf.image.decode_png(label_contents, channels=1)
# Now I want to do something like this?
mask = tf.constant(1.0, dtype=tf.float32, shape=img.shape)
img_masked = tf.multiply(img, mask)
Is that possible?
I'm not sure whether img is already a tensor object that I can use with that function. I'm new to TensorFlow...
Here is the code that works well for me. I'm using a Jupyter notebook to run it.
%matplotlib inline
import tensorflow as tf
from matplotlib.image import imread
import matplotlib.pyplot as plt
# Loading test image from the local filesystem
x = tf.Variable(imread("test_img.jpg"), dtype='float32')
x_mask = tf.Variable(imread("test_mask.jpg"), dtype='float32')
img_mult = tf.multiply(x, x_mask)
plt.imshow(imread("test_img.jpg"))
plt.show()
plt.imshow(imread("test_mask.jpg"))
plt.show()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
res = sess.run(img_mult)
plt.imshow(res)
plt.show()
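One caveat not covered in the answer: a JPEG mask decodes to values in 0-255, so multiplying directly scales the image rather than masking it. For a true on/off mask, normalize it first:
# scale the mask to [0, 1] so masked-out pixels go to zero instead of being amplified
img_mult = tf.multiply(x, x_mask / 255.0)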
Also, here is a good YouTube tutorial covering image manipulation with TF: https://www.youtube.com/watch?v=bvHgESVuS6Q&t=447s

TensorFlow MNIST example feeding own images

I am trying to learn TensorFlow, so I wanted to understand their example at a smaller scale. Suppose I have image1, image2, and image3, three 28x28 matrices that hold grayscale values (0..255). image1 is the training image, image2 is the validation image, and image3 is the test image. I was trying to understand how I can feed my own images into the MNIST example they have here.
I am particularly interested in replacing the following line with my own imageset:
X, Y, testX, testY = mnist.load_data(one_hot=True)
Your help is much appreciated.
Suppose your image is a numpy array, of shape [1, 28, 28, 1].
You can just feed this numpy array to the node X or testX. Even though X is not a placeholder, you can still provide its value to TensorFlow.
X_value = ... # numpy array
# ... same for Y_value, testX_value, testY_value
feed_dict = {X: X_value, Y: Y_value, testX: testX_value, testY: testY_value}
sess.run(train_op, feed_dict=feed_dict)
mnist.load_data(one_hot=True) is nothing but some preprocessing of the data. If you have some images in hand, you can just make them an ndarray and feed it into the graph. For example, if you have a node named images, you can feed the images using feed_dict = {images: some_image}.
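For instance, a minimal sketch of turning one of your own 28x28 matrices into such an ndarray (the grayscale values and the class index 7 are placeholders):
import numpy as np

image1 = np.zeros((28, 28), dtype=np.uint8)  # stand-in for your grayscale image
X_value = image1.reshape(1, 28, 28, 1).astype(np.float32) / 255.0  # shape [1, 28, 28, 1]
Y_value = np.zeros((1, 10), dtype=np.float32)
Y_value[0, 7] = 1.0  # one-hot label for class 7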