Rescale image in TensorFlow to fall between [0,1]

I am fairly new to tensorflow and I have a tflite model which needs inference on a single image (ie no datasets). The docs say the input should be 224,224,3 and scaled to [0,1] (https://www.tensorflow.org/lite/tutorials/model_maker_image_classification#advanced_usage), but I am having trouble doing this rescaling to [0,1].
Currently I have something like so:
img = tf.io.read_file(image_path)
img = tf.io.decode_image(img, channels=3)
img = tf.image.convert_image_dtype(img, tf.uint8)
print('min max img value',tf.reduce_min(img),tf.reduce_max(img))
The min and max are 0 and 255 respectively. I would like to scale this to [0,1].
I am on tf 2.5 and I do not see a built-in method to do this.
I tried doing this:
img = tf.io.read_file(image_path)
img = tf.io.decode_image(img, channels=3)
scale=1./255
img=img*scale
img = tf.image.convert_image_dtype(img, tf.uint8)
print('min max img value',tf.reduce_min(img),tf.reduce_max(img))
and I get thrown:
TypeError: Cannot convert 0.00392156862745098 to EagerTensor of dtype uint8
I think there is some casting error :(

tf.io.decode_image returns a tf.uint8 tensor by default, and a uint8 tensor cannot be multiplied by a Python float. To avoid the
TypeError: Cannot convert 0.00392156862745098 to EagerTensor of dtype uint8
error, we have to cast img from tf.uint8 to tf.float32 before scaling, like
img = tf.cast(img, dtype=tf.float32) / tf.constant(255, dtype=tf.float32)
print('min max img value', tf.reduce_min(img), tf.reduce_max(img))
Converting an image tensor that is already tf.float32 and normalized to [0, 1] back to tf.uint8 is probably not a good idea.
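As a minimal sketch of the cast-then-scale idea, the snippet below uses a tiny hand-made uint8 tensor as a stand-in for a decoded image (an assumption, so that no image file is needed). It compares a manual cast and divide with tf.image.convert_image_dtype, which rescales integer images to [0, 1] when the target dtype is a float type.

```python
import tensorflow as tf

# Stand-in for a decoded image: a tiny uint8 tensor covering the full value range.
img = tf.constant([[[0, 128, 255]]], dtype=tf.uint8)

# Option 1: cast to float32 first, then divide by the uint8 maximum.
manual = tf.cast(img, tf.float32) / 255.0

# Option 2: let TensorFlow cast and rescale in one step.
scaled = tf.image.convert_image_dtype(img, tf.float32)

print('manual min/max:', float(tf.reduce_min(manual)), float(tf.reduce_max(manual)))
print('scaled min/max:', float(tf.reduce_min(scaled)), float(tf.reduce_max(scaled)))
```

Both options should give the same values up to floating-point rounding. Calling convert_image_dtype with tf.uint8 on a tensor that is already uint8, as in the question, leaves the values unchanged.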

Related

image preprocess function for image_dataset_from_directory

In the ImageDataGenerator, I've used the following function to preprocess images, through the keyword of 'preprocessing' in .flow_from_dataframe().
However, I am now trying to use the image_dataset_from_directory, which does not work with the preprocess function, as it does not allow embedding this function.
I've tried to apply the preprocess_image() function after the dataset is generated by image_dataset_from_directory, through .map() function, but it does not work either.
Please could anyone advise?
Many thanks,
Tony
train_Gen = dataGen.flow_from_dataframe(
    df,
    x_col='id_code',
    y_col='diagnosis',
    directory=os.path.join(data_dir, 'train_images'),
    batch_size=BATCH_SIZE,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    subset='training',
    seed=123,
    class_mode='categorical',
    preprocessing=preprocess_image,
)
def crop_image_from_gray(img, tol=7):
    """
    Applies masks to the original image and
    returns a preprocessed image with
    3 channels
    :param img: A NumPy array that will be cropped
    :param tol: The tolerance used for masking
    :return: A NumPy array containing the cropped image
    """
    # If for some reason we only have two dimensions (a grayscale image)
    if img.ndim == 2:
        mask = img > tol
        return img[np.ix_(mask.any(1), mask.any(0))]
    # If we have a normal RGB image
    elif img.ndim == 3:
        gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
        mask = gray_img > tol
        check_shape = img[:, :, 0][np.ix_(mask.any(1), mask.any(0))].shape[0]
        if check_shape == 0:  # image is too dark, so we would crop out everything
            return img  # return original image
        else:
            img1 = img[:, :, 0][np.ix_(mask.any(1), mask.any(0))]
            img2 = img[:, :, 1][np.ix_(mask.any(1), mask.any(0))]
            img3 = img[:, :, 2][np.ix_(mask.any(1), mask.any(0))]
            img = np.stack([img1, img2, img3], axis=-1)
        return img
def preprocess_image(image, sigmaX=10):
    """
    The whole preprocessing pipeline:
    1. Convert the image from BGR to RGB
    2. Apply masks
    3. Resize image to desired size
    4. Blend the image with a Gaussian-blurred copy of itself
    :param image: A NumPy array that will be cropped
    :param sigmaX: Sigma value passed to GaussianBlur
    :return: A NumPy array containing the preprocessed image
    """
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = crop_image_from_gray(image)
    image = cv2.resize(image, (IMG_WIDTH, IMG_HEIGHT))
    image = cv2.addWeighted(image, 4, cv2.GaussianBlur(image, (0, 0), sigmaX), -4, 128)
    return image
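One possible direction, sketched here under assumptions rather than taken from the original thread: NumPy/OpenCV code cannot run directly on the symbolic tensors that Dataset.map passes in, but it can be wrapped with tf.numpy_function. In the sketch below, np_preprocess is a hypothetical stand-in for preprocess_image, and a small in-memory dataset replaces image_dataset_from_directory.

```python
import numpy as np
import tensorflow as tf

def np_preprocess(image):
    # Hypothetical stand-in for a NumPy/OpenCV function such as preprocess_image.
    return image.astype(np.float32) / 255.0

def tf_preprocess(image, label):
    # Run the NumPy function on concrete arrays inside the tf.data pipeline.
    out = tf.numpy_function(np_preprocess, [image], tf.float32)
    # tf.numpy_function drops static shape information, so restore it.
    out.set_shape(image.shape)
    return out, label

# Synthetic per-image dataset (assumption: 4 RGB images of size 8x8).
images = np.random.randint(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)
labels = np.arange(4)
ds = tf.data.Dataset.from_tensor_slices((images, labels))
ds = ds.map(tf_preprocess).batch(2)

batch_images, batch_labels = next(iter(ds))
print(batch_images.shape, batch_images.dtype)
```

Since image_dataset_from_directory yields batches, a per-image function like this would probably need the dataset to be unbatched first (for example with .unbatch()) and batched again after the map.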

Feeding tf.data Dataset with multidimensional output to Keras model

I want to feed a tf.data Dataset to a Keras model, but I get the following error:
AttributeError: 'DatasetV1Adapter' object has no attribute 'ndim'
This dataset will be used to solve a segmentation problem, so both input and output will be images (3D tensors)
The dataset is created with this code:
dataset = tf.data.Dataset.list_files(TRAIN_PATH + "*.png", shuffle=False)
def process_path(file_path):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_png(img, channels=3)
    train_image_path = tf.strings.regex_replace(file_path, "image", "mask")
    mask = tf.io.read_file(train_image_path)
    mask = tf.image.decode_png(mask, channels=1)
    mask = tf.squeeze(mask)
    mask = tf.one_hot(tf.cast(mask, tf.int32), Num_Classes, axis=-1)
    return img, mask
dataset = dataset.map(process_path)
dataset = dataset.batch(32, drop_remainder=True)
Taking an item from the dataset shows that I get a tuple containing an input tensor and an output tensor, whose dimensions are correct:
Input: (batch-size, image height, image width, 3 channels)
Output: (batch-size, image height, image width, 4 channels)
When fitting the model I get an error:
model.fit(dataset, epochs = 50)
I solved the problem by moving to Keras 2.4.3 and TensorFlow 2.2.
Everything was right, but apparently the previous release of Keras did not handle tf.data datasets correctly.
Here's a tutorial I've found very useful on this.

OpenCV - convert uint8 image to float32 normalized image

I'm trying to rewrite parts of some Keras DarkNet code to make it run faster.
Here is the code I'm trying to optimize:
model_image_size = (416, 416)
import cv2
from PIL import Image
frame = cv2.imread("test.png", cv2.IMREAD_COLOR)
im = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
im = Image.fromarray(im).crop((1625, 785, 1920, 1080)) # crop ROI
resized_image = im.resize(tuple(reversed(model_image_size)), Image.BICUBIC)
image_data = np.array(resized_image, dtype='float32')
image_data /= 255.
image_data = np.expand_dims(image_data, 0) # Add batch dimension.
return image_data
This is my attempt to achieve the same output without the intermediate PIL conversion, to reduce time:
model_image_size = (416, 416)
import cv2
frame = cv2.imread("test.png", cv2.IMREAD_COLOR)
frame = frame[785:1080,1625:1920] # crop ROI
im = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
resized_image = cv2.resize(im, model_image_size, interpolation = cv2.INTER_CUBIC)
resized_image /= 255.
image_data = np.expand_dims(resized_image, 0) # Add batch dimension.
return image_data
However, upon running the code, it will return:
resized_image /= 255.
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'B') according to the casting rule ''same_kind''
It seems like I need to change the uint8 type to float32 before normalizing but I'm not sure how to achieve it with OpenCV.
You can use resized_image.astype(np.float32) to convert the resized_image data from uint8 to float32 and then proceed with normalizing and the remaining steps:
frame = cv2.imread("yourfile.png")
frame = frame[200:500,400:1000] # crop ROI
im = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
model_image_size = (416, 416)
resized_image = cv2.resize(im, model_image_size, interpolation = cv2.INTER_CUBIC)
resized_image = resized_image.astype(np.float32)
resized_image /= 255.
image_data = np.expand_dims(resized_image, 0) # Add batch dimension.
Your issue is that you are dividing and assigning to the same variable with /=. NumPy performs that operation in place, so the result has to fit the array's existing dtype (uint8), but dividing by a floating-point number produces floating-point values.
To solve this issue you can do:
resized_image = resized_image / 255.
and it should work. Note, however, that this converts the array to dtype=float64. To convert it to float32, assign the result of either
resized_image = resized_image.astype(np.float32)
or
resized_image = np.float32(resized_image)
The np should come from:
import numpy as np
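The dtype behaviour described above can be checked with a small sketch. It uses a tiny hand-made uint8 array as a stand-in for the resized image (an assumption, so that no image file or OpenCV call is needed).

```python
import numpy as np

# Stand-in for the uint8 image returned by cv2.resize.
resized_image = np.array([[0, 128, 255]], dtype=np.uint8)

# In-place division cannot store float results in a uint8 array.
try:
    resized_image /= 255.0
    in_place_failed = False
except TypeError:
    in_place_failed = True

# Out-of-place division works, but promotes to float64.
as_float64 = resized_image / 255.0

# Casting first keeps the result in float32.
as_float32 = resized_image.astype(np.float32) / 255.0

print(in_place_failed, as_float64.dtype, as_float32.dtype)
```

Casting to float32 before dividing avoids the intermediate float64 array, which matters mostly for memory use on large frames.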

How to apply tf.map_fn on a sequence feature? Getting an error: TensorArray dtype is string but Op is trying to write dtype uint8

I am writing a sequence to sequence model that maps video to text. I have the frames of the video encoded as JPEG strings in a sequence feature of the SequenceExample proto. When building my input pipeline, I am doing the following to get an array of decoded jpegs:
encoded_video, caption = parse_sequence_example(
serialized_sequence_example,
video_feature="video/frames",
caption_feature="video/caption_ids")
decoded_video = tf.map_fn(lambda x: tf.image.decode_jpeg(x, channels=3), encoded_video)
However, I am getting the following error:
InvalidArgumentError (see above for traceback): TensorArray dtype is string but Op is trying to write dtype uint8.
My goal is to apply image = tf.image.convert_image_dtype(image, dtype=tf.float32) after decoding it to get the pixel values of uint8 between [0,255] to float between [0,1].
I tried the following:
decoded_video = tf.map_fn(lambda x: tf.image.decode_jpeg(x, channels=3), encoded_video, dtype=tf.uint8)
converted_video = tf.map_fn(lambda x: tf.image.convert_image_dtype(x, dtype=tf.float32), decoded_video)
However, I still get the same error. Does anybody have any idea what might be going wrong? Thanks in advance.
Never mind. I just had to explicitly add a dtype of tf.float32 in the following line:
converted_video = tf.map_fn(lambda x: tf.image.convert_image_dtype(x, dtype=tf.float32), decoded_video, dtype=tf.float32)
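The underlying rule is that tf.map_fn assumes the output dtype matches the input dtype unless told otherwise. Below is a minimal sketch in TensorFlow 2 style, where the dtype argument has been superseded by fn_output_signature. The random frames and their size are assumptions used only to produce some JPEG strings to decode.

```python
import tensorflow as tf

# Build a few small synthetic frames and encode them as JPEG strings.
frames = tf.cast(
    tf.random.uniform([3, 8, 8, 3], maxval=256, dtype=tf.int32), tf.uint8)
encoded_video = tf.stack([tf.io.encode_jpeg(frame) for frame in frames])

# Input dtype is string, output dtype is uint8, so it must be declared.
decoded_video = tf.map_fn(
    lambda x: tf.io.decode_jpeg(x, channels=3),
    encoded_video,
    fn_output_signature=tf.uint8)

# convert_image_dtype accepts a batch, so no second map_fn is needed here.
converted_video = tf.image.convert_image_dtype(decoded_video, tf.float32)

print(decoded_video.dtype, converted_video.dtype, converted_video.shape)
```

This stacking only works because all frames share the same height and width.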

List of Tensors when single Tensor expected

I use concat to build the input tensor for a CNN, but I get the error: List of Tensors when single Tensor expected
image_raw = img.tobytes()
image = tf.decode_raw(image_raw, tf.uint8)
image = tf.reshape(image, [1, image_height, image_width, 3])
image_val = image
for i in range(batch_size-1):
image_val = tf.concat(0,[image_val,image])
return image_val
I have searched for answers to this question and added
image_val = tf.stack([image_val],0) before the return, but I still get the same error. Why?
Build environment:
TensorFlow version 0.12
python 3.5
The error List of Tensors when single Tensor expected comes from the fact that you wrote tf.concat(0,[image_val,image]) instead of tf.concat([image_val,image],0).
Also check the types of image_height and image_width, because it is sometimes necessary to cast them to an integer dtype, e.g. tf.cast(image_height, tf.int32).
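A minimal sketch of the corrected argument order, assuming a TensorFlow version in which tf.concat takes the list of tensors first and the axis second (the signature since TensorFlow 1.0). The image shape and batch size below are placeholders, and tf.tile is shown only as an alternative way to repeat one image along the batch axis.

```python
import tensorflow as tf

batch_size = 4
# Placeholder for a single decoded image with a leading batch axis.
image = tf.zeros([1, 2, 2, 3], dtype=tf.uint8)

# Values first, axis second.
image_val = image
for _ in range(batch_size - 1):
    image_val = tf.concat([image_val, image], axis=0)

# Alternative: repeat the image along the batch axis in one call.
tiled = tf.tile(image, [batch_size, 1, 1, 1])

print(image_val.shape, tiled.shape)
```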