Augmentation layers only for specific classes - tensorflow

For a classification task, I would like to apply augmentation layers only to specific classes.
The rationale: with an unbalanced dataset, I want to improve model performance on the classes that have few images.
https://www.tensorflow.org/api_docs/python/tf/keras/layers/RandomFlip

One solution is to build the pipeline with tf.data.Dataset and check the label inside a .map() function, so that augmentation is applied only to samples with that specific label.
def process_function(filepath):
    image = tf.io.read_file(filename=filepath)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.cast(image, tf.float32) / 255.0
    # tf.image.resize expects the size as [height, width]
    image = tf.image.resize(image, [IMAGE_HEIGHT, IMAGE_WIDTH])
    # Presume the path is of the form "root_folder\class_name\file_name",
    # so the class name is the second-to-last component
    label = tf.strings.split(filepath, '\\')[-2]
    return image, label
def augment_function(image, label):
    # class_underrepresented holds the label of the minority class
    if tf.math.equal(class_underrepresented, label):
        image = tf.image.random_flip_left_right(image)
    return image, label
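The label check can also be written with an explicit tf.cond, which makes both branches visible. The following is a minimal sketch, assuming integer labels; MINORITY_CLASS and the synthetic in-memory images are hypothetical stand-ins, not part of the original answer.

```python
import tensorflow as tf

MINORITY_CLASS = 1  # hypothetical integer id of the under-represented class


def augment_minority(image, label):
    """Randomly flip the image only when its label is the minority class."""
    image = tf.cond(
        tf.equal(label, MINORITY_CLASS),
        lambda: tf.image.random_flip_left_right(image),  # augmented branch
        lambda: image,                                   # untouched branch
    )
    return image, label


# Synthetic stand-in data: four 2x3 RGB images with labels 0, 1, 0, 1.
images = tf.reshape(tf.range(4 * 2 * 3 * 3, dtype=tf.float32), [4, 2, 3, 3])
labels = tf.constant([0, 1, 0, 1])

dataset = tf.data.Dataset.from_tensor_slices((images, labels)).map(augment_minority)
augmented = [(img.numpy(), int(lbl)) for img, lbl in dataset]
```

Images of the other classes pass through unchanged, so only the minority class sees extra variation. Flipping alone does not change the class counts; oversampling the minority class before the map would be a separate step.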

Related

image preprocess function for image_dataset_from_directory

With ImageDataGenerator, I used the following function to preprocess images, passing it through the 'preprocessing' keyword in .flow_from_dataframe().
However, I am now trying to use image_dataset_from_directory, which does not accept a preprocessing function as an argument.
I've tried applying preprocess_image() with .map() after the dataset is created by image_dataset_from_directory, but that does not work either.
Could anyone please advise?
Many thanks,
Tony
train_Gen = dataGen.flow_from_dataframe(
    df,
    x_col='id_code',
    y_col='diagnosis',
    directory=os.path.join(data_dir, 'train_images'),
    batch_size=BATCH_SIZE,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    subset='training',
    seed=123,
    class_mode='categorical',
    # Keyword as posted; Keras documents this hook as the
    # preprocessing_function argument of the ImageDataGenerator constructor
    preprocessing=preprocess_image,
)
def crop_image_from_gray(img, tol=7):
    """
    Applies masks to the original image and
    returns a preprocessed image with
    3 channels
    :param img: A NumPy array that will be cropped
    :param tol: The tolerance used for masking
    :return: A NumPy array containing the cropped image
    """
    # If for some reason we only have two dimensions (a grayscale image)
    if img.ndim == 2:
        mask = img > tol
        return img[np.ix_(mask.any(1), mask.any(0))]
    # If we have a normal RGB image
    elif img.ndim == 3:
        gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
        mask = gray_img > tol
        check_shape = img[:, :, 0][np.ix_(mask.any(1), mask.any(0))].shape[0]
        if check_shape == 0:  # image is so dark that we would crop out everything
            return img  # return original image
        else:
            img1 = img[:, :, 0][np.ix_(mask.any(1), mask.any(0))]
            img2 = img[:, :, 1][np.ix_(mask.any(1), mask.any(0))]
            img3 = img[:, :, 2][np.ix_(mask.any(1), mask.any(0))]
            img = np.stack([img1, img2, img3], axis=-1)
        return img
def preprocess_image(image, sigmaX=10):
    """
    The whole preprocessing pipeline:
    1. Convert the image from BGR to RGB
    2. Apply masks
    3. Resize image to desired size
    4. Blend with a Gaussian-blurred copy (cv2.addWeighted) to increase robustness
    :param image: A NumPy array that will be cropped
    :param sigmaX: Standard deviation passed to GaussianBlur
    :return: A NumPy array containing the preprocessed image
    """
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = crop_image_from_gray(image)
    image = cv2.resize(image, (IMG_WIDTH, IMG_HEIGHT))
    image = cv2.addWeighted(image, 4, cv2.GaussianBlur(image, (0, 0), sigmaX), -4, 128)
    return image
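One possible reason a plain .map() fails here is that preprocess_image works on NumPy arrays, while .map() traces its function with symbolic tensors, and image_dataset_from_directory yields batches rather than single images. A hedged sketch of one way around both, using tf.numpy_function: preprocess_np is a hypothetical NumPy-only stand-in for the OpenCV routine, and the synthetic batched dataset stands in for the output of image_dataset_from_directory.

```python
import numpy as np
import tensorflow as tf

IMG_HEIGHT, IMG_WIDTH = 8, 8  # illustrative size only


def preprocess_np(image):
    """Hypothetical stand-in for a NumPy/OpenCV routine: per-image standardization."""
    image = image.astype(np.float32)
    return ((image - image.mean()) / (image.std() + 1e-6)).astype(np.float32)


def tf_preprocess(image, label):
    # numpy_function runs the Python function eagerly on a NumPy array.
    image = tf.numpy_function(preprocess_np, [image], tf.float32)
    # The static shape is lost inside numpy_function, so restore it.
    image.set_shape([IMG_HEIGHT, IMG_WIDTH, 3])
    return image, label


# Synthetic stand-in for a batched dataset of (images, labels).
batched = tf.data.Dataset.from_tensor_slices(
    (
        tf.random.uniform([6, IMG_HEIGHT, IMG_WIDTH, 3], maxval=255.0),
        tf.zeros([6], tf.int32),
    )
).batch(3)

# Unbatch so the function sees one image at a time, then batch again.
processed = batched.unbatch().map(tf_preprocess).batch(3)
first_images, first_labels = next(iter(processed))
```

Code wrapped this way runs in Python rather than in the TensorFlow graph, so it is slower and cannot be exported with the model; that trade-off is an assumption to weigh, not a measured result.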

Rotating image and its key points label in tensorflow2.0

I am trying to add rotation to my dataset of images whose labels contain facial keypoints. tf.contrib was removed in TensorFlow 2.0, and other libraries such as PIL do not work because I am using tf.data.Dataset.
The rotation angle needs to be random, and the same rotation must be applied to both the image and its keypoint labels. Is there a way to do this in TensorFlow 2.0?
Below is the function I used:
def preprocess_data(image, angle):
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [input_size, input_size])
    image = tf.image.rgb_to_grayscale(image)
    image = Image.fromarray(np.array(tf.squeeze(image)))
    rotated = Image.Image.rotate(image, angle)
    image = tf.convert_to_tensor(np.array(rotated))
    image = tf.expand_dims(image, -1)
    return image
def load_and_preprocess_data(path):
    image = tf.io.read_file(path)
    rotation = tf.random.uniform([1,1], minval=-60, maxval=60, seed=0)
    return preprocess_data(image, rotation)
Here I used PIL, but it does not work when I map a tf.data.Dataset of image paths through the load_and_preprocess_data function.
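The geometric core of the problem is applying one rotation to both the pixels and the keypoint coordinates. Below is a minimal NumPy sketch of that step, assuming keypoints are stored as (x, y) pixel coordinates and that nearest-neighbour sampling is acceptable; rotate_image_and_keypoints is a hypothetical helper, not a TensorFlow API. A function like this could be called from a tf.data pipeline through tf.numpy_function.

```python
import numpy as np


def rotate_image_and_keypoints(image, keypoints, angle_deg):
    """Rotate an HxW(xC) image and its (x, y) keypoints about the image centre.

    Positive angles rotate counter-clockwise as displayed (y axis pointing down).
    """
    h, w = image.shape[:2]
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)

    # Keypoints: forward rotation around the centre.
    dx, dy = keypoints[:, 0] - cx, keypoints[:, 1] - cy
    new_keypoints = np.stack(
        [cx + cos_t * dx + sin_t * dy, cy - sin_t * dx + cos_t * dy], axis=1
    )

    # Image: for every output pixel, look up the source pixel (inverse rotation).
    ys, xs = np.mgrid[0:h, 0:w]
    ox, oy = xs - cx, ys - cy
    src_x = np.rint(cx + cos_t * ox - sin_t * oy).astype(int)
    src_y = np.rint(cy + sin_t * ox + cos_t * oy).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    rotated = np.zeros_like(image)  # pixels that fall outside stay zero
    rotated[valid] = image[src_y[valid], src_x[valid]]
    return rotated, new_keypoints


# Tiny example: one bright pixel at (x=4, y=2) with a keypoint on top of it.
image = np.zeros((5, 5), dtype=np.uint8)
image[2, 4] = 255
keypoints = np.array([[4.0, 2.0]])

rng = np.random.default_rng(0)
angle = rng.uniform(-60, 60)  # one random angle shared by image and keypoints
rotated, rotated_keypoints = rotate_image_and_keypoints(image, keypoints, angle)
```

Drawing the angle once and passing it to a single function keeps the image and its labels consistent by construction.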

Read different size and format images to form a queue in Tensorflow

I have a problem with TensorFlow. I want to read some BMP and JPEG images into a queue, and these images have different sizes.
The input is a list of image paths and a list of labels.
Currently I use "tf.train.slice_input_producer" (to generate the queue), "tf.image.decode_image" (to read images of different formats), and "tf.image.resize_images" (to resize the images to the same size).
However, this causes some problems. "tf.image.resize_images" needs the image shape, but "tf.image.decode_image" does not provide one. If I set a fixed image shape manually, reading images of a different size raises an error.
Is there a better way to read images of different sizes and formats in TensorFlow?
images = tf.convert_to_tensor(image_list)
labels = tf.convert_to_tensor(label_list)
# slice_input_producer shuffles the data by default
input_queue = tf.train.slice_input_producer([images, labels])
# Decode
image = tf.read_file(input_queue[0])
image = tf.image.decode_image(image, channels=3)  # for different formats
label = input_queue[1]
# If I don't set the shape, "tf.image.resize_images" cannot work; if I set it, it is fixed...
image.set_shape([640, 480, 3])
image = tf.image.resize_images(image, [160, 120])
image_batch, label_batch = tf.train.batch([image, label], batch_size=batch_size)
return image_batch, label_batch
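In TensorFlow 2.x the same pipeline can be written with tf.data instead of queues. The sketch below assumes a TF version in which tf.io.decode_image accepts expand_animations; with that flag set to False the decoder always returns a 3-D tensor, so the resize step no longer needs a manually fixed shape. The in-memory JPEG and PNG bytes are stand-ins for files read with tf.io.read_file.

```python
import tensorflow as tf

TARGET_SIZE = [160, 120]  # [height, width], as in the question


def load_image(encoded, label):
    # expand_animations=False gives a rank-3 result for every supported
    # format (BMP, GIF, JPEG, PNG), with unknown height and width.
    image = tf.io.decode_image(encoded, channels=3, expand_animations=False)
    image = tf.image.resize(image, TARGET_SIZE)
    return image, label


# Stand-in data: two encoded images of different sizes and formats.
jpeg_bytes = tf.io.encode_jpeg(tf.zeros([640, 480, 3], tf.uint8))
png_bytes = tf.io.encode_png(tf.zeros([300, 200, 3], tf.uint8))
encoded_images = tf.stack([jpeg_bytes, png_bytes])
labels = tf.constant([0, 1])

dataset = (
    tf.data.Dataset.from_tensor_slices((encoded_images, labels))
    .shuffle(2)       # plays the role of the shuffling input producer
    .map(load_image)
    .batch(2)         # plays the role of tf.train.batch
)
image_batch, label_batch = next(iter(dataset))
```

Because every image is resized before batching, images of any original size end up in one dense batch.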

How to use feed_dict in slim.learning.train of tensorflow

I read the example in tf-slim-mnist and one or two answers found through Google, but all of them feed data to an 'images' tensor and a 'labels' tensor from an already filled tensor of data. For example, in tf-slim-mnist:
# load batch of dataset
images, labels = load_batch(
    dataset,
    FLAGS.batch_size,
    is_training=True)
def load_batch(dataset, batch_size=32, height=28, width=28, is_training=False):
    data_provider = slim.dataset_data_provider.DatasetDataProvider(dataset)
    image, label = data_provider.get(['image', 'label'])
    image = lenet_preprocessing.preprocess_image(
        image,
        height,
        width,
        is_training)
    images, labels = tf.train.batch(
        [image, label],
        batch_size=batch_size,
        allow_smaller_final_batch=True)
    return images, labels
Another example, from TensorFlow GitHub issue #5987:
graph = tf.Graph()
with graph.as_default():
    image, label = input('train', FLAGS.dataset_dir)
    images, labels = tf.train.shuffle_batch([image, label], batch_size=FLAGS.batch_size, capacity=1000 + 3 * FLAGS.batch_size, min_after_dequeue=1000)
    images_validation, labels_validation = inputs('validation', FLAGS.dataset_dir, 5000)
    images_test, labels_test = inputs('test', FLAGS.dataset_dir, 10000)
Because my data has variable size, it is hard to fill a tensor with it beforehand.
Is there any way to use feed_dict with slim.learning.train()? Is adding feed_dict as an argument to train_step_fn() the proper way to do it? If yes, how? Thanks.
I think feed_dict is not a good fit when the input size varies and the data is hard to hold in memory.
Converting your data to TFRecords is a more suitable approach. Here is an example of converting data. You can then read the output file with TFRecordReader and parse it with parse_example.
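A minimal sketch of the TFRecord round trip for variable-size samples follows. It uses the TensorFlow 2.x tf.data.TFRecordDataset and tf.io.parse_single_example rather than the TFRecordReader named above; the feature names "values" and "label" and the toy samples are hypothetical.

```python
import os
import tempfile

import tensorflow as tf


def make_example(values, label):
    """Serialize one variable-length float sequence and its integer label."""
    feature = {
        "values": tf.train.Feature(float_list=tf.train.FloatList(value=values)),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


def parse_example(serialized):
    spec = {
        # VarLenFeature lets each record hold a different number of values.
        "values": tf.io.VarLenFeature(tf.float32),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    return tf.sparse.to_dense(parsed["values"]), parsed["label"]


# Two toy samples of different lengths.
samples = [([1.0, 2.0], 0), ([3.0, 4.0, 5.0, 6.0], 1)]
path = os.path.join(tempfile.mkdtemp(), "samples.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    for values, label in samples:
        writer.write(make_example(values, label))

dataset = tf.data.TFRecordDataset(path).map(parse_example)
decoded = [(v.numpy().tolist(), int(l)) for v, l in dataset]
```

Records are streamed from disk one at a time, so the full dataset never has to fit in a single pre-filled tensor.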

Must the input of `tf.image.resize_images` have a static shape?

When I run the code below, it raises a ValueError: 'images' contains no shape. So I have to uncomment the set_shape line to fix the static shape, but img_raw may have different shapes, and that line makes tf.image.resize_images pointless.
I just want to convert images of different shapes to [227,227,3]. How should I do that?
def tf_read(file_queue):
    reader = tf.WholeFileReader()
    file_name, content = reader.read(file_queue)
    img_raw = tf.image.decode_image(content, 3)
    # img_raw.set_shape([227,227,3])
    img_resized = tf.image.resize_images(img_raw, [227, 227])
    img_shape = tf.shape(img_resized)
    return file_name, img_resized, img_shape
The issue here actually comes from the fact that tf.image.decode_image doesn't return the shape of the image. This was explained in these two GitHub issues: issue1, issue2.
The problem comes from the fact that tf.image.decode_image also handles .gif, which returns a 4D tensor, whereas .jpg and .png return 3D images. Therefore, the correct shape cannot be returned.
The solution is to simply use tf.image.decode_jpeg or tf.image.decode_png (both work the same and can be used on .png and .jpg images).
def _decode_image(filename):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    image = tf.cast(image_decoded, tf.float32)
    image_resized = tf.image.resize_images(image, [224, 224])
    return image_resized
No, tf.image.resize_images can handle dynamic shapes:
file_queue = tf.train.string_input_producer(['./dog1.jpg'])
# shape of dog1.jpg is (720, 720)
reader = tf.WholeFileReader()
file_name, content = reader.read(file_queue)
img_raw = tf.image.decode_jpeg(content, 3)  # size (?, ?, 3) <= dynamic h and w
# img_raw.set_shape([227,227,3])
img_resized = tf.image.resize_images(img_raw, [227, 227])
img_shape = tf.shape(img_resized)
with tf.Session() as sess:
    # Queue-based readers block until the queue runners are started
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print(img_shape.eval())  # [227, 227, 3]
    coord.request_stop()
    coord.join(threads)
BTW, I am using tf v0.12, and there is no function called tf.image.decode_image, but I don't think it is important
Of course, you can pass a tensor as the size argument of tf.image.resize_images.
By "turn images with different shapes to [227,227,3]", I suppose you mean that you don't want to lose their aspect ratio, right? To achieve this, rescale the input image first, then pad the rest with zeros.
Note, though, that you should consider performing image distortion and standardization before padding.
# Rescale so that one side of the image fits one side of the box, then pad the rest with zeros.
# target height is 227
# target width is 227
image = a_image_tensor_you_read
shape = tf.shape(image)
img_h = shape[0]
img_w = shape[1]
box_h = tf.convert_to_tensor(target_height)
box_w = tf.convert_to_tensor(target_width)
# Work in float32 so the ratios are not computed with mixed dtypes
img_h_f = tf.cast(img_h, tf.float32)
img_w_f = tf.cast(img_w, tf.float32)
box_h_f = tf.cast(box_h, tf.float32)
box_w_f = tf.cast(box_w, tf.float32)
img_ratio = img_h_f / img_w_f
aim_ratio = box_h_f / box_w_f
# An image that is relatively taller than the box is limited by its height,
# otherwise by its width; the other side is scaled by the same factor.
aim_h, aim_w = tf.cond(tf.greater(img_ratio, aim_ratio),
                       lambda: (box_h,
                                tf.cast(box_h_f / img_h_f * img_w_f, tf.int32)),
                       lambda: (tf.cast(box_w_f / img_w_f * img_h_f, tf.int32),
                                box_w))
image_resize = tf.image.resize_images(image, tf.cast([aim_h, aim_w], tf.int32), align_corners=True)
# Perform image standardization and distortion
# (placeholder: apply your own processing to image_resize here)
image_standardized_distorted = blablabla
image_padded = tf.image.resize_image_with_crop_or_pad(image_standardized_distorted, box_h, box_w)
return image_padded
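In TensorFlow 2.x, the rescale-then-pad idea is also available as a single call, tf.image.resize_with_pad. This is a different code path from the manual version above and does not leave room for a distortion step between resizing and padding. A minimal sketch with a synthetic 100x50 image as a stand-in:

```python
import tensorflow as tf

TARGET_HEIGHT, TARGET_WIDTH = 227, 227

# Stand-in image: 100 rows by 50 columns, all ones, so padding is easy to spot.
image = tf.ones([100, 50, 3], tf.float32)

# Scales the image to fit inside the box without changing its aspect ratio,
# then pads the remaining area with zeros on both sides.
padded = tf.image.resize_with_pad(image, TARGET_HEIGHT, TARGET_WIDTH)
```

Here the image is limited by its height, so the left and right margins are zero padding while the centre columns keep the image content.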