Image Augmentation not altering my images at all - tensorflow

I am trying to implement some image augmentation to my dataset of brain scan MRIs. I'm using the same code from the TensorFlow tutorial page, yet it does not seem to be augmenting my images at all:
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal",
                          input_shape=(img_height,
                                       img_width,
                                       3)),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ]
)
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
    for i in range(9):
        augmented_images = data_augmentation(images)
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")
I have run it several times, but none of the images show any augmentation.

Related

How to split mnist dataset into smaller size and adding augmentation to it?

I am having trouble splitting the MNIST dataset and adding augmented data. I want to use only 22,000 samples in total (training + test set) out of the 70,000 in MNIST. The dataset has 10 labels. I'm only using shear, rotation, width-shift, and height-shift as augmentation methods.
training set --> 20000(total) --> 20 images + 1980 augmentation images(per label)
test set --> 2000(total) --> 200 images(per label)
I also want to make sure that the class distribution is preserved in the split.
I'm really confused about how to split the data and would be glad if anyone could provide the code.
I have tried this code:
# Load the MNIST dataset
(x_train_full, y_train_full), (x_test_full, y_test_full) = keras.datasets.mnist.load_data()
# Normalize the data
x_train_full = x_train_full / 255.0
x_test_full = x_test_full / 255.0
# Create a data generator for data augmentation
data_gen = ImageDataGenerator(shear_range=0.2, rotation_range=20,
width_shift_range=0.2, height_shift_range=0.2)
# Initialize empty lists for the training and test sets
x_train, y_train, x_test, y_test = [], [], [], []
# Loop through each class/label
for class_n in range(10):
    # Get the indices of the images for this class
    class_indices = np.where(y_train_full == class_n)[0]
    # Select 20 images for training
    train_indices = np.random.choice(class_indices, 20, replace=False)
    # Append the training images and labels to the respective lists
    x_train.append(x_train_full[train_indices])
    y_train.append(y_train_full[train_indices])
    # Select 200 images for test
    test_indices = np.random.choice(class_indices, 200, replace=False)
    # Append the test images and labels to the respective lists
    x_test.append(x_test_full[test_indices])
    y_test.append(y_test_full[test_indices])
    # Generate 100 augmented images for training
    x_augmented = data_gen.flow(x_train_full[train_indices], y_train_full[train_indices], batch_size=100)
    # Append the augmented images and labels to the respective lists
    x_train.append(x_augmented[0])
    y_train.append(x_augmented[1])
# Concatenate the list of images and labels to form the final training and test sets
x_train = np.concatenate(x_train)
y_train = np.concatenate(y_train)
x_test = np.concatenate(x_test)
y_test = np.concatenate(y_test)
print("training set shape: ", x_train.shape)
print("training label shape: ", y_train.shape)
print("test set shape: ", x_test.shape)
print("test label shape: ", y_test.shape)
but it keeps raising this error:
IndexError: index 15753 is out of bounds for axis 0 with size 10000
You are mixing the train and test set. In the loop, you are getting the class_indices from the train set:
# Get the indices of the images for this class
class_indices = np.where(y_train_full == class_n)[0]
but then you are using these train indices (which may be larger than 10000!) to index into the test set (which has only 10000 samples) a few lines further down:
# Select 200 images for test
test_indices = np.random.choice(class_indices, 200, replace=False)
So you need to do the same index selection against the test labels (y_test_full) inside the loop for the test set, and it should work.
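As a minimal sketch of that fix, the per-class selection can be run separately against the train labels and the test labels. The random label arrays below are stand-ins for y_train_full and y_test_full (an assumption so the snippet runs without downloading MNIST), and the variable names train_class_indices and test_class_indices are introduced here for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for y_train_full (60000 labels) and y_test_full (10000 labels).
y_train_full = rng.integers(0, 10, size=60000)
y_test_full = rng.integers(0, 10, size=10000)

train_idx, test_idx = [], []
for class_n in range(10):
    # Indices into the TRAIN arrays for this class.
    train_class_indices = np.where(y_train_full == class_n)[0]
    train_idx.append(rng.choice(train_class_indices, 20, replace=False))

    # Indices into the TEST arrays for this class (always below 10000).
    test_class_indices = np.where(y_test_full == class_n)[0]
    test_idx.append(rng.choice(test_class_indices, 200, replace=False))

train_idx = np.concatenate(train_idx)
test_idx = np.concatenate(test_idx)

print(train_idx.shape, test_idx.shape)
```

Because every class contributes the same number of indices, the class distribution is balanced in both subsets; x_train_full[train_idx] and x_test_full[test_idx] would then select the matching images.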

Data augmentation layer doesn't change the input picture

I am trying to apply data augmentation to increase the amount of training data.
The code is shown below. The augmentation layer consists of RandomFlip and RandomRotation.
def data_augmenter():
    '''
    Create a Sequential model composed of 2 layers
    Returns:
        tf.keras.Sequential
    '''
    ### START CODE HERE
    data_augmentation = tf.keras.Sequential()
    data_augmentation.add((RandomFlip('horizontal')))
    data_augmentation.add(RandomRotation(0.2))
    ### END CODE HERE
    return data_augmentation
data_augmentation = data_augmenter()
for image, _ in train_dataset.take(1):
    plt.figure(figsize=(10, 10))
    first_image = image[0]
    plt.imshow(first_image / 255)
    plt.figure(figsize=(10, 10))
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        augmented_image = data_augmentation(tf.cast(tf.expand_dims(first_image, 0), tf.float32))
        plt.imshow(augmented_image[0] / 255)
        plt.axis('off')
Output Images
I had the same issue on my Apple Silicon MacBook Pro. To make it work, I passed training=True when calling the augmentation layer.
See the image attached as an example.
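A minimal sketch of what that looks like, assuming TensorFlow 2.x with the Keras random preprocessing layers available under tf.keras.layers. The random input image is a stand-in for a real batch. Whether a call without the training argument augments seems to depend on the TensorFlow/Keras version and platform, so the sketch only contrasts the two explicit settings:

```python
import numpy as np
import tensorflow as tf

# Same two layers as in the question.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.2),
])

# Stand-in batch: one 64x64 RGB image with values in [0, 255].
rng = np.random.default_rng(0)
image = tf.constant(rng.uniform(0, 255, size=(1, 64, 64, 3)).astype("float32"))

# training=False: the random layers act as pass-through (inference behaviour).
passthrough = data_augmentation(image, training=False)

# training=True: the random flip and rotation are actually applied.
augmented = data_augmentation(image, training=True)

print(np.allclose(passthrough.numpy(), image.numpy()))
print(augmented.shape)
```

In the plotting loop from the question, this corresponds to adding training=True to the data_augmentation(...) call.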

Get a sample of one image per class with image_dataset_from_directory

I am trying to visualize Skin Cancer Images using Keras. I have imported the images in my notebook and have created batch datasets using Keras.image_dataset_from_directory. The code is as follows:
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size)
Now, I have been trying to visualize the images. However, I want one image from each class (there are 9 classes in the dataset). I have used the below code:
plt.figure(figsize = (10,10))
for images, labels in train_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3,3,i+1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
This code gets me a lot of duplicate classes. How do I get one value for each class (in this case I have 9 classes. I want one plot for each of those 9 classes). I am not sure how to fetch unique images and their labels from a BatchDataset!
for i in range(len(class_names)):
    filtered_ds = train_ds.filter(lambda x, l: tf.math.equal(l[0], i))
    for image, label in filtered_ds.take(1):
        ax = plt.subplot(3, 3, i+1)
        plt.imshow(image[0].numpy().astype('uint8'))
        plt.title(class_names[label.numpy()[0]])
        plt.axis('off')
You could loop through and filter on each label.
Example:
import tensorflow as tf
# fake images
imgs = tf.random.normal([100, 64, 64, 3])
# fake labels
labels = tf.random.uniform([100], minval=0, maxval=10, dtype=tf.int32)
# make dataset
ds = tf.data.Dataset.from_tensor_slices((imgs, labels))
for i in range(9):
    filtered = ds.filter(lambda _, l: tf.math.equal(l, i))
    for img, label in filtered.take(1):
        assert label.numpy() == i
        # plot image
Try the following code. It displays one image from each of the 10 categories of CIFAR-10:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
(x_train, y_train), (x_test, y_test)= keras.datasets.cifar10.load_data()
fig, ax= plt.subplots(nrows= 2, ncols= 5, figsize= (18,5))
plt.suptitle('displaying one image of each category in train set'.upper(),
             y= 1.05, fontsize= 16)
i= 0
for j in range(2):
    for k in range(5):
        ax[j,k].imshow(x_train[list(y_train).index(i)])
        ax[j,k].axis('off')
        ax[j,k].set_title(i)
        i+=1
plt.tight_layout()
plt.show()
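When the labels are already in memory as a NumPy array, as in the CIFAR-10 example, another option is np.unique with return_index=True, which returns the position of the first sample of every class in a single call. This is a sketch on made-up stand-in arrays, not part of the original answers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 500 tiny "images" with integer labels from 9 classes.
labels = rng.integers(0, 9, size=500)
images = rng.random((500, 8, 8, 3))

# classes: sorted unique label values; first_idx: index of each one's first occurrence.
classes, first_idx = np.unique(labels, return_index=True)

# One image per class, ordered by class id.
one_per_class = images[first_idx]

print(classes)
print(one_per_class.shape)
```

For labels shaped (N, 1), as returned by cifar10.load_data(), flatten them first with labels.ravel().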

MNIST GAN generators white area in middle surrounded by black

The following code is copied from a GAN MNIST tutorial on UDEMY. When I run the code, it converges towards creating images with a large white area in the center that is black at the sides (picture an empty filled circle against a black background). I have no idea what the problem is as I have only done what the tutorial told me to do word for word. The only difference is that I extract the MNIST data differently. Is there something about tensorflow that has changed recently?
import tensorflow as tf
import numpy as np
import gzip
from PIL import Image
import os.path
def extract_data(filename, num_images):
    """Extract the images into a 4D tensor [image index, y, x, channels].
    Values are rescaled from [0, 255] down to [-0.5, 0.5].
    """
    print('Extracting', filename)
    with gzip.open(filename) as bytestream:
        bytestream.read(16)
        buf = bytestream.read(28 * 28 * num_images)
        data = np.frombuffer(buf, dtype=np.uint8).astype(np.float32)
        #data = (data - (PIXEL_DEPTH / 2.0)) / PIXEL_DEPTH
        data = data.reshape(num_images, 28, 28, 1)
        return data
fname_img_train = extract_data('../Data/MNIST/train-images-idx3-ubyte.gz', 60000)
def generator(z, reuse=None):
    with tf.variable_scope('gen',reuse=reuse):
        hidden1 = tf.layers.dense(inputs=z,units=128)
        alpha = 0.01
        hidden1=tf.maximum(alpha*hidden1,hidden1)
        hidden2=tf.layers.dense(inputs=hidden1,units=128)
        hidden2 = tf.maximum(alpha*hidden2,hidden2)
        output=tf.layers.dense(hidden2,units=784, activation=tf.nn.tanh)
        return output
def discriminator(X, reuse=None):
    with tf.variable_scope('dis',reuse=reuse):
        hidden1=tf.layers.dense(inputs=X,units=128)
        alpha=0.01
        hidden1=tf.maximum(alpha*hidden1,hidden1)
        hidden2=tf.layers.dense(inputs=hidden1,units=128)
        hidden2=tf.maximum(alpha*hidden2,hidden2)
        logits=tf.layers.dense(hidden2,units=1)
        output=tf.sigmoid(logits)
        return output, logits
real_images=tf.placeholder(tf.float32,shape=[None,784])
z=tf.placeholder(tf.float32,shape=[None,100])
G = generator(z)
D_output_real, D_logits_real = discriminator(real_images)
D_output_fake, D_logits_fake = discriminator(G,reuse=True)
def loss_func(logits_in,labels_in):
    return tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        logits=logits_in,labels=labels_in))
D_real_loss = loss_func(D_logits_real,tf.ones_like(D_logits_real)*0.9)
D_fake_loss = loss_func(D_logits_fake,tf.zeros_like(D_logits_real))
D_loss = D_real_loss + D_fake_loss
G_loss = loss_func(D_logits_fake,tf.ones_like(D_logits_fake))
learning_rate = 0.001
tvars = tf.trainable_variables()
d_vars= [var for var in tvars if 'dis' in var.name]
g_vars = [var for var in tvars if 'gen' in var.name]
D_trainer = tf.train.AdamOptimizer(learning_rate).minimize(D_loss,var_list=d_vars)
G_trainer = tf.train.AdamOptimizer(learning_rate).minimize(G_loss,var_list=g_vars)
batch_size=100
epochs=30
set_size=60000
init = tf.global_variables_initializer()
samples=[]
def create_image(img, name):
    img = np.reshape(img, (28, 28))
    print("before")
    print(img)
    img = (np.multiply(np.divide(np.add(img, 1.0), 2.0),255.0).astype(np.int16))
    print("after")
    print(img)
    im = Image.fromarray(img.astype('uint8'))
    im.save(name)
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(epochs):
        np.random.shuffle(fname_img_train)
        num_batches=int(set_size/batch_size)
        for i in range(num_batches):
            batch = fname_img_train[i*batch_size:((i+1)*batch_size)]
            batch_images = np.reshape(batch, (batch_size,784))
            batch_images = batch_images*2.0-1.0
            batch_z = np.random.uniform(-1,1,size=(batch_size,100))
            _ = sess.run(D_trainer, feed_dict={real_images:batch_images,z:batch_z})
            _ = sess.run(G_trainer,feed_dict={z:batch_z})
        print("ON EPOCH {}".format(epoch))
        sample_z = np.random.uniform(-1,1,size=(batch_size,100))
        gen_sample = sess.run(G,feed_dict={z:sample_z})
        create_image(gen_sample[0], "img"+str(epoch)+".png")
As far as I can see, you are not normalizing the training data. Instead of using your extract_data() function, it is much easier to do the following:
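from tensorflow.keras.datasets.mnist import load_data
(train_data, train_labels), _ = load_data()
# load_data() returns uint8 arrays, so cast to float before scaling to [0, 1]
train_data = train_data.astype("float32") / 255.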
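To see why the missing normalization matters: the training loop in the question maps each batch with batch_images*2.0-1.0, which only lands in the generator's tanh range of [-1, 1] if the pixels are already in [0, 1]. A small NumPy sketch with made-up pixel values (the array raw is a stand-in for real MNIST data):

```python
import numpy as np

# Stand-in for a few raw MNIST pixels (uint8, 0..255).
raw = np.array([[0, 64, 128, 255]], dtype=np.uint8)

# Normalize to [0, 1], then shift to [-1, 1] to match the tanh output range.
unit = raw.astype(np.float32) / 255.0
scaled = unit * 2.0 - 1.0

# Inverse mapping, as used when saving a generated sample as an 8-bit image.
restored = np.round((scaled + 1.0) / 2.0 * 255.0).astype(np.uint8)

# Without the /255 step, the same formula leaves values far outside [-1, 1].
unscaled = raw.astype(np.float32) * 2.0 - 1.0

print(scaled.min(), scaled.max())
print(unscaled.min(), unscaled.max())
```

With unnormalized input the discriminator sees real images in a very different value range from anything the tanh generator can produce, which makes the two trivially separable.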
Besides, people usually sample twice from the latent space in each training step: once for the discriminator and once for the generator. Still, it did not seem to make a difference.
After implementing these changes, using a batch size of 200 and training for 100 epochs, I got the following result: gen_sample. The result is pretty bad, but it is definitely better than an "empty filled circle against a black background".
Note that the generator and discriminator architectures you are using are very simple. In my experience, stacking some convolutional layers gives much better results. In addition, I would not use the tf.maximum() function, since it creates a kink at zero (a discontinuity in the gradient) that may negatively impact the flow of the gradients.
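For context, the tf.maximum(alpha*h, h) expression in the question computes a leaky ReLU whenever 0 < alpha < 1. A minimal NumPy sketch of that equivalence, using made-up sample values:

```python
import numpy as np

alpha = 0.01
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# Form used in the question: elementwise max of the scaled and unscaled input.
via_maximum = np.maximum(alpha * x, x)

# Piecewise definition of leaky ReLU: x for positive inputs, alpha * x otherwise.
via_where = np.where(x > 0, x, alpha * x)

print(via_maximum)
print(via_where)
```

tf.nn.leaky_relu(h, alpha=0.01) should produce the same values, so using it instead is mostly a readability choice.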
Finally, instead of your create_image() function, I used the following:
import matplotlib.pyplot as plt
from matplotlib import gridspec

def plot_mnist(samples, name):
    # 6x6 grid, so this plots at most 36 samples
    fig = plt.figure(figsize=(6,6))
    gs = gridspec.GridSpec(6,6)
    gs.update(wspace=0.05, hspace=0.05)
    for i, sample in enumerate(samples):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(sample.reshape(28,28), cmap='Greys_r')
    plt.savefig('{}.png'.format(name))
    plt.close()
There are many different ways of improving the quality of a GAN model, and the majority of those techniques can be easily found online. Please let me know if you have any specific question.

Resize MNIST in Tensorflow

I have been working on MNIST dataset to learn how to use Tensorflow and Python for my deep learning course.
I want to resize the MNIST images to 22 x 22 using TensorFlow and then train on them, but I do not know how to do it.
Could you help me?
TheRevanchist's answer is correct. However, for the mnist dataset, you first need to reshape the mnist array before you send it to tf.image.resize_images():
import tensorflow as tf
import numpy as np
import cv2
mnist = tf.contrib.learn.datasets.load_dataset("mnist")
batch = mnist.train.next_batch(10)
X_batch = batch[0]
batch_tensor = tf.reshape(X_batch, [10, 28, 28, 1])
resized_images = tf.image.resize_images(batch_tensor, [22,22])
The code above takes a batch of 10 MNIST images and resizes them from 28x28 to 22x22 TensorFlow image tensors.
If you want to display the images, you can use opencv and the code below. The resized_images.eval() converts the tensorflow image to a numpy array!
with tf.Session() as sess:
    numpy_imgs = resized_images.eval(session=sess) # mnist images converted to numpy array
    for i in range(10):
        cv2.namedWindow('Resized image #%d' % i, cv2.WINDOW_NORMAL)
        cv2.imshow('Resized image #%d' % i, numpy_imgs[i])
        cv2.waitKey(0)
Did you try tf.image.resize_images?
The method:
resize_images(images, size, method=ResizeMethod.BILINEAR,
align_corners=False)
where images is a batch of images, and size is a vector tensor which determines the new height and width. You can look at the full documentation here: https://www.tensorflow.org/api_docs/python/tf/image/resize_images
Updated: TensorFlow 2.4.1
Short Answer
Use tf.image.resize (instead of resize_images). The link others provided no longer exists. Updated link.
Long Answer
MNIST in tf.keras.datasets.mnist has the following shape:
(batch_size, 28 , 28)
Here is the full implementation. Please read the comments in the code.
import numpy as np
import tensorflow as tf
(x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()
# expand new axis, channel axis
x_train = np.expand_dims(x_train, axis=-1)
# [optional]: we may need 3 channel (instead of 1)
x_train = np.repeat(x_train, 3, axis=-1)
# it's always better to normalize
x_train = x_train.astype('float32') / 255
# resize the input shape , i.e. old shape: 28, new shape: 32
x_train = tf.image.resize(x_train, [32,32]) # if we want to resize
print(x_train.shape)
# (60000, 32, 32, 3)
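Applied to the 22x22 target from the question, the same call could look like the sketch below. It assumes TensorFlow 2.x and uses a small random uint8 array as a stand-in for MNIST so that nothing needs to be downloaded:

```python
import numpy as np
import tensorflow as tf

# Stand-in for a few MNIST digits: uint8 array of shape (batch, 28, 28).
x = np.random.default_rng(0).integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# tf.image.resize expects a channel axis, so add one: (batch, 28, 28, 1).
x = x[..., np.newaxis].astype("float32") / 255.0

# Bilinear interpolation is the default resize method.
x_small = tf.image.resize(x, [22, 22])

print(x_small.shape)
```

The result is a float32 tensor; call x_small.numpy() if a NumPy array is needed downstream.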
You can use OpenCV's cv2.resize() function.
Use a for loop to iterate over every image,
and inside the loop call cv2.resize(source_image, (22, 22)) on each one:
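import cv2

def resize(mnist):
    train_data = []
    for img in mnist.train._images:
        # images may be stored as flat 784-vectors, so restore 28x28 before resizing
        resized_img = cv2.resize(img.reshape(28, 28), (22, 22))
        train_data.append(resized_img)
    return train_data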