How to make sure the prediction time image input is in the same range as the training time image input? - tensorflow

This question is about ensuring that images at prediction time are in the same range as the images fed to the model during training. I know the usual practice is to repeat the training-time preprocessing steps at prediction time. But in my case I apply the random_transform() function inside a custom data generator during training, and it wouldn't make sense to apply that at prediction time.
import cv2
import matplotlib.pyplot as plt  # needed for the plt.imshow() calls below
import tensorflow as tf
import seaborn as sns
To simplify my problem, assume I'm doing the following changes to a grayscale image that I read in a custom data generator.
img_1 is an output of the data generator, that is supposed to be the input to a VGG19 model.
# using a simple augmenter
augmenter = tf.keras.preprocessing.image.ImageDataGenerator(
    brightness_range=(0.75, 1.25),
    preprocessing_function=tf.keras.applications.vgg19.preprocess_input  # preprocessing function of VGG19
)
# read the image
img = cv2.imread('sphx_glr_plot_camera_001.png')
# add a random transform
img_1 = augmenter.random_transform(img)/255
The above random_transform() call produces the following grayscale value distribution (within [0, 1]):
plt.imshow(img_1); plt.show();
sns.histplot(img_1[:, :, 0].ravel()); # select the 0th layer and ravel because the augmenter stacks 3 layers of the grayscale image to make it an RGB image
Now I want to do the same at prediction time, but without a random transform, so I just pass the input image through the preprocessing function.
# read image
img = cv2.imread('sphx_glr_plot_camera_001.png')
# pass through the preprocessing function
img_2 = tf.keras.applications.vgg19.preprocess_input(img)/255
But I'm unable to get the input into the [0, 1] range that was used during training.
plt.imshow(img_2); plt.show();
sns.histplot(img_2[:, :, 0].ravel());
This makes the predictions completely incorrect. How can I make sure that the inputs to the model at the prediction time undergo the same steps so that they end up having a similar distribution to the inputs that were fed during training? I don't want to add a random_transform() at the prediction time as well.
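A likely reason for the mismatch can be sketched with NumPy alone. For VGG19, preprocess_input (in its default "caffe" mode) flips RGB to BGR and subtracts the ImageNet per-channel means without any scaling, so its output is centred around zero instead of spanning [0, 255]. Dividing that by 255 cannot give [0, 1]. The function caffe_style_preprocess below is a hypothetical stand-in that only mimics this behaviour, and the synthetic image is an assumption made for illustration.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by Caffe-style VGG preprocessing.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb_image):
    """Hypothetical stand-in: RGB -> BGR, then subtract the channel means."""
    bgr = rgb_image[..., ::-1].astype(np.float32)
    return bgr - IMAGENET_BGR_MEAN

# A synthetic "grayscale stacked to 3 channels" image covering 0..255.
gray = np.arange(256, dtype=np.float32).reshape(16, 16)
img = np.stack([gray, gray, gray], axis=-1)

plain = img / 255.0                                  # stays in [0, 1]
preprocessed = caffe_style_preprocess(img) / 255.0   # shifted below zero

print(plain.min(), plain.max())
print(preprocessed.min(), preprocessed.max())
```

If the training generator only ever called random_transform() and divided by 255, then feeding img / 255 at prediction time (without preprocess_input) would reproduce that range. That is an assumption about the generator, not something the question confirms.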

I recommend adding per-image standardization to your model. This ensures that each image has a mean of 0 and a standard deviation of 1, both in your training set and at inference.
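A minimal NumPy sketch of per-image standardization, assuming it is applied to every image in both training and prediction. The helper name per_image_standardize and the random test image are illustrative only; TensorFlow provides tf.image.per_image_standardization for the same purpose.

```python
import numpy as np

def per_image_standardize(image, eps=1e-8):
    """Shift and scale one image to zero mean and unit standard deviation."""
    image = image.astype(np.float64)
    return (image - image.mean()) / max(image.std(), eps)

rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(32, 32, 3)).astype(np.float64)

a = per_image_standardize(raw)          # input in [0, 255]
b = per_image_standardize(raw / 255.0)  # same image rescaled to [0, 1]

print(a.mean(), a.std())
print(np.allclose(a, b))
```

Because the result does not depend on a global rescaling of the input, the model sees the same values whether the image arrives in [0, 255] or in [0, 1].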

Related

tensor slicing in tensorflow

I want to reproduce the following numpy operation in a custom layer:
img=cv2.imread('img.jpg') # img.shape => (600,600,3)
mask=np.random.randint(0,2,size=img.shape[:2],dtype='bool')
img2=np.expand_dims(img,axis=0) # img2.shape => (1,600,600,3)
img2[:,mask,:].shape # => (1, 204030, 3)
This is my first attempt, but it failed. I can't do the same operation on TensorFlow tensors.
class Sampling_layer(keras.layers.Layer):
    def __init__(self,sampling_matrix):
        super(Sampling_layer,self).__init__()
        self.sampling_matrix=sampling_matrix
    def call(self,input_img):
        return input_img[:,self.sampling_matrix,:]
More Explanations:
I want to define a Keras layer that, given a batch of images, uses a sampling matrix to produce a batch of sampled vectors for those images. The sampling matrix is a random boolean matrix with the same height and width as the image. The slicing operation I used is straightforward for numpy arrays and works perfectly, but I can't get it to work with TensorFlow tensors. I tried to perform the operation manually with loops, but that failed too.
You can do the following.
import numpy as np
import tensorflow as tf
# Batch of images
img=np.random.normal(size=[2,600,600,3]) # img.shape => (2,600,600,3)
# You'll need to match the first 3 dimensions of mask with the img
# for that we'll repeat the mask twice along a new first (batch) axis
mask=np.random.randint(0,2,size=img.shape[1:3],dtype='bool')
mask = np.repeat(np.expand_dims(mask, axis=0), 2, axis=0)
# Defining input layers
inp1 = tf.keras.layers.Input(shape=(600,600,3))
mask_inp = tf.keras.layers.Input(shape=(600,600), dtype='bool')  # boolean mask input
# The layer you're looking for (note: it must be called on mask_inp, not on the numpy mask)
out = tf.keras.layers.Lambda(lambda x: tf.boolean_mask(x[0], x[1]) )([inp1, mask_inp])
model = tf.keras.models.Model([inp1, mask_inp], out)
# Predict on sample data
toy_out = model.predict([img, mask])
Note that both your images and your mask need to have the same batch size. I couldn't find a way to make this work without repeating the mask along the batch axis to match the batch size of the images. This is the only solution that came to my mind (assuming that your mask changes for every batch of data).
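The relationship between the question's indexing and a batch-repeated mask can be checked in plain NumPy. This is a sketch on small made-up shapes: indexing with the repeated mask mirrors what tf.boolean_mask returns (one flat list of selected pixels), and because every image shares the same mask, a reshape recovers the per-image layout from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, height, width, channels = 2, 6, 6, 3
img = rng.normal(size=(batch, height, width, channels))
mask = rng.integers(0, 2, size=(height, width)).astype(bool)

# Original NumPy indexing from the question: keeps the batch axis.
per_image = img[:, mask, :]                      # (batch, n_selected, channels)

# Mask repeated along the batch axis, as in the answer above.
mask_b = np.repeat(mask[np.newaxis], batch, axis=0)
flat = img[mask_b]                               # (batch * n_selected, channels)

# Every image uses the same mask, so the batch axis can be restored.
restored = flat.reshape(batch, -1, channels)

print(per_image.shape, flat.shape, restored.shape)
```

In TensorFlow the same reshape would presumably be done with tf.reshape on the output of tf.boolean_mask; it only works when each image selects the same number of pixels.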

Tensorflow taking input in the same order as input

I am using tensorflow to test my trained model on test images. I am feeding the images to tensorflow as below:
image_ab, image_aba = sess.run(fetches, feed_dict={self.image_a: image_a,
                                                   self.is_train: False})
I printed image_a and image_ab and observed that image_a is not in the same order as the input images I give.
For some reason I want the output to be in the same order as the input images.
Does TensorFlow usually process inputs in the same order as they are given?
I assume you mean image_ab is not in the same order, because image_a is the input that you feed to TensorFlow. If this input is not ordered correctly, the cause is your preprocessing, not TensorFlow.
Tensorflow usually works on batches of data. For images, the convention for batch dimensions is:
[batch, x, y, colors]
The operations that tensorflow performs are parallelized along the batch. If you simply plug convolutional layers together, the order of the batch should be preserved.
However, it is surely possible to reorder things in tensorflow:
import numpy as np
import tensorflow as tf
x = tf.placeholder(shape=(2,1), dtype="float32")
y = tf.concat([x[1], x[0]], axis=0)
sess = tf.Session()
sess.run([x,y], feed_dict={x:np.random.rand(2,1)})
This code will read in x, change the order of its entries and produce y.
So tensorflow can reorder your images. You could search your code for a pattern like the one in my example.
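If a reordering step cannot be removed, one way to get outputs back in input order is to keep track of the permutation and invert it. The sketch below uses NumPy only; the shuffling stage and the multiply-by-ten "model" are assumptions for illustration, since the question does not say what caused the reordering.

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = np.arange(5, dtype=np.float32).reshape(5, 1)   # 5 stand-in "images"

# Suppose some stage (for example a shuffling input queue) permutes the batch.
perm = rng.permutation(len(inputs))
shuffled_inputs = inputs[perm]

# Stand-in for the model: a per-example operation keeps the batch order it receives.
shuffled_outputs = shuffled_inputs * 10.0

# argsort of a permutation gives its inverse, which restores the input order.
inverse = np.argsort(perm)
outputs_in_input_order = shuffled_outputs[inverse]

print(outputs_in_input_order.ravel())
```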

Tensorflow VGG16 (converted from caffe) got low evaluation accuracy

I didn't convert the weights myself; instead I used vgg16_weights.npz from www(dot)cs(dot)toronto(dot)edu/~frossard/post/vgg16/. That page says:
We convert the Caffe weights publicly available in the author’s GitHub profile (gist(dot)github(dot)com/ksimonyan/211839e770f7b538e2d8#file-readme-md) using a specialized tool (github(dot)com/ethereon/caffe-tensorflow).
But that page has no validation code, so I wrote my own, based on the TensorFlow MNIST and inception code.
How I create TFRecords of Imagenet
I use build_imagenet_data.py from inception. I changed
label_index = 0 #originally label_index = 1
because inception uses label index 0 as a background class (so in total there are 1001 classes). The Caffe format doesn't use it, as the number of outputs is 1000. I prefer the TFRecord format because I will process the weights and retrain.
How I load the weights
The inference function, taken from MNIST's mnist.py, was modified so that the variables are loaded from vgg16_weights.npz.
How I load the weights:
weights = np.load('/the_path/vgg16_weights.npz')
How I put the variable in conv1_1:
with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.constant(weights['conv1_1_W']), name='weights')
    conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(weights['conv1_1_b']), name='biases')
    out = tf.nn.bias_add(conv, biases)
    conv1_1 = tf.nn.relu(out, name=scope)
sess.run(conv1_1)
How I read the TFRecords
I took inception's image_processing.py, dataset.py, and ImagenetData.py with no changes. Then I ran the evaluate function from inception's inception_eval.py, changing the inference code and removing the restoration of moving-average variables from the checkpoint (as I already restore them manually during variable initialization). However, the accuracy is not the same as VGG-16 in Caffe: top-5 accuracy is around 9%.
Closing
What is the problem with this method? There are several parts of the code that I still don't understand, though:
How does the TFReader move to the next batch of images after processing one batch? The output of inception's image_processing.py only has the size of one batch. For completeness, this is the output according to the documentation:
images: Images. 4D tensor of size [batch_size, FLAGS.image_size,
image_size, 3].
labels: 1-D integer Tensor of [FLAGS.batch_size].
Do I need to apply softmax to the logits before tf.in_top_k? (I don't think it matters, as the ordering of the values is the same.)
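The reasoning in the last question can be checked numerically: softmax is strictly increasing in each logit, so the ranking within a row, and therefore any top-k test, is unchanged. The sketch below re-implements both pieces in NumPy; the in_top_k function here is an illustrative stand-in, not TensorFlow's tf.in_top_k, and the random logits and labels are made up.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def in_top_k(scores, targets, k):
    """True where the target class is among the k highest scores of its row."""
    top_k = np.argsort(-scores, axis=-1)[:, :k]
    return (top_k == targets[:, None]).any(axis=-1)

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 1000))
labels = rng.integers(0, 1000, size=8)

hits_logits = in_top_k(logits, labels, k=5)
hits_probs = in_top_k(softmax(logits), labels, k=5)

print(np.array_equal(hits_logits, hits_probs))
```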
Thank you for the help. Sorry if the links are messy; I can only post 2 links per post because of my reputation.
UPDATE
I tried converting the Caffe weights myself. I reversed the input channel dimension of conv1_1 (Caffe receives BGR, so the weights are for BGR rather than the RGB used in TensorFlow) and got the same accuracy as with the weights from the website: around 9% top-5.
I found out that there is no mean image subtraction in TensorFlow inception's image_processing.py. I added mean subtraction (in the eval_image function) with tf.reduce_mean and got 11% accuracy.
Then I tried to change the eval_image function with
# source: https://github.com/ethereon/caffe-tensorflow/blob/master/examples/imagenet/dataset.py
img_shape = tf.to_float(tf.shape(image)[:2])
min_length = tf.minimum(img_shape[0], img_shape[1])
new_shape = tf.to_int32((256 / min_length) * img_shape) #isotropic case
# new_shape = tf.pack([256,256]) #non isotropic case
image = tf.image.resize_images(image, [new_shape[0], new_shape[1]])
offset = tf.to_int32((new_shape - 224) / 2)
image = tf.slice(image, begin=tf.pack([offset[0], offset[1], 0]), size=tf.pack([224, 224, -1]))
mean_subs_image = tf.reduce_mean(image,axis=[0,1],keep_dims=True)  # per-channel mean of this image; must be defined for the return below
return image - mean_subs_image
and I got 13%. It increased but is still far too low. So preprocessing seems to be one of the problems, but I am not sure what the other problems are.
In general, porting whole model weights across libraries is hard. You pointed out some differences from Caffe, but there could be others. It might be easier to retrain the model in TensorFlow.
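One of the differences raised in the question, channel order, can be checked in isolation. The sketch below uses a 1x1 convolution written in NumPy (an assumption made to keep the example small) with the TensorFlow kernel layout [height, width, in_channels, out_channels]. It shows that reversing the input-channel axis of a kernel trained on BGR input gives the same output on RGB input. All weights and images here are random placeholders.

```python
import numpy as np

def conv1x1(image, kernel):
    """1x1 convolution; kernel layout is [h, w, in_channels, out_channels]."""
    return np.einsum('hwc,co->hwo', image, kernel[0, 0])

rng = np.random.default_rng(0)
rgb = rng.random(size=(4, 4, 3))
bgr = rgb[..., ::-1]                          # same image, channels reversed

kernel_bgr = rng.normal(size=(1, 1, 3, 8))    # pretend weights trained on BGR input
kernel_rgb = kernel_bgr[:, :, ::-1, :]        # reverse the input-channel axis

out_from_bgr = conv1x1(bgr, kernel_bgr)
out_from_rgb = conv1x1(rgb, kernel_rgb)

print(np.allclose(out_from_bgr, out_from_rgb))
```

The same care applies to any per-channel mean that is subtracted: its channel order has to match the order of the image it is applied to.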

Feeding individual examples into TensorFlow graph trained on files?

I'm new to TensorFlow and am getting a bit tripped up on the mechanics of reading data. I set up a TensorFlow graph on the mnist data, but I'd like to modify it so that I can run one program to train it + save the model out, and run another to load said graph, make predictions, and compute test accuracy.
Where I'm getting confused is how to bypass the original I/O system in the training graph and "inject" an image to predict or an (image, label) tuple of test data for accuracy testing. To read the training data, I'm using this code:
_, input_data = util.read_examples(
    paths_to_files,
    batch_size,
    shuffle=shuffle,
    num_epochs=None)
feature_map = {
    'label': tf.FixedLenFeature(
        shape=[], dtype=tf.int64, default_value=[-1]),
    'image': tf.FixedLenFeature(
        shape=[NUM_PIXELS * NUM_PIXELS], dtype=tf.int64),
}
example = tf.parse_example(input_data, features=feature_map)
I then feed example to a convolution layer, etc. and generate the output.
Now imagine that I train my graph with that code specifying the input, save out the graph and weights, and then restore the graph and weights in another script for prediction -- I'd like to take (say) 10 images and feed them to the graph to generate predictions. How do I "inject" those 10 images so that the predictions come out the other end?
I played around with feed dictionaries and placeholders, but I'm not sure if they're the right things for me to use... it seems like they rely on having data in memory, as opposed to reading from a queue of test data, for example.
Thanks!
A feed dictionary with placeholders would make sense if you wanted to perform a small number of inferences/evaluations (i.e. enough to fit in memory) - e.g. if you were serving a simple model or running small eval loops.
If you specifically want to infer or evaluate large batches then you should use the same approach you've used for training, but with a different path to your test/eval/live data. e.g.
_, eval_data = util.read_examples(
    paths_to_files, # CHANGE THIS BIT
    batch_size,
    shuffle=shuffle,
    num_epochs=None)
You can use this as a normal python variable and set up successive, dependent steps to use this as a provided variable. e.g.
def get_example(data):
    return tf.parse_example(data, features=feature_map)
sess.run([get_example(eval_data)])  # eval_data is the serialized batch returned by util.read_examples above

Tensorflow slim how to specify batch size during training

I'm trying to use the slim interface to create and train a convolutional neural network, but I couldn't figure out how to specify the batch size for training.
During training my net crashes with an "Out of Memory" error on my graphics card.
So I think there should be a way to handle this condition...
Do I have to split the data and the labels into batches and then loop explicitly, or does slim.learning.train take care of it?
In the code below, train_data is all the data in my training set (a numpy array), and the model definition is not included.
I had a quick look at the sources but no luck so far...
g = tf.Graph()
with g.as_default():
    # Set up the data loading:
    images = train_data
    labels = tf.contrib.layers.one_hot_encoding(labels=train_labels, num_classes=num_classes)
    # Define the model:
    predictions = model7_2(images, num_classes, is_training=True)
    # Specify the loss function:
    slim.losses.softmax_cross_entropy(predictions, labels)
    total_loss = slim.losses.get_total_loss()
    tf.scalar_summary('losses/total loss', total_loss)
    # Specify the optimization scheme:
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)
    train_tensor = slim.learning.create_train_op(total_loss, optimizer)
    slim.learning.train(train_tensor,
                        train_log_dir,
                        number_of_steps=1000,
                        save_summaries_secs=300,
                        save_interval_secs=600)
Any hints or suggestions?
Edit:
I re-read the documentation...and I found this example
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)
But it's not clear at all how to pass image and label to tf.train.batch, as the MyPascalVocDataLoader function is not specified...
In my case my data set is loaded from a sqlite database and I have the training data and labels as numpy arrays... still confused.
Of course I tried to pass my numpy arrays (converted to constant tensors) to tf.train.batch like this
image = tf.constant(train_data)
label = tf.contrib.layers.one_hot_encoding(labels=train_labels, num_classes=num_classes)
images, labels = tf.train.batch([image, label], batch_size=32)
But this doesn't seem to be the right path to follow... it seems that train.batch wants only one element from my data set (how do I pass this? It does not make sense to me to pass only train_data[0] and train_labels[0]).
Here you can create TFRecords, the binary file format used by TensorFlow. Since you have the training images and the labels, you can easily create TFRecords for training and validation.
After creating the TFRecords, all you need to write is the code that decodes the images from the TFRecords and feeds them to your model input. There you can select the batch size and other settings.