How to test your model on Retinanet? - tensorflow

I am using the Retinanet model to train a classifier with about 50 classes. Link to the model: https://github.com/fizyr/keras-retinanet
This is what I have done so far:
Installed the model using the suggested steps.
Created a CSV of my images in the recommended format for reading.
Used the following script to train my model:
# Using the installed script:
retinanet-train csv <path to csv file containing annotations> <path to csv file containing classes>
The model is currently running and training with about 50 epochs and 10000 steps in each epoch. I see the losses going down and it should take about a day to finish the training.
How do I proceed now with:
a. Testing my model? The example given here:
An example of testing the network can be seen in this Notebook (the link on the website is dead; https://github.com/fizyr/keras-retinanet/blob/master/examples/ResNet50RetinaNet.ipynb seems to be the right one). In general, output can be retrieved from the network as follows:
_, _, detections = model.predict_on_batch(inputs)
Where detections are the resulting detections, shaped (None, None, 4 + num_classes) (for (x1, y1, x2, y2, cls1, cls2, ...)).
Loading models can be done in the following manner:
from keras_retinanet.models.resnet import custom_objects
model = keras.models.load_model('/path/to/model.h5',
                                custom_objects=custom_objects)
Execution time on NVIDIA Pascal Titan X is roughly 55msec for an image of shape 1000x600x3.
Also, during the training I did not do the following step from the instructions:
"Create generators for training and testing data (an example is shown in keras_retinanet.preprocessing.PascalVocGenerator)."
Am I missing something?
Again, sorry for the multi-fold questions and thank you for helping me out.

If by testing you mean running your own image through the network, have a look at the new example. All it does is set up the environment, load the model, load and prepare an image, and visualize the results.
https://github.com/fizyr/keras-retinanet/blob/master/examples/ResNet50RetinaNet.ipynb
Is there an issue with that example? Or is it not clear?
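For reference, a rough sketch of that flow (load a trained model, prepare an image, run prediction), based on the older keras-retinanet API quoted in the question; the model path, image path, and the 0.5 score threshold are placeholders:
import numpy as np
import keras
from keras_retinanet.models.resnet import custom_objects
from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image

model = keras.models.load_model('/path/to/model.h5', custom_objects=custom_objects)

# Load and prepare one image the same way the training pipeline does.
image = read_image_bgr('/path/to/test_image.jpg')
image = preprocess_image(image)
image, scale = resize_image(image)

_, _, detections = model.predict_on_batch(np.expand_dims(image, axis=0))

# Split boxes and per-class scores, undo the resize, and keep confident hits.
boxes = detections[0, :, :4] / scale
scores = detections[0, :, 4:]
labels = np.argmax(scores, axis=1)
confidences = scores[np.arange(scores.shape[0]), labels]

for box, label, conf in zip(boxes, labels, confidences):
    if conf < 0.5:               # confidence threshold, tune for your data
        continue
    print(label, conf, box)      # class index, score, (x1, y1, x2, y2)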

Related

Feed image data without class label

I am trying to implement image super resolution using SRGAN. In the process, I used DIV2K dataset (http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip) as my source.
I have worked with image classification using a CNN (keras.layers.convolutional.Conv2D), but in this case there are no class labels in my data source.
I unzipped the file and kept it in D:\Unzipped\DIV2K_train_HR, then used the following command to read the files.
img_dataset = tensorflow.keras.utils.image_dataset_from_directory("D:\\unzipped")
Then I created the model as follows:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, AveragePooling2D, MaxPooling2D

model = Sequential()
model.add(Conv2D(filters=64, kernel_size=(3,3), activation="relu", input_shape=(256,256,3)))
model.add(AveragePooling2D(pool_size=(2,2)))
model.add(Conv2D(filters=64, kernel_size=(3,3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.compile(optimizer='sgd', loss='mse')
model.fit(img_dataset, batch_size=32, epochs=10)
But I am getting the error "Graph execution error", and I am unable to find its root cause. Is this error appearing because the class label is missing (I think, as the code stands, DIV2K_train_HR is treated as a single class label)? Or is it happening because the images don't all have one specific size?
Note: this code does not match the SRGAN architecture. I am new to GANs and trying to move ahead step by step, and I got stuck at the first step.
Yes, the error message is because you don't have labels in your dataset.
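If you only want to read the images themselves (which is what you need for super-resolution), a minimal sketch is to load them without labels and force a common size; note that you would still have to build (input, target) pairs, e.g. low-res/high-res versions of each image, before calling fit:
import tensorflow as tf

# Load images only (no class labels) and resize everything to one fixed size
# so the images can be batched; the path and size are just the ones from the question.
img_dataset = tf.keras.utils.image_dataset_from_directory(
    "D:\\unzipped",
    labels=None,             # no labels are generated
    image_size=(256, 256),   # resize every image to a common size
    batch_size=32)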
As a first step in a GAN network you need to create a discriminator model: given some image, it should recognize whether it is a real or a fake image. You can take images from your dataset and label them as 1 ("real images"). Then generate "fake images" by down-sampling and up-sampling images from your dataset and label them as 0. Train your discriminator model so that it can distinguish between original and processed images.
After that, you create a generator model. The generator takes a down-sampled version of an image as input and creates an up-sampled version at the original resolution. The GAN model combines the generator and discriminator by passing the generator's output to the discriminator. The target label is 1, i.e. we want the generator to create up-sampled versions of images that the discriminator can't distinguish from the real ones. Now train the GAN network (with 'trainable' set to false for the discriminator's weights).
After your generator manages to produce images that the discriminator can't distinguish from the real ones, you take them, label them as 0, and train the discriminator again. Then train the generator again, and so on.
The process continues until the discriminator can't distinguish fake images from real ones anymore (i.e. its accuracy doesn't exceed 0.5).
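As a rough sketch of that loop (this is not the full SRGAN architecture; the patch size, layer sizes, and 4x down-sampling factor are placeholder choices, and hr_batch is assumed to be a batch of real high-resolution patches scaled to [0, 1]):
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

PATCH = 64   # assumed size of the high-resolution training patches
FACTOR = 4   # assumed down-sampling factor

# Discriminator: real or generated high-res patch -> probability it is real.
discriminator = models.Sequential([
    layers.Conv2D(32, 3, strides=2, activation="relu",
                  input_shape=(PATCH, PATCH, 3)),
    layers.Conv2D(64, 3, strides=2, activation="relu"),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Generator: low-res patch -> up-sampled patch at the original resolution.
generator = models.Sequential([
    layers.Conv2D(64, 3, padding="same", activation="relu",
                  input_shape=(PATCH // FACTOR, PATCH // FACTOR, 3)),
    layers.UpSampling2D(FACTOR),
    layers.Conv2D(3, 3, padding="same", activation="sigmoid"),
])

# Combined GAN: the generator's output is judged by the discriminator,
# whose weights are frozen inside this combined model.
discriminator.trainable = False
gan = models.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

def train_step(hr_batch):
    # hr_batch: numpy array of real high-res patches scaled to [0, 1]
    lr_batch = tf.image.resize(hr_batch, (PATCH // FACTOR, PATCH // FACTOR)).numpy()
    fake = generator.predict(lr_batch, verbose=0)

    # Phase 1: train the discriminator on real (label 1) vs generated (label 0) patches.
    # It still updates here because trainable was True when it was compiled on its own.
    discriminator.train_on_batch(hr_batch, np.ones((len(hr_batch), 1)))
    discriminator.train_on_batch(fake, np.zeros((len(fake), 1)))

    # Phase 2: train the generator through the frozen discriminator,
    # asking it to make the discriminator say "real" (label 1).
    gan.train_on_batch(lr_batch, np.ones((len(lr_batch), 1)))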
A simple example ("Generative Adversarial Networks") can be found here:
https://github.com/ageron/handson-ml3/blob/main/17_autoencoders_gans_and_diffusion_models.ipynb
The code is explained in ch. 17 of the book "Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow" (3rd edition) by Aurélien Géron.

False prediction from efficientnet transfer learning

I'm new to transfer learning in TensorFlow, and I chose TF Hub to simplify finding a dataset, but now I'm confused because my model gives me a wrong prediction when I try to use an image from the internet. I used the efficientnet_v2_imagenet1k_b0 feature vector without fine-tuning to train on a rock-paper-scissors dataset from https://www.kaggle.com/drgfreeman/rockpaperscissors. I used ImageDataGenerator and flow_from_directory for data processing.
This is my model here
This is my train result here
This is my test result here
It's the second time I've gotten something like this when using transfer learning with TF Hub. I want to know why this happened and how to fix it, so this problem doesn't happen again. Thanks a lot for your help, and sorry for my bad English.
I downloaded your code and the dataset to my local machine and had to make a few adjustments to get it to run locally. I believe the model efficientnet_v2_imagenet1k_b0 is different from the newer EfficientNet models in that this version DOES require pixel values to be scaled between 0 and 1. I ran the model with and without rescaling, and it works well only if the pixels are rescaled. Below is the code I used to test whether the model correctly predicts an image downloaded from the internet. It worked as expected.
import cv2
import numpy as np
import matplotlib.pyplot as plt

class_dict = train_generator.class_indices  # mapping of class name -> index from training
print(class_dict)
rev_dict = {}
for key, value in class_dict.items():
    rev_dict[value] = key  # reverse mapping of index -> class name
print(rev_dict)
fpath = r'C:\Temp\rps\1.jpg'  # an image downloaded from the internet that should be the paper class
img = plt.imread(fpath)
print(img.shape)
img = cv2.resize(img, (224, 224))  # resize to 224 x 224 to match the size the model was trained on
print(img.shape)
plt.imshow(img)
img = img / 255.0  # rescale as was done with the training images
img = np.expand_dims(img, axis=0)
print(img.shape)
p = model.predict(img)
print(p)
index = np.argmax(p)
print(index)
klass = rev_dict[index]
prob = p[0][index] * 100
print(f'image is of class {klass}, with probability of {prob:6.2f}')
the results were
{'paper': 0, 'rock': 1, 'scissors': 2}
{0: 'paper', 1: 'rock', 2: 'scissors'}
(300, 300, 3)
(224, 224, 3)
(1, 224, 224, 3)
[[9.9902594e-01 5.5121275e-04 4.2284720e-04]]
0
image is of class paper, with probability of 99.90
You had this in your code
uploaded = files.upload()
len_file = len(uploaded.keys())
This did not run because files was not defined, so I could not find what causes your misclassification problem.
Remember that in flow_from_directory, if you do not specify the color mode, it defaults to rgb. So even though the training images are 4-channel PNGs, the actual model is trained on 3 channels. Make sure the images you want to predict on are 3-channel as well.
To help, I really need to see the code for how you provide your data to model.predict. However, as a guess, remember that EfficientNet needs the pixels to be in the range from 0 to 255, so do not scale your images. Make sure your test images are RGB and of the same size as the image size used in training. I would also need to see the code for how you process the predictions.
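For example, a rough sketch of that preparation, following this answer's no-rescaling advice and assuming your training size was 224x224 and that model is your trained model (the file name is a placeholder):
import numpy as np
from tensorflow.keras.preprocessing import image

# Load as 3-channel RGB at the training size; keep pixels in 0-255 (no rescaling).
img = image.load_img('test.jpg', target_size=(224, 224))
x = np.expand_dims(image.img_to_array(img), axis=0)   # shape (1, 224, 224, 3)

p = model.predict(x)       # 'model' is assumed to be your trained model
print(np.argmax(p), p)     # predicted class index and raw class probabilities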

Tensorflow image classification example

This is my first time doing image classification, I followed this tutorial:
https://www.tensorflow.org/tutorials/images/classification
I'm wondering, how do I take that model, and actually use it to make predictions?
I would just like to put one image into the model, and would ideally like to get a prediction % of whether it thinks it's a dog or a cat.
I saved the model using:
model.save('my_model.h5')
But am really lost at the next steps.
There's another Tensorflow tutorial which uses model.predict() specifically: Basic classification: Classify images of clothing
I'm not sure my code is correct all the way, but I tried to extend the prediction part of the cats/dogs tutorial using model.predict_generator(), though I can't seem to entirely understand the results I get. Adapted code from this second tutorial: Tutorial on using Keras flow_from_directory and generators
# Preparing the testing dataset
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

test_dir = os.path.join(os.getcwd(), 'cat_dog_testing')   # directory with test images
test_image_generator = ImageDataGenerator(rescale=1./255)  # rescaling pixels 0 to 1

test_generator = test_image_generator.flow_from_directory(batch_size=6,
                                                          directory=test_dir,
                                                          shuffle=False,
                                                          target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                          class_mode=None)

STEP_SIZE_TEST = test_generator.n // test_generator.batch_size
test_generator.reset()
pred = model_new.predict_generator(test_generator, steps=STEP_SIZE_TEST, verbose=1)
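To answer the original single-image question more directly, here is a rough sketch of loading the saved model and turning its output into a percentage. It assumes the model ends in a single sigmoid unit (as in a binary cat/dog setup); the file names are placeholders, and which index means cat or dog depends on your generator's class_indices mapping.
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

model = load_model('my_model.h5')

# Prepare a single image the same way the training images were prepared.
img = image.load_img('some_image.jpg', target_size=(IMG_HEIGHT, IMG_WIDTH))
x = image.img_to_array(img) / 255.0   # same 1./255 rescaling as in training
x = np.expand_dims(x, axis=0)         # batch of one

p = float(model.predict(x)[0][0])     # sigmoid output in [0, 1]
print(f'class 1: {p * 100:.1f}%, class 0: {(1 - p) * 100:.1f}%')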
I built a TensorFlow image classification workflow so that you can both train and classify images with no code. It's on FlyteHub if you want to see it:
https://flytehub.org/trainandclassifyimages
Happy to collaborate if you have improvements you want to make to the codebase :)

Neural Network with my own dataset

I have downloaded many face images from the web. In order to learn TensorFlow, I want to feed those images to a simple fully-connected neural network with a single hidden layer. I have found example code here.
Since I am a beginner, I don't know how to train, evaluate, and test the network with the downloaded images. The code owner used a '.mat' file and a '.pkl' file, and I don't understand how he organized the training and test sets.
In order to run the code with my images;
Do I need to divide my images into training, test, and validation folders and turn each folder into a mat file? How am I going to provide labels for the training?
Besides, I don't understand why he used a '.pkl' file?
All in all, I would like to change this code so that I can find the test, training, and validation set classification performance with my image dataset.
It might be an easy question, but it is important for me as it is a starting step. Thanks for your understanding.
First, you don't have to use .mat files or pickles. TensorFlow expects NumPy arrays.
For instance, let's say you have 70000 images of size 28x28 (=784 dimensions) belonging to 10 classes. Let's also assume that you'd like to train a simple feedforward neural network to classify the images.
The first step would be to split the images between train and test (and validation, but let's put that aside for simplicity). For the sake of the example, let's imagine that you randomly chose 60000 images for your training set and 10000 for your test set.
The second step would be to ensure that your data has the right format. Here, you'd like your training set to consist of one NumPy array of shape (60000, 784) for the images and another of shape (60000, 10) for the labels (if you use one-hot encoding to represent your classes). As for your test set, you should have an array of shape (10000, 784) for the images and one of shape (10000, 10) for the labels.
Once you have these big NumPy arrays, you should define placeholders that will allow you to feed data to your network during training and evaluation.
images = tf.placeholder(tf.float32, shape=[None, 784])
labels = tf.placeholder(tf.int64, shape=[None, 10])
The None here means that you can feed a batch of any size, i.e. as many images as you want, as long as your NumPy array is of shape (anything, 784).
The third step consists in defining your model as well as the loss function and the optimizer.
The fourth step consists in training your network by feeding it with random batches of data using the placeholders created above. As your network is training, you can periodically print its performance like the training loss/accuracy as well as the test loss/accuracy.
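A compressed sketch of steps three and four in the same TF1 style, assuming train_images, train_labels, test_images, and test_labels are the NumPy arrays described above (the hidden-layer size, learning rate, number of steps, and batch size are arbitrary choices):
import numpy as np
import tensorflow as tf

images = tf.placeholder(tf.float32, shape=[None, 784])
labels = tf.placeholder(tf.int64, shape=[None, 10])

# Step 3: a single hidden layer, a softmax cross-entropy loss, and an optimizer.
hidden = tf.layers.dense(images, 256, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, 10)
loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
accuracy = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1)), tf.float32))

# Step 4: feed random batches through the placeholders and print progress.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        idx = np.random.choice(len(train_images), 128)   # random batch of 128 images
        sess.run(train_op, feed_dict={images: train_images[idx],
                                      labels: train_labels[idx]})
        if step % 100 == 0:
            print(sess.run([loss, accuracy],
                           feed_dict={images: test_images, labels: test_labels}))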
You can find a complete and very simple example here.

Feeding individual examples into TensorFlow graph trained on files?

I'm new to TensorFlow and am getting a bit tripped up on the mechanics of reading data. I set up a TensorFlow graph on the mnist data, but I'd like to modify it so that I can run one program to train it + save the model out, and run another to load said graph, make predictions, and compute test accuracy.
Where I'm getting confused is how to bypass the original I/O system in the training graph and "inject" an image to predict or an (image, label) tuple of test data for accuracy testing. To read the training data, I'm using this code:
_, input_data = util.read_examples(
    paths_to_files,
    batch_size,
    shuffle=shuffle,
    num_epochs=None)
feature_map = {
    'label': tf.FixedLenFeature(
        shape=[], dtype=tf.int64, default_value=[-1]),
    'image': tf.FixedLenFeature(
        shape=[NUM_PIXELS * NUM_PIXELS], dtype=tf.int64),
}
example = tf.parse_example(input_data, features=feature_map)
I then feed example to a convolution layer, etc. and generate the output.
Now imagine that I train my graph with that code specifying the input, save out the graph and weights, and then restore the graph and weights in another script for prediction -- I'd like to take (say) 10 images and feed them to the graph to generate predictions. How do I "inject" those 10 images so that the predictions come out the other end?
I played around with feed dictionaries and placeholders, but I'm not sure if they're the right things for me to use... it seems like they rely on having data in memory, as opposed to reading from a queue of test data, for example.
Thanks!
A feed dictionary with placeholders would make sense if you wanted to perform a small number of inferences/evaluations (i.e. enough to fit in memory) - e.g. if you were serving a simple model or running small eval loops.
If you specifically want to infer or evaluate large batches then you should use the same approach you've used for training, but with a different path to your test/eval/live data. e.g.
_, eval_data = util.read_examples(
    paths_to_files,  # CHANGE THIS BIT
    batch_size,
    shuffle=shuffle,
    num_epochs=None)
You can use this as a normal Python variable and set up successive, dependent steps that use it as a provided variable, e.g.
def get_example(data):
    return tf.parse_example(data, features=feature_map)

sess.run([get_example(path_to_your_data)])
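And for the small, in-memory case from the first paragraph, a rough sketch of the placeholder approach; build_inference_graph stands in for whatever function builds your model on top of an input tensor, and the checkpoint path is a placeholder:
import numpy as np
import tensorflow as tf

# Build the prediction graph around a placeholder instead of the input queue.
image_input = tf.placeholder(tf.float32, shape=[None, NUM_PIXELS * NUM_PIXELS])
predictions = build_inference_graph(image_input)   # hypothetical model-building function

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, '/path/to/checkpoint')      # weights saved after training

    # "Inject" 10 in-memory images and read the predictions off the other end.
    ten_images = np.zeros((10, NUM_PIXELS * NUM_PIXELS), dtype=np.float32)
    print(sess.run(predictions, feed_dict={image_input: ten_images}))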