Different results in TfLite model vs model before quantization - tensorflow

I took an object detection model from the TensorFlow 2 Detection Model Zoo: a MobileNet, which I trained on my own TFRecords.
I am using MobileNet because it often appears in examples of converting a model to TFLite, which is what I need because I run it on an RPi3.
I am following the official example from the SageMaker docs and the accompanying GitHub repository.
Interestingly, the accuracy after step 2) training and step 3) deploying is pretty good: my trucks are detected reliably by the custom-trained model.
However, after conversion to TFLite the accuracy drops, whether I use the tflite_convert tool or tf.lite.TFLiteConverter in Python.
What is more, all detections are on the borders of the images, usually in the bottom-right corner. Maybe I am not preparing the images correctly, or I am misinterpreting the results?
You can check the images I uploaded:
https://ibb.co/fSzfZvz
https://ibb.co/0GF101s
What could possibly go wrong?

I was missing the proper image preprocessing.
I used the pipeline config to build the detection model, which has a preprocess function, and I now use that function to build the input tensor before feeding it into the Interpreter.
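from object_detection.utils import config_util
from object_detection.builders import model_builder

num_classes = 2
# pipeline_config is the path to the pipeline.config file used for training
configs = config_util.get_configs_from_pipeline_file(pipeline_config)
model_config = configs['model']
model_config.ssd.num_classes = num_classes
model_config.ssd.freeze_batchnorm = True
detection_model = model_builder.build(
    model_config=model_config, is_training=True)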
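As a rough illustration of what that preprocessing amounts to, here is a minimal NumPy sketch. It assumes a float (not fully integer-quantized) SSD MobileNet export whose feature extractor expects pixel values scaled to [-1, 1], and a frame that has already been resized to the model's input size (320x320 is only an example); both assumptions should be checked against your pipeline config and the interpreter's input details. The helper names are hypothetical. The second helper maps the normalized [ymin, xmin, ymax, xmax] boxes that the Object Detection API returns back to pixels, since boxes read in the wrong order also end up in the wrong place.

```python
import numpy as np


def preprocess_for_ssd_mobilenet(image_uint8):
    """Scale an already-resized HxWx3 uint8 image to [-1, 1] and add a batch axis."""
    image = np.asarray(image_uint8, dtype=np.float32)
    image = image * (2.0 / 255.0) - 1.0
    return image[np.newaxis, ...]


def box_to_pixels(box, image_height, image_width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to (left, top, right, bottom) pixels."""
    ymin, xmin, ymax, xmax = box
    return (int(round(xmin * image_width)), int(round(ymin * image_height)),
            int(round(xmax * image_width)), int(round(ymax * image_height)))


# Dummy all-white frame standing in for a resized camera image (assumed size).
frame = np.full((320, 320, 3), 255, dtype=np.uint8)
input_tensor = preprocess_for_ssd_mobilenet(frame)
# input_tensor is what would be passed to interpreter.set_tensor(...) before invoke().
```

If the interpreter's input details report a uint8 or int8 dtype instead of float32, the model was quantized differently and this scaling does not apply as written.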

Related

How to obtain the ResNet component of the Tensorflow implementation of SimCLR v2?

I am currently trying to create embeddings of images by passing them through pre-trained neural networks and taking the values at the last layer just before the fully-connected ones. I had little trouble doing this with PyTorch implementations of other networks. However, I am stuck with the TensorFlow implementation of SimCLR v2 and do not know how to proceed.
The official repo of SimCLR v2 is this one: https://github.com/google-research/simclr
And the paper is here: https://arxiv.org/abs/2006.10029v2
If I understood the paper and the code correctly, this architecture is composed of a ResNet backbone and a projection head. In my case, I am not interested in the projection head and just want the output of the ResNet model.
Looking at the code in the colabs, I have managed to import pre-trained SimCLR models:
model_path = 'gs://simclr-checkpoints-tf2/simclrv2/pretrained/r50_1x_sk0/saved_model'
saved_model = tf.saved_model.load(model_path)
However, I do not know what to do to get the outputs of the ResNet. In all the colabs, they only get the outputs of the projection head which I am uninterested in.
for x in ds.take(1):
  image = x['image']
  labels = x['label']
  logits = saved_model(image, trainable=False)['logits_sup']
  pred = tf.argmax(logits, -1)
Moreover, the way the model is imported makes it difficult to get at the variables and layers. For instance, if I try to obtain a summary of the model, I get this error:
'_UserObject' object has no attribute 'summary'
I also do not want to convert the TensorFlow weights to PyTorch and import them into a PyTorch ResNet.
What, then, would be the best way to isolate the ResNet from the overall SimCLR v2 architecture in order to get the outputs of its final layer?

Tensorflow image classification example

This is my first time doing image classification. I followed this tutorial:
https://www.tensorflow.org/tutorials/images/classification
I'm wondering how I take that model and actually use it to make predictions.
I would just like to put one image into the model, and would ideally like to get a prediction % of whether it thinks it's a dog or a cat.
I saved the model using:
model.save('my_model.h5')
But am really lost at the next steps.
There's another Tensorflow tutorial which uses model.predict() specifically: Basic classification: Classify images of clothing
I am not sure my code is entirely correct, but I tried to extend the prediction part of the cats/dogs tutorial using model.predict_generator(), though I can't fully understand the results I get. The code is adapted from this second tutorial: Tutorial on using Keras flow_from_directory and generators
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Preparing the testing dataset
test_dir = os.path.join(os.getcwd(), 'cat_dog_testing')  # directory with test images
test_image_generator = ImageDataGenerator(rescale=1./255)  # rescaling pixels 0 to 1
test_generator = test_image_generator.flow_from_directory(batch_size=6,
                                                          directory=test_dir,
                                                          shuffle=False,
                                                          target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                          class_mode=None)
STEP_SIZE_TEST = test_generator.n // test_generator.batch_size
test_generator.reset()
pred = model_new.predict_generator(test_generator, steps=STEP_SIZE_TEST, verbose=1)
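For reading the result, here is a minimal sketch. It assumes the model ends in a single Dense(1) unit that emits a raw logit (as in recent versions of the cats/dogs tutorial) and that the training generator assigned class indices alphabetically (cats = 0, dogs = 1). If your model already ends in a sigmoid, skip the sigmoid step. The helper name and the sample values are made up for illustration.

```python
import numpy as np


def logits_to_percentages(pred):
    """Turn an (N, 1) array of raw logits into per-image dog/cat percentages."""
    logits = np.asarray(pred, dtype=np.float64).reshape(-1)
    p_dog = 1.0 / (1.0 + np.exp(-logits))  # sigmoid: probability of class index 1
    return [{"dog": round(100.0 * float(p), 1),
             "cat": round(100.0 * (1.0 - float(p)), 1)}
            for p in p_dog]


# Stand-in for the array returned by the predict call; the values are invented.
pred = np.array([[0.0], [2.0], [-3.0]])
results = logits_to_percentages(pred)
```

Because shuffle=False, the rows of pred are in the same order as test_generator.filenames, so each percentage can be matched to its image file.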
I built a tensorflow image classification workflow so that you can both train and classify images with no code. It's on FlyteHub if you want to see it
https://flytehub.org/trainandclassifyimages
Happy to collaborate if you have improvements you want to make to the codebase :)

How to do fine-tuning in tensorflow with notop layers and define my own input image size

There are many examples of how to do fine-tuning with TensorFlow. Almost all of them resize the images to the size that the existing model requires. For example, 224×224 is the input size that VGG19 requires. However, in Keras, we can change the input size by setting include_top to False:
base_model = VGG19(include_top=False, weights="imagenet", input_shape=(input_size, input_size, input_channels))
Then we do not have to fix the image size at 224×224 anymore. Can we do this kind of fine-tuning with the official pre-trained models in TensorFlow? I have not found a solution so far. Can anyone help me?
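To see why dropping the top removes the fixed-size constraint: the convolutional part of VGG19 contains five 2×2 max-pooling stages with stride 2, so it shrinks whatever input it gets by a factor of 32, while the dense layers in the top expect exactly 7×7×512 features. A small sketch of that arithmetic follows (the function name is hypothetical; Keras requires an input of at least 32×32 for this model).

```python
def vgg19_notop_output_shape(height, width):
    """Spatial output shape of VGG19 without its top, for a channels-last input."""
    if height < 32 or width < 32:
        raise ValueError("VGG19 needs inputs of at least 32x32")
    for _ in range(5):  # five max-pool stages, each halves the size (rounding down)
        height, width = height // 2, width // 2
    return (height, width, 512)  # the last convolutional block has 512 filters


default_shape = vgg19_notop_output_shape(224, 224)
custom_shape = vgg19_notop_output_shape(300, 400)
```

Any new head attached to the base model therefore has to cope with a feature map whose size depends on the chosen input, for example by starting with a global pooling layer.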
Yes, it is possible to do this kind of fine-tuning. You would just have to ensure that you also fine-tune some of the first few layers (to account for changed input) of the original network in addition to the last few layers (to account for changed output).
I work with TensorFlow using Keras. If you are open to that, then there is a code snippet that shows the general fine-tuning flow here:
https://keras.io/applications/
Specifically, I had to write the following code to make it work for my case:
from tensorflow import keras
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

# img_height, img_width is the size of your new input (Keras expects height first); 3 is the number of channels
input_tensor = Input(shape=(img_height, img_width, 3))
base_model = keras.applications.vgg19.VGG19(include_top=False, weights='imagenet', input_tensor=input_tensor)
# instantiate whatever other layers you need on top of base_model.output
# predictions is the new logistic layer added to account for the new classes
model = Model(inputs=base_model.inputs, outputs=predictions)
Hope this helps.

how to connect the pretrained model's input to the output of tf.train.shuffle_batch?

In classify_image.py, the loaded image is fed to the graph with
predictions = sess.run(softmax_tensor,{'DecodeJpeg/contents:0': image_data})
What if I want to add new layers to the Inception model and train the whole model again? Are the variables loaded from classify_image_graph_def.pb trainable? I saw that freeze_graph.py uses convert_variables_to_constants to produce the frozen graph. So are the loaded weights constants, or can they be trained again? And how can I connect the input of the Inception model to the output of tf.train.shuffle_batch ('shuffle_batch:0')?
The model used in classify_image.py has its variables frozen into constants, and doesn't have any gradient ops, so it's not easy to turn it back into something trainable. You can see how we remove one layer and replace it with something trainable here:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py
It's hard to generalize though. You'd be better off looking at some examples of fine-tuning here:
https://github.com/tensorflow/models/tree/master/inception#how-to-fine-tune-a-pre-trained-model-on-a-new-task

How to retrain inception-v1 model?

I have successfully gone through the official tutorial, which explains how to retrain the inception-v3 model, and later successfully retrained the same model for my own purposes.
The model, however, is complex and slow compared to simpler models such as inception-v1, whose accuracy is good enough for some tasks. Specifically, I would like to retrain the model to use it on Android, and ideally its speed should be comparable to the original TensorFlow Android demo. I tried to retrain the inception-v1 model from this link with the following modifications in retrain.py:
BOTTLENECK_TENSOR_NAME = 'avgpool0/reshape:0'
BOTTLENECK_TENSOR_SIZE = 2048
MODEL_INPUT_WIDTH = 224
MODEL_INPUT_HEIGHT = 224
MODEL_INPUT_DEPTH = 3
JPEG_DATA_TENSOR_NAME = 'input'
RESIZED_INPUT_TENSOR_NAME = 'input'
As opposed to inception v3, inception v1 does not have any decodeJpeg or resize nodes:
inception v3 nodes:
DecodeJpeg/contents
DecodeJpeg
Cast
ExpandDims/dim
ExpandDims
ResizeBilinear/size
ResizeBilinear
...
pool_3
pool_3/_reshape/shape
pool_3/_reshape
softmax/weights
softmax/biases
softmax/logits/MatMul
softmax/logits
softmax
inception v1 nodes:
input
conv2d0_w
conv2d0_b
conv2d1_w
conv2d1_b
conv2d2_w
conv2d2_b
...
softmax1_pre_activation
softmax1
avgpool0/reshape/shape
avgpool0/reshape
softmax2_pre_activation/matmul
softmax2_pre_activation
softmax2
output
output1
output2
so I guess the images have to be reshaped before being fed into the graph.
Right now the error occurs when hitting the following function:
def run_bottleneck_on_image(sess, image_data, image_data_tensor,
                            bottleneck_tensor):
  """Runs inference on an image to extract the 'bottleneck' summary layer.

  Args:
    sess: Current active TensorFlow Session.
    image_data: Numpy array of image data.
    image_data_tensor: Input data layer in the graph.
    bottleneck_tensor: Layer before the final softmax.

  Returns:
    Numpy array of bottleneck values.
  """
  bottleneck_values = sess.run(
      bottleneck_tensor,
      {image_data_tensor: image_data})
  bottleneck_values = np.squeeze(bottleneck_values)
  return bottleneck_values
Error:
TypeError: Cannot interpret feed_dict key as Tensor: Can not convert a
Operation into a Tensor.
I guess the data on input node of inception v1 graph has to be reshaped to match the data after passing the following nodes in inception v3:
DecodeJpeg/contents
DecodeJpeg
Cast
ExpandDims/dim
ExpandDims
ResizeBilinear/size
ResizeBilinear
If anyone has already managed to retrain the inception v1 model or has an idea how to reshape the data in inception v1 case to match inception v3, I would be very thankful for any tips or suggestions.
Not sure if you have solved this or not but I am working on a similar problem.
I am trying to use a different model (not Inception-v1 or Inception-v3) with the Inception-v3 transfer learning tutorial. This post seems to be on the right track of remapping the input of the new model (in your case inception-v1) to play nice with the jpeg encoding used in the rest of the tutorial:
feeding image data in tensorflow for transfer learning
The only problem I am having is an error on my input saying "Cannot convert a tensor of type uint8 to an input type of float32", but this may at least put you on the right track.
Good Luck!
(For the ones who are still interested)
The bottleneck tensor size should be 1024 for inception-v1. For me, the following setup works with the inception-v1 model mentioned above and this retrain script. There is no need for a JPEG data tensor or anything else.
bottleneck_tensor_name = 'avgpool0/reshape:0'
bottleneck_tensor_size = 1024
input_width = 224
input_height = 224
input_depth = 3
resized_input_tensor_name = 'input:0'
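The TypeError in the question is consistent with this setup: 'input' names an operation, whereas feed_dict keys must be tensors, and TensorFlow 1.x names a tensor as the operation name followed by a colon and an output index. A tiny sketch of that naming rule (the helper is hypothetical and not part of TensorFlow):

```python
def as_tensor_name(name, output_index=0):
    """Return 'op:index' for an operation name; leave existing tensor names unchanged."""
    op_name, sep, index = name.rpartition(":")
    if sep and index.isdigit():
        return name  # already a tensor name such as 'input:0'
    return "{}:{}".format(name, output_index)


resized_input_tensor_name = as_tensor_name("input")
bottleneck_tensor_name = as_tensor_name("avgpool0/reshape:0")
```

With the ':0' suffix in place, the name refers to the first output of the 'input' operation, which can be used as a feed_dict key.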