tflite_convert ValueError Unknown layer BatchNorm - tensorflow

We are using
Tensorflow 1.14
Keras 2.1.2
GPU: GeForce GTX 1660 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.86
for custom object detection using Mask-RCNN from this repo https://github.com/matterport/Mask_RCNN.
We trained a model successfully and it's detecting objects on our desktop. Now we want to generate a tflite model for mobile usage, where we hit the following error:
ValueError: Unknown layer BatchNorm
Please note that we created the weights and the Keras model (.h5) with different scripts.
We tried the following code to convert the Keras model to tflite:
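import tensorflow as tf
# Load the saved Keras model and convert it to TFLite
converter = tf.lite.TFLiteConverter.from_keras_model_file('Save-Model8.h5')
tfmodel = converter.convert()
# Write the converted model to disk
with open("model.tflite", "wb") as f:
    f.write(tfmodel)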

Related

How to plot input and output shapes on top of each other using plot_model in Keras

I want to plot my model using the keras.utils.plot_model function. My problem is that when I plot the model, the input and output shapes are not stacked on top of each other; instead they are placed side by side (like figure 1).
Here is the code to plot this model:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.utils import plot_model

model = tf.keras.models.Sequential()
model.add(layers.Embedding(100, 128, input_length=45,
                           input_shape=(45,), name='embed'))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.MaxPooling1D(5))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(1))
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=False)
But I would like the model plot to look like figure 2, which is the typical figure found on the internet and which I have created many times before.
I couldn't find any figsize or fontsize option in plot_model to try changing. I use a Google Colaboratory notebook.
Any help is much appreciated.
I had the same issue and finally found this GitHub link.
The problem seems to occur because we are using TensorFlow 2.8.0.
As mentioned in the link, one valid solution is to switch to a different TensorFlow version, such as tf-nightly.
[tensorflow ver2.8.0]
import tensorflow as tf
tf.__version__
2.8.0
model = tf.keras.models.Sequential([
tf.keras.layers.Dense(1,input_shape=[1], name="input_layer")
],name="model_1")
model.compile(...)
[tensorflow nightly]
!pip --quiet install tf-nightly #try not to use tf ver2.8
import tensorflow as tf
tf.__version__
2.10.0-dev20220403
#just do the same thing as above
model = tf.keras.models.Sequential([
tf.keras.layers.Dense(1,input_shape=[1], name="input_layer")
],name="model_1")
model.compile(...)
I hope you solve this problem.
It is easy, but a Sequential model is easier to manage.
What are the embedding layers and dataset buffers?
They are batches of input; you control how the batches are combined and how many there are.
(Drawing the graph in MS Word or another drawing tool is faster; I use FreeOffice when studying.)
[ Codes ]:
import tensorflow as tf
from tensorflow.keras.utils import plot_model
model = tf.keras.models.Sequential([
tf.keras.layers.InputLayer(input_shape=(100,), dtype='int32', name='input'),
tf.keras.layers.Embedding(output_dim=512, input_dim=100, input_length=100),
tf.keras.layers.LSTM(32),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid', name='output'),
])
dot_img_file = 'F:\\temp\\Python\\img\\001.png'
tf.keras.utils.plot_model(model, to_file=dot_img_file, show_shapes=True)
# <IPython.core.display.Image object>
input('...')
[ Output ]:
F:\temp\Python>python test_tf_plotgraph.py
2022-03-28 14:21:26.043715: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-28 14:21:26.645113: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4565 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1
...
...

Tensorflow 2: Get the number of trainable parameters in a Model from Model Garden (Zoo)

After choosing and downloading a model from the TensorFlow 2 Detection Model Zoo, it can be loaded as follows:
import tensorflow as tf
model = tf.saved_model.load('./efficientdet_d0_coco17_tpu-32/saved_model/')
However, it looks like one cannot extract the number of trainable variables directly/indirectly from the model variable, according to this investigation.
Nevertheless, the model training can continue, with new data, as this is a typical use-case of a pre-trained model. There must be a way to get the number of trainable variables. But I don't know how.
I tried:
tf.trainable_variables
# AttributeError: module 'tensorflow' has no attribute 'trainable_variables'
Environment:
Tensorflow 2.7.0 (implying CUDA 11.2, cuDNN 8.1).
Windows 10 x64
Python 3.9.7
NVIDIA GeForce MX150, Compute capability: 6.1
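For the counting itself, here is a minimal sketch that sums the element counts of a list of variables. It only assumes that each variable has a `shape` and a boolean `trainable` attribute, as tf.Variable does. The helper name `count_parameters` and the stand-in variables are hypothetical and exist only so the example runs without TensorFlow. Whether the object returned by tf.saved_model.load exposes such a list (for example as `model.variables`) depends on how the model was exported, so treat that part as an assumption to verify.

```python
import math
from types import SimpleNamespace


def count_parameters(variables, trainable_only=True):
    """Sum the number of scalar entries over an iterable of variables.

    Each variable only needs a `shape` (iterable of ints) and a boolean
    `trainable` attribute.
    """
    total = 0
    for v in variables:
        if trainable_only and not v.trainable:
            continue
        total += math.prod(tuple(v.shape))
    return total


# Hypothetical stand-in variables so the sketch runs without TensorFlow.
demo_variables = [
    SimpleNamespace(shape=(3, 3, 3, 32), trainable=True),   # conv kernel: 864 entries
    SimpleNamespace(shape=(32,), trainable=True),           # bias: 32 entries
    SimpleNamespace(shape=(32,), trainable=False),          # e.g. a moving mean
]

print(count_parameters(demo_variables))                        # 896
print(count_parameters(demo_variables, trainable_only=False))  # 928
```

If the loaded model does expose its variables, passing that list to the helper should give the trainable count directly.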

Converting tensorflow2.0 model to TensorRT engine (tensorflow2.0)

I have retrained a TensorFlow 2.0 model that works as a single-class object detector, prepared with the Object Detection API v2 (https://tensorflow-object-detection-api-tutorial.readthedocs.io/).
After that I converted it to ONNX (tf2onnx.convert) and tested it: I got the same inference results.
I have tested all pretrained models (downloaded from tf model zoo https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md):
ssd_mobilenet_v2_320x320_coco17_tpu-8
ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8
ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8
ssd_resnet50_v1_fpn_640x640_coco17_tpu-8
I retrained it using a small batch of data.
The problem is using it with GStreamer/DeepStream. As far as I have seen, GStreamer consumes either the ONNX model or the model after conversion to TensorRT. (If I provide the ONNX model, it is still converted to TensorRT, but GStreamer does that right before running.)
I also tried the same pipeline: train -> convert to ONNX -> convert to TRT (or just provide the ONNX model to GStreamer). Same issue.
Error:
ERROR: [TRT]: [graph.cpp::computeInputExecutionUses::519] Error Code 9: Internal Error ((Unnamed Layer* 747) [Recurrence]: IRecurrenceLayer cannot be used to compute a shape tensor)
TensorRT Version: 8.2.1.8
tf2onnx Version: 1.9.3
Is there any chance to get some help?
Or maybe I should skip the ONNX model and convert directly from TensorFlow to a TensorRT engine? Is that possible?
Of course I can upload the model if it would help.
BR!

Tensorflow gpu error: Dst tensor not initialized

It is my first time training a model on a GPU. I am using TensorFlow and I am getting this error: InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run AssignVariableOp: Dst tensor is not initialized. [Op:AssignVariableOp]
I have tried solutions such as reducing the batch size and using tf-nightly, but to no avail. I am using an Nvidia GeForce GTX 1080 with 8 GB of memory. I am trying to train an image classification model using Keras Applications (Xception).

cannot fine-tune a Keras model with 4 VGG16

I built a model with 4 VGG16 networks (not including the top) and concatenate their 4 outputs to form a dense layer, which is followed by a softmax layer, so my model has 4 inputs (4 images) and 1 output (4 classes).
I first do transfer learning by training only the dense layers and freezing the VGG16 layers, and that works fine.
However, after unfreezing the VGG16 layers by setting layer.trainable = True, I get the following errors:
tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-02-20 23:12:28.501894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076
pciBusID: 0000:0a:00.0
totalMemory: 11.93GiB freeMemory: 11.71GiB
2018-02-20 23:12:28.744990: I tensorflow/stream_executor/cuda/cuda_dnn.cc:444] could not convert BatchDescriptor {count: 0 feature_map_count: 512 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX} to cudnn tensor descriptor: CUDNN_STATUS_BAD_PARAM
Then I followed the solution on this page and set os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'. The first error above is gone, but I still get the second error:
keras tensorflow/stream_executor/cuda/cuda_dnn.cc:444 could not convert BatchDescriptor to cudnn tensor descriptor: CUDNN_STATUS_BAD_PARAM
If I freeze the VGG16 layers again, the code works fine. In other words, those errors only occur when I set the VGG16 layers to trainable.
I also built a model with only 1 VGG16, and that model also works fine.
So, in summary, I get those errors only when I unfreeze the VGG16 layers in a model with 4 VGG16 networks.
Any ideas how to fix this?
It turns out that it has nothing to do with the number of VGG16 networks in the model. The problem is due to the batch size.
When I said the model with 1 VGG16 worked, that model used batch size 8. When I reduced the batch size below 4 (to 1, 2, or 3), the same errors happened.
Now I just use batch size 4 for the model with 4 VGG16 networks, and it works fine, although I still don't know why it fails when batch size < 4 (it is probably related to the fact that I'm using 4 GPUs).
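One possible explanation, offered as a sketch and not as a confirmed diagnosis: if each batch is split across the 4 GPUs in equal steps with the last device taking the remainder (an assumption, since the post does not say how the GPUs are used), then any batch smaller than 4 leaves some devices with zero samples. That would be consistent with the `count: 0` in the BatchDescriptor message. The helper `slice_sizes` below is hypothetical and only reproduces that arithmetic.

```python
def slice_sizes(batch_size, num_gpus):
    """Per-GPU sample counts when a batch is cut into equal steps and the
    last device takes whatever is left over (assumed splitting scheme)."""
    step = batch_size // num_gpus
    sizes = [step] * (num_gpus - 1)
    sizes.append(batch_size - step * (num_gpus - 1))
    return sizes


for batch_size in (1, 2, 3, 4, 8):
    sizes = slice_sizes(batch_size, num_gpus=4)
    has_empty = any(s == 0 for s in sizes)
    print(batch_size, sizes, "empty slice" if has_empty else "ok")
```

Under this assumption, batch sizes 1 to 3 give three empty slices each, while 4 and 8 give every GPU at least one sample, which matches the behaviour described above.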