TensorFlow Model to TFLite - tensorflow

I have this code for building a semantic search engine using the pre-trained Universal Sentence Encoder from TensorFlow Hub. I am not able to convert it to TFLite. I have saved the model to my directory.
Importing the model:
module_path = "/content/drive/My Drive/4"
%time model = hub.load(module_path)
# print("module %s loaded" % module_url)

# Create a function for calling the model
def embed(input):
    return model(input)
Training the model on data:
## training the model
Model_USE = embed(data)
Saving the model:
exported = tf.train.Checkpoint(v=tf.Variable(Model_USE))
exported.f = tf.function(
    lambda x: exported.v * x,
    input_signature=[tf.TensorSpec(shape=None, dtype=tf.float32)])
export_dir = "/content/drive/My Drive/"
tf.saved_model.save(exported, export_dir)
Saving works fine, but when I convert to TFLite it gives an error.
Conversion code:
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                       tf.lite.OpsSet.SELECT_TF_OPS]
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
Error:
as_list() is not defined on an unknown TensorShape.

First, you need to add a data generator that provides representative inputs for the converter, like this:
def representative_data_gen():
    for input_value in dataset.take(100):
        yield [input_value]
The input value must have shape (1, your_input_shape), as if it had a batch size of 1, and it must be yielded as a list; this is mandatory. You should also declare which type of optimization you want, for example:
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
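For reference, here is a minimal sketch that wires the representative dataset and the optimization flag into the SavedModel converter from the question. The `dataset` below is a made-up stand-in (and the 512-element feature size is an assumption); its element shape must match your model's float32 signature.
import tensorflow as tf

export_dir = "/content/drive/My Drive/"  # SavedModel directory from the question

# Hypothetical calibration data; replace with real inputs of the right shape.
dataset = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform([100, 512], dtype=tf.float32))

def representative_data_gen():
    for input_value in dataset.batch(1).take(100):
        yield [input_value]  # must be a list, with a batch dimension of 1

converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                       tf.lite.OpsSet.SELECT_TF_OPS]

tflite_model = converter.convert()
with open("converted_model.tflite", "wb") as f:
    f.write(tflite_model)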
Nevertheless, I have also run into problems with the different converter options depending on the network structure, which in this case I do not know. So, for a clean run of the converter I would just do:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.experimental_new_converter = True
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
The converter.experimental_new_converter = True line is for problems when converting RNNs, as in https://github.com/tensorflow/tensorflow/issues/34813
EDIT:
As explained in ValueError: None is only supported in the 1st dimension. Tensor 'flatbuffer_data' has invalid shape '[None, None, 1, 512]', TFLite only allows the first dimension of your data to be None, that is, the batch dimension. All other dimensions must be fixed. Try padding them with, for example, tf.keras.preprocessing.sequence.pad_sequences.
Then mask your sequences in the network as described in tensorflow.org/guide/keras/masking_and_padding, using Embedding or Masking layers.
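A minimal sketch of that pad-then-mask approach (the sequence values, vocabulary size, and maxlen of 64 are all made up for illustration):
import tensorflow as tf

# Hypothetical ragged input: integer sequences of different lengths.
sequences = [[3, 14, 7], [8, 2], [5, 9, 21, 4]]

# Pad every sequence to a fixed length so TFLite sees fixed dimensions.
padded = tf.keras.preprocessing.sequence.pad_sequences(
    sequences, maxlen=64, padding="post", value=0)

# mask_zero=True on the Embedding (or an explicit Masking layer) tells
# downstream layers to ignore the padded timesteps.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=1000, output_dim=16, mask_zero=True),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

print(model(tf.constant(padded)).shape)  # (3, 1)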

Related

RoBERTa example from tfhub produces error "During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string"

I would like to use the roberta-base model from tfhub. I am trying to run the example below, although I get an error when I try to feed sentences to the model as input: Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string. I am using Python 3.7, tensorflow 2.5, and tensorflow_hub 0.12.
If I replace the preprocessor and encoder with the corresponding BERT versions, the code works. However, I would like it to work for RoBERTa as well (as shown below).
preprocess = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4", trainable=True)
# define a text embedding model
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
preprocessor = hub.KerasLayer("https://tfhub.dev/jeongukjae/roberta_en_cased_preprocess/1")
encoder_inputs = preprocessor(text_input)
encoder = hub.KerasLayer("https://tfhub.dev/jeongukjae/roberta_en_cased_L-12_H-768_A-12/1", trainable=True)
encoder_outputs = encoder(encoder_inputs)
pooled_output = encoder_outputs["pooled_output"] # [batch_size, 768].
sequence_output = encoder_outputs["sequence_output"] # [batch_size, seq_length, 768].
model = tf.keras.Model(text_input, pooled_output)
# You can embed your sentences as follows
sentences = tf.constant(["(your text here)"])
print(model(sentences))
Additionally, the code above with the RoBERTa preprocessor/encoder seems to work if I use CPU instead of GPU (adding with tf.device('/cpu:0')), but this is not feasible because I need to fine-tune a model on lots of data.
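For reference, the CPU workaround mentioned above just wraps model construction and the forward pass in a device scope; a sketch of it is below (it sidesteps rather than fixes the GPU string-copy error):
import tensorflow as tf
import tensorflow_hub as hub

with tf.device('/cpu:0'):
    # Same RoBERTa preprocessor/encoder as in the question, pinned to CPU.
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
    preprocessor = hub.KerasLayer(
        "https://tfhub.dev/jeongukjae/roberta_en_cased_preprocess/1")
    encoder = hub.KerasLayer(
        "https://tfhub.dev/jeongukjae/roberta_en_cased_L-12_H-768_A-12/1",
        trainable=True)
    outputs = encoder(preprocessor(text_input))
    model = tf.keras.Model(text_input, outputs["pooled_output"])

    sentences = tf.constant(["(your text here)"])
    print(model(sentences))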

TFLite model.process() sometimes needs a TensorImage and sometimes a TensorBuffer as input data to process an image? Are there different image input data types?

Some TFLite models' model.process() seems to need a TensorBuffer, while others need a TensorImage instead. I don't know why.
First, I took a regular TensorFlow / Keras model that was saved using:
model.save(keras_model_path,
           include_optimizer=True,
           save_format='tf')
Then I compress and quantize this Keras model (300 MB) to TFLite format using:
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = tf.keras.utils.image_dataset_from_directory(
    dir_val, batch_size=batch_size, image_size=(150, 150))
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()
with open(tflite_model_path, 'wb') as file:
    file.write(tflite_model)
I get a much smaller TFLite model (40 MB) which needs a TensorBuffer <input_data> when calling model.process(<input_data>).
Second, I've trained and saved a TFLite model using TensorFlow Lite Model Maker, and now I've got a TFLite model that needs a TensorImage <input_data> when calling model.process(<input_data>).
Are there two different kinds of TFLite models depending on how you build and train them?
Maybe it's related to the fact that the Keras model was based on Inception while TensorFlow Lite Model Maker uses EfficientNet. How can I convert from one TFLite model to the other? How can someone change the image input so both models can process the same input, for example a TensorImage or bitmap data?
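One way to see why the two models expect different wrapper types is to inspect their input tensors from Python; a small sketch (the two file names are hypothetical placeholders for your converted models):
import tensorflow as tf

# Hypothetical paths to the two models being compared.
for path in ["keras_converted_model.tflite", "model_maker_model.tflite"]:
    interpreter = tf.lite.Interpreter(model_path=path)
    interpreter.allocate_tensors()
    details = interpreter.get_input_details()[0]
    # The dtype, shape and quantization parameters determine what the
    # Android support/task library expects (TensorImage vs. raw TensorBuffer).
    print(path, details['dtype'], details['shape'], details['quantization'])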
With the very valuable help of @Farmaker, I've solved my problem. I simply wanted to convert a Keras model into a more compact TFLite model to install it in a mobile application. I realized that the generated TFLite model was not compatible, and @Farmaker correctly pointed out that the metadata was missing.
1. Use TensorFlow 2.6.0 or lower, because of an incompatibility with FlatBuffers.
pip3 uninstall tensorflow
pip3 install tensorflow==2.6.0
pip3 install keras==2.6.0
2. Convert the Keras model to TFLite
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = tf.keras.utils.image_dataset_from_directory(
    dir_val, batch_size=batch_size, image_size=(150, 150))
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()
with open(tflite_model_path, 'wb') as file:
    file.write(tflite_model)
3. Add metadata, as shown in the « TensorFlow Lite Metadata Writer API » tutorial.
3.1 Provide a labels.txt file (a file of all the target class labels, one label per line).
For instance, to create such a file:
your_labels_list = ['class1', 'class2', ...]
with open('labels.txt', 'w') as labels_file:
    for label in your_labels_list:
        labels_file.write(label + "\n")
3.2 Install the extra library that supports TFLite metadata generation
pip3 install tflite-support-nightly
3.3 Generate the metadata
from tflite_support.metadata_writers import image_classifier
from tflite_support.metadata_writers import writer_utils

ImageClassifierWriter = image_classifier.MetadataWriter
# Normalization parameters are required when processing the image
# (https://www.tensorflow.org/lite/convert/metadata#normalization_and_quantization_parameters)
_INPUT_NORM_MEAN = 127.5
_INPUT_NORM_STD = 127.5
_TFLITE_MODEL_PATH = "<your_path_to_model.tflite>"
_LABELS_FILE = "<your_path_to_labels.txt>"
_TFLITE_METADATA_MODEL_PATH = "<your_path_to_model_with_metadata.tflite>"

# Create the metadata writer
metadata_generator = ImageClassifierWriter.create_for_inference(
    writer_utils.load_file(_TFLITE_MODEL_PATH),
    [_INPUT_NORM_MEAN], [_INPUT_NORM_STD],
    [_LABELS_FILE])

# Verify the generated metadata
print(metadata_generator.get_metadata_json())

# Integrate the metadata into the TFLite model
writer_utils.save_file(metadata_generator.populate(), _TFLITE_METADATA_MODEL_PATH)
That's all folks!
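As an optional sanity check that the metadata-populated model is usable, here is a sketch with the Task Library that ships in the same tflite-support package (the exact API surface depends on the tflite-support version you installed, so treat these calls as an assumption to verify):
from tflite_support.task import vision

# Reuse the model produced above and any test image (placeholder paths).
classifier = vision.ImageClassifier.create_from_file(
    "<your_path_to_model_with_metadata.tflite>")
image = vision.TensorImage.create_from_file("<your_test_image.jpg>")

# If the metadata was written correctly, the classifier knows how to
# preprocess the image and returns labeled categories.
result = classifier.classify(image)
print(result.classifications[0].categories[:3])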
You can use tfds datasets, image datasets, tf.constant, and other data formats.
You can also use tf.constant to supply the required input parameters, or you can feed in weights directly (a convolution layer is also capable of this).
I determine the input and the target response categories.
[ Sequence to Sequence mapping ]:
group_1_ShoryuKen_Left = tf.constant([ 0,0,0,0,0,1,0,0,0,0,0,0, 0,0,0,0,0,1,0,1,0,0,0,0, 0,0,0,0,0,0,0,1,0,0,0,0, 0,0,0,0,0,0,0,0,0,1,0,0 ], shape=(1, 1, 48), dtype=tf.float32)
# get weights
layer1_lstm = model.get_layer(name="layer1_bidirection-lstm")
lstm_weight_1 = layer1_lstm.get_weights()[0]
lstm_filter_1 = layer1_lstm.get_weights()[1]

# set weights
layer1_lstm = model.get_layer(name="layer1_bidirection-lstm")
layer1_lstm.set_weights([lstm_weight_1, lstm_filter_1])
[ TFDS ]:
builder = tfds.builder('cats_vs_dogs', data_dir='file:\\\\F:\\datasets\\downloads\\PetImages\\')
ds = tfds.load('cats_vs_dogs', split='train', shuffle_files=True)
assert isinstance(ds, tf.data.Dataset)

data = DataLoader.from_folder('F:\\datasets\\downloads\\flower_photos\\')
train_data, test_data = data.split(0.9)

for example in ds.take(1):
    image, label = example["image"], example["label"]

model = image_classifier.create(train_data)
...

Converted TensorFlow model to TFLite outputs an int8 which I cannot dequantize

I have quantized a TensorFlow model using the following class:
class QuantModel():
    def __init__(self, model=tf.keras.Model, data=[]):
        '''
        1. Accepts a Keras model; long term this will allow SavedModel and other formats
        2. Accepts numpy or tensor data in a format such that indexing, e.g.
           data[0], returns one input ready to be fed forward through the network
        '''
        self.data = data
        self.model = model

    '''Quantizes the model and allows custom ops
       for LogMelSpectrogram operations (might cause mixed quantization)'''
    def quant_model_int8(self):
        converter = tf.lite.TFLiteConverter.from_keras_model(self.model)
        converter.representative_dataset = self.representative_data_gen
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.inference_input_type = tf.int8  # or tf.uint8
        converter.inference_output_type = tf.int8  # or tf.uint8
        # converter.allow_custom_ops = True
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        tflite_model_quant = converter.convert()
        open("converted_model2.tflite", 'wb').write(tflite_model_quant)
        return tflite_model_quant

    '''Returns a tflite model with no quantization, i.e. weights and variable data all
       in float32'''
    def convert_tflite_no_quant(self):
        converter = tf.lite.TFLiteConverter.from_keras_model(self.model)
        tflite_model = converter.convert()
        open("converted_model.tflite", 'wb').write(tflite_model)
        return tflite_model

    def representative_data_gen(self):
        # Model has only one input, so each data point has one element.
        yield [self.data]
I am able to successfully quantize my model; however, the input and output are int8, as those are the only options once you quantize.
Now, to run the model I am using tf.quantization.quantize to convert my input data to a qint format and feed it through the network. As expected, I get an int8 output.
I want to convert the output back to float32 and inspect it. For that I am using tf.dequantize; however, that only works with tf.qint8 data types.
I am wondering how to handle this and whether any of you have run into a similar issue.
# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model2.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
data_arr = np.load('Data_Mel.npy')
print(data_arr.shape)
sample = data_arr[0]
print(sample.shape)
minn = min(sample.flatten())
maxx = max(sample.flatten())
print(minn, maxx)
(sample, sample_1, sample_2) = tf.quantization.quantize(data_arr[0], minn, maxx, tf.qint8)
print(sample.shape)
# Test the model on random input data.
input_shape = input_details[0]['shape']
input_data = sample
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data.dtype)
output_data=tf.quantization.dequantize(output_data,minn,maxx)
print(output_data)
I think you can simply remove the converter.inference_input_type = tf.int8 and converter.inference_output_type = tf.int8 flags and treat the output model as a float model. Here is some detail:
The "optimization" flag in the Converter quantizes the float model to int8. By default, it adds a [Quant] op in the beginning of the quantized model as well as a [Dequant] at the end:
(float) ->[Quant] -> (int8) -> [op1] -> (int8) -> [op...] -> (int8) -> [Dequant] -> (float)
So you don't need to change any of your driver logic, since the overall model still has a float interface while the [op]s are quantized.
The extra flags converter.inference_input_type = tf.int8 and converter.inference_output_type = tf.int8 allow you to remove the [Quant] and [Dequant] operations, so the quantized model looks like this:
(int8) -> [op1] -> (int8) -> [op...] -> (int8)
This is for deployment on certain hardware/workflows. Since you are adding [Quant] and [Dequant] manually, the quantized model with a float interface could work better for your case.
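If you do keep the int8 interface, note that each input and output tensor carries its own scale and zero point, so you can quantize and dequantize by hand instead of using tf.quantization. A minimal sketch (the model and data file names are taken from the question; the added batch dimension is an assumption about the model's input shape):
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="converted_model2.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Quantize a float32 sample with the scale/zero point stored in the model.
sample = np.load('Data_Mel.npy')[0].astype(np.float32)
in_scale, in_zero_point = input_details['quantization']
quantized = np.round(sample / in_scale + in_zero_point).astype(np.int8)

# The leading batch dimension is an assumption; check input_details['shape'].
interpreter.set_tensor(input_details['index'], quantized[np.newaxis, ...])
interpreter.invoke()

# Dequantize the int8 output back to float32 with the output's own parameters.
output = interpreter.get_tensor(output_details['index'])
out_scale, out_zero_point = output_details['quantization']
print((output.astype(np.float32) - out_zero_point) * out_scale)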

Object Detection API v2 TFLite model post-training quantization

I am trying to convert and quantize a model trained with the Object Detection API v2 to run it on a Coral Dev Board.
It seems like there is still a big problem with exporting Object Detection models to TFLite, but I hope that maybe someone has some advice for me.
My converter looks like the following, and I am trying to convert "SSD MobileNet v2 320x320" from the Model Zoo v2:
def convertModel(input_dir, output_dir, pipeline_config="", checkpoint:int=-1, quantization=False):
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    files = os.listdir(input_dir)
    if pipeline_config == "":
        pipeline_config = [pipe for pipe in files if pipe.endswith(".config")][0]
    pipeline_config_path = os.path.join(input_dir, pipeline_config)

    # Find latest or given checkpoint
    checkpoint_file = ""
    checkpointDir = os.path.join(input_dir, 'checkpoint')
    for chck in sorted(os.listdir(checkpointDir)):
        if chck.endswith(".index"):
            checkpoint_file = chck[:-6]
            # Stop search when the requested checkpoint was found
            if chck.endswith(str(checkpoint)):
                break
    print("#####################################")
    print(checkpoint_file)
    print("#####################################")
    # checkpoint_file = [chck for chck in files if chck.endswith(f"{checkpoint}.meta")][0]
    trained_checkpoint_prefix = os.path.join(checkpointDir, checkpoint_file)

    configs = config_util.get_configs_from_pipeline_file(pipeline_config_path)
    detection_model = model_builder.build(configs['model'], is_training=False)
    ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
    ckpt.restore(trained_checkpoint_prefix).expect_partial()

    class MyModel(tf.keras.Model):
        def __init__(self, model):
            super(MyModel, self).__init__()
            self.model = model
            self.seq = tf.keras.Sequential([
                tf.keras.Input([300, 300, 3], 1),
            ])

        def call(self, x):
            x = self.seq(x)
            images, shapes = self.model.preprocess(x)
            prediction_dict = self.model.predict(images, shapes)
            detections = self.model.postprocess(prediction_dict, shapes)
            return detections

    km = MyModel(detection_model)
    y = km.predict(np.random.random((1, 300, 300, 3)).astype(np.float32))

    converter = tf.lite.TFLiteConverter.from_keras_model(km)
    if quantization:
        converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
        converter.target_spec.supported_ops = [tf.lite.OpsSet.SELECT_TF_OPS,
                                               tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.representative_dataset = _genDataset
    else:
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
    converter.experimental_new_converter = True
    converter.allow_custom_ops = True
    tflite_model = converter.convert()
    open(os.path.join(output_dir, 'model.tflite'), 'wb').write(tflite_model)
My data generator loads about 100 images downloaded from the COCO dataset to generate sample inputs:
def _genDataset():
    sampleDir = os.path.join("Dataset", "Coco")
    for i in os.listdir(sampleDir):
        image = cv2.imread(os.path.join(sampleDir, i))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = cv2.resize(image, (300, 300))
        image = image.astype("float")
        image = np.expand_dims(image, axis=1)
        image = image.reshape(1, 300, 300, 3)
        yield [image.astype("float32")]
I tried to run the code with TF 2.2.0, which returned:
RuntimeError: Max and min for dynamic tensors should be recorded during calibration
Supposedly an update to TF 2.3.0 should help, but that then returns:
<unknown>:0: error: failed while converting: 'main': Ops that can be supported by the flex runtime (enabled via setting the -emit-select-tf-ops flag):
tf.Size {device = ""}
I also tested tf-nightly (2.4.0), which again returns:
RuntimeError: Max and min for dynamic tensors should be recorded during calibration
Right now this tf.Size operator seems to be the reason why I cannot convert the model, because when I allow custom operations I can convert it to TFLite.
Sadly that is not a solution for me, because the Coral converter and my interpreter can't use a model with a missing custom op.
Does anyone know if it is possible to remove this op in postprocessing, or just ignore it during conversion?
Just converting it to TFLite without quantization and with tf.lite.OpsSet.TFLITE_BUILTINS works without problems.
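For what it's worth, the Object Detection API also ships an export_tflite_graph_tf2.py script that writes a TFLite-friendly SavedModel; a sketch of converting that export is below (the paths are placeholders and the calibration data is random, so swap in real preprocessed images and check the script's flags against your installed version of the API):
import numpy as np
import tensorflow as tf

# Assumes the checkpoint was first exported with the Object Detection API's
# export_tflite_graph_tf2.py script, which writes a TFLite-friendly SavedModel
# to <output_dir>/saved_model. The path below is a placeholder.
saved_model_dir = "exported_tflite_model/saved_model"

def representative_dataset():
    # Placeholder calibration data; replace with real preprocessed COCO images.
    for _ in range(100):
        yield [np.random.uniform(0.0, 1.0, (1, 300, 300, 3)).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# SSD postprocessing is emitted as the custom TFLite_Detection_PostProcess op,
# so custom ops stay enabled on this path as well.
converter.allow_custom_ops = True
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)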

TensorFlow Serving, online predictions: How to build a signature_def that accepts 'image_bytes' as input tensor name?

I have successfully trained a Keras model and used it for predictions on my local machine; now I want to deploy it using TensorFlow Serving. My model takes images as input and returns a mask prediction.
According to the documentation here my instances need to be formatted like this:
{'image_bytes': {'b64': base64.b64encode(jpeg_data).decode()}}
Now, the saved_model.pb file automatically saved by my Keras model has the following tensor names:
input_tensor = graph.get_tensor_by_name('input_image:0')
output_tensor = graph.get_tensor_by_name('conv2d_23/Sigmoid:0')
Therefore I need to save a new saved_model.pb file with a different signature_def.
I tried the following (see here for reference), which works:
with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, ['serve'], 'path/to/saved/model/')
    graph = tf.get_default_graph()
    input_tensor = graph.get_tensor_by_name('input_image:0')
    output_tensor = graph.get_tensor_by_name('conv2d_23/Sigmoid:0')
    tensor_info_input = tf.saved_model.utils.build_tensor_info(input_tensor)
    tensor_info_output = tf.saved_model.utils.build_tensor_info(output_tensor)
    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'image_bytes': tensor_info_input},
            outputs={'output_bytes': tensor_info_output},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
    builder = tf.saved_model.builder.SavedModelBuilder('path/to/saved/new_model/')
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={'predict_images': prediction_signature, })
    builder.save()
But when I deploy the model and request predictions from AI Platform, I get the following error:
RuntimeError: Prediction failed: Error processing input: Expected float32, got {'b64': 'Prm4OD7JyEg+paQkPrGwMD7BwEA'} of type 'dict' instead.
Readapting the answer here, I also tried to rewrite
input_tensor = graph.get_tensor_by_name('input_image:0')
as
image_placeholder = tf.placeholder(tf.string, name='b64')
graph_input_def = graph.as_graph_def()
input_tensor, = tf.import_graph_def(
    graph_input_def,
    input_map={'b64:0': image_placeholder},
    return_elements=['input_image:0'])
with the (wrong) understanding that this would add a layer on top of my input tensor, with a matching 'b64' name (as per the documentation), that accepts a string and connects it to the original input tensor,
but the error from AI Platform is the same.
(The relevant code I use for requesting a prediction is:
instances = [{'image_bytes': {'b64': base64.b64encode(image).decode()}}]
response = service.projects().predict(
    name=name,
    body={'instances': instances}
).execute()
where image is a numpy.ndarray of dtype('float32'))
I feel I'm close, but I'm definitely missing something. Can you please help?
After the image is b64-encoded and then decoded, the buffer arrives at the model as a string tensor, which does not match your model's float input type.
You may try to add some preprocessing to your model and send the b64 request again.
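A sketch of that preprocessing idea in the same TF 1.x style as the question: rebuild the export from the original Keras model, decode the incoming JPEG bytes inside the graph, and expose the string tensor as the signature input (the model path, image size, and channel count below are assumptions):
import tensorflow as tf
from tensorflow import keras

# Hypothetical path to the trained Keras model; adjust to your setup.
keras_model = keras.models.load_model('path/to/keras_model.h5')

# String input that will receive the JPEG bytes sent as {'b64': ...}.
serving_input = tf.placeholder(tf.string, shape=[None], name='image_bytes')

def _decode(jpeg_bytes):
    img = tf.image.decode_jpeg(jpeg_bytes, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    return tf.image.resize_images(img, [256, 256])  # assumed model input size

images = tf.map_fn(_decode, serving_input, dtype=tf.float32)
predictions = keras_model(images)

# Export with a signature whose input key ends in "_bytes", so AI Platform
# base64-decodes the payload and feeds raw JPEG bytes to the graph.
sess = keras.backend.get_session()
tf.saved_model.simple_save(
    sess,
    'path/to/saved/new_model/',
    inputs={'image_bytes': serving_input},
    outputs={'output_bytes': predictions})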