I have trained an Object Detection model using the TensorFlow API by following the steps provided in this official tutorial. As such, by the end of the whole process, as described in the exporting step, I have got my model saved in the following format.
my_model/
├─ checkpoint/
├─ saved_model/
└─ pipeline.config
My question is, once the model has been saved to such a format, how can I load it and use it to make detections?
I am able to successfully do that with the training checkpoints by using the code below. And it is after that point (where I load the checkpoint that generated the best result) that I export the model.
# Load pipeline config and build a detection model
configs = config_util.get_configs_from_pipeline_file(PATH_TO_PIPELINE_CONFIG)
model_config = configs['model']
detection_model = model_builder.build(model_config=model_config, is_training=False)
# Restore checkpoint
ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
ckpt.restore(PATH_TO_CKPT).expect_partial()
However, in production, I am not looking to use those checkpoints. I am looking to load the model from the exported format.
I have tried the following command to load the exported model, but I have had no luck. It returns no errors and I can use the model variable below to make detections, but the output (bounding boxes, classes, scores) is incorrect, which leads me to believe there are some steps missing in the loading process.
model = tf.saved_model.load(path_to_exported_model)
Any tips?
Ok, as it turns out, the code is correct. I ran a test with another model (which is also an EfficientDet) and the code worked. It seems something went wrong when the original model was exported, which I am still trying to figure out.
To those looking for an answer, here's the full code for loading and using a saved model.
# Loading saved model.
model = tf.saved_model.load(path_to_exported_model)
# Pre-processing image.
image = tf.image.decode_image(open(path_to_image, 'rb').read(), channels=3)
image = tf.expand_dims(image, 0)
image = tf.image.resize(image, (size_expected_by_model, size_expected_by_model))
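# Model expects a tf.uint8 tensor, but tf.image.resize returns tf.float32 values in [0, 255].
# Cast directly; tf.image.convert_image_dtype would rescale float inputs as if they were in [0, 1).
image = tf.cast(image, tf.uint8)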
# Executing object detection.
detections = model(image)
# Formatting returned detections.
num_detections = int(detections.pop('num_detections'))
detections = {key: value[0, :num_detections].numpy()
              for key, value in detections.items()}
detections['num_detections'] = num_detections
detections['detection_classes'] = detections['detection_classes'].astype(np.int64)
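As a follow-up to the formatting step above: the boxes in `detections` are still normalized and ordered `[ymin, xmin, ymax, xmax]`, which is the convention the Object Detection API uses. Below is a minimal sketch of scaling them to pixel coordinates and dropping low-scoring results. The helper name `to_pixel_boxes`, the 0.5 threshold and the sample values are made up for illustration, and plain lists stand in for the NumPy arrays.

```python
def to_pixel_boxes(boxes, classes, scores, im_width, im_height, threshold=0.5):
    """Keep detections scoring at least `threshold` and scale boxes to pixels.

    `boxes` holds normalized [ymin, xmin, ymax, xmax] rows.
    Returns a list of (xmin, ymin, xmax, ymax, class_id, score) tuples.
    """
    results = []
    for (ymin, xmin, ymax, xmax), class_id, score in zip(boxes, classes, scores):
        if score < threshold:
            continue
        results.append((
            int(xmin * im_width),
            int(ymin * im_height),
            int(xmax * im_width),
            int(ymax * im_height),
            int(class_id),
            float(score),
        ))
    return results


# Made-up values standing in for detections['detection_boxes'] and friends.
sample_boxes = [[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.5, 0.5]]
sample_classes = [1, 2]
sample_scores = [0.9, 0.3]
pixel_boxes = to_pixel_boxes(sample_boxes, sample_classes, sample_scores,
                             im_width=640, im_height=480)
print(pixel_boxes)
```

With real output you would pass `detections['detection_boxes']`, `detections['detection_classes']` and `detections['detection_scores']` instead of the sample lists.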
Check the link below. Abdul Rehman has a few Python scripts for running inference with saved_models on images as well as videos. I use them extensively to check detection with saved_models from the TF2 Model Zoo, as well as with models trained on custom datasets.
https://github.com/abdelrahman-gaber/tf2-object-detection-api-tutorial
Looking at what @Matheus Correia posted, I slightly modified his answer to suit what I was doing (in 2022) in Google Colab.
category_index = your_generated_label_map
# e.g. category_index = {1: {'id': 1, 'name': 'tomato'}, 2: {'id': 2, 'name': 'egg'}, 3: {'id': 3, 'name': 'potato'}, 4: {'id': 4, 'name': 'broccoli'}, 5: {'id': 5, 'name': 'beef'}, 6: {'id': 6, 'name': 'chicken'}}
# set your own threshold here
Threshold = 0.5
def ExtractBBoxes(bboxes, bclasses, bscores, im_width, im_height):
    bbox = []
    class_labels = []
    for idx in range(len(bboxes)):
        if bscores[idx] >= Threshold:
            y_min = int(bboxes[idx][0] * im_height)
            x_min = int(bboxes[idx][1] * im_width)
            y_max = int(bboxes[idx][2] * im_height)
            x_max = int(bboxes[idx][3] * im_width)
            class_label = category_index[int(bclasses[idx])]['name']
            class_labels.append(class_label)
            bbox.append([x_min, y_min, x_max, y_max, class_label, float(bscores[idx])])
    return (bbox, class_labels)
# @Matheus Correia's code, modified
# Loading saved model.
detect_fn = tf.saved_model.load("--saved model folder's path--")
# model = tf.saved_model.load("--saved model folder's path--")
# Pre-processing image.
image = tf.image.decode_image(open(IMAGE_PATH, 'rb').read(), channels=3)
# tf.image.resize expects the size as (height, width).
image = tf.image.resize(image, (height, width))
im_height, im_width, _ = image.shape
# Model expects a batched tf.uint8 tensor, but tf.image.resize returns tf.float32.
input_tensor = tf.expand_dims(tf.cast(image, tf.uint8), 0)
detections = detect_fn(input_tensor)
bboxes = detections['detection_boxes'][0].numpy()
bclasses = detections['detection_classes'][0].numpy().astype(np.int32)
bscores = detections['detection_scores'][0].numpy()
det_boxes, class_labels = ExtractBBoxes(bboxes, bclasses, bscores, im_width, im_height)
print(class_labels)
Hope this helps.
I'm trying to build the model illustrated in this picture:
I obtained a pre-trained BERT and respective tokenizer from HuggingFace's transformers in the following way:
from transformers import AutoTokenizer, TFBertModel
model_name = "dbmdz/bert-base-italian-xxl-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
bert = TFBertModel.from_pretrained(model_name)
The model will be fed a sequence of Italian tweets and will need to determine whether they are ironic or not.
I'm having problems building the initial part of the model, which takes the inputs and feeds them to the tokenizer in order to get a representation I can feed to BERT.
I can do it outside of the model-building context:
my_phrase = "Ciao, come va?"
# an equivalent version is tokenizer(my_phrase, other parameters)
bert_input = tokenizer.encode(my_phrase, add_special_tokens=True, return_tensors='tf', max_length=110, padding='max_length', truncation=True)
attention_mask = bert_input > 0
outputs = bert(bert_input, attention_mask)['pooler_output']
but I'm having trouble building a model that does this. Here is the code for building such a model (the problem is in the first 4 lines):
def build_classifier_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
    encoder_inputs = tokenizer(text_input, return_tensors='tf', add_special_tokens=True, max_length=110, padding='max_length', truncation=True)
    outputs = bert(encoder_inputs)
    net = outputs['pooler_output']
    X = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))(net)
    X = tf.keras.layers.Concatenate(axis=-1)([X, input_layer])
    X = tf.keras.layers.MaxPooling1D(20)(X)
    X = tf.keras.layers.SpatialDropout1D(0.4)(X)
    X = tf.keras.layers.Flatten()(X)
    X = tf.keras.layers.Dense(128, activation="relu")(X)
    X = tf.keras.layers.Dropout(0.25)(X)
    X = tf.keras.layers.Dense(2, activation='softmax')(X)
    model = tf.keras.Model(inputs=text_input, outputs = X)
    return model
And when I call the function for creating this model I get this error:
text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples).
One thing I thought was that maybe I had to use the tokenizer.batch_encode_plus function which works with lists of strings:
class BertPreprocessingLayer(tf.keras.layers.Layer):
    def __init__(self, tokenizer, maxlength):
        super().__init__()
        self._tokenizer = tokenizer
        self._maxlength = maxlength
    def call(self, inputs):
        print(type(inputs))
        print(inputs)
        tokenized = tokenizer.batch_encode_plus(inputs, add_special_tokens=True, return_tensors='tf', max_length=self._maxlength, padding='max_length', truncation=True)
        return tokenized
def build_classifier_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
    encoder_inputs = BertPreprocessingLayer(tokenizer, 100)(text_input)
    outputs = bert(encoder_inputs)
    net = outputs['pooler_output']
    # ... same as above
but I get this error:
batch_text_or_text_pairs has to be a list (got <class 'keras.engine.keras_tensor.KerasTensor'>)
Besides the fact that I haven't found a way to convert that tensor to a list with a quick Google search, it seems weird that I have to go in and out of TensorFlow in this way.
I've also looked at the HuggingFace documentation, but there is only a single usage example, with a single phrase, and what they do is analogous to my "out of model-building context" example.
EDIT:
I also tried with Lambdas in this way:
tf.executing_eagerly()
def tokenize_tensor(tensor):
    t = tensor.numpy()
    t = np.array([str(s, 'utf-8') for s in t])
    return tokenizer(t.tolist(), return_tensors='tf', add_special_tokens=True, max_length=110, padding='max_length', truncation=True)
def build_classifier_model():
    text_input = tf.keras.layers.Input(shape=(1,), dtype=tf.string, name='text')
    encoder_inputs = tf.keras.layers.Lambda(tokenize_tensor, name='tokenize')(text_input)
    ...
    outputs = bert(encoder_inputs)
but I get the following error:
'Tensor' object has no attribute 'numpy'
EDIT 2:
I also tried the approach suggested by @mdaoust of wrapping everything in a tf.py_function (shown here with the Tout argument I ended up with) and got this error.
def py_func_tokenize_tensor(tensor):
    return tf.py_function(tokenize_tensor, [tensor], Tout=[tf.int32, tf.int32, tf.int32])
eager_py_func() missing 1 required positional argument: 'Tout'
Then I defined Tout as the type of the value returned by the tokenizer:
transformers.tokenization_utils_base.BatchEncoding
and got the following error:
Expected DataType for argument 'Tout' not <class 'transformers.tokenization_utils_base.BatchEncoding'>
Finally I unpacked the value in the BatchEncoding in the following way:
def tokenize_tensor(tensor):
    t = tensor.numpy()
    t = np.array([str(s, 'utf-8') for s in t])
    dictionary = tokenizer(t.tolist(), return_tensors='tf', add_special_tokens=True, max_length=110, padding='max_length', truncation=True)
    # unpacking
    input_ids = dictionary['input_ids']
    tok_type = dictionary['token_type_ids']
    attention_mask = dictionary['attention_mask']
    return input_ids, tok_type, attention_mask
And I get an error in the line below:
...
outputs = bert(encoder_inputs)
ValueError: Cannot take the length of shape with unknown rank.
For now I solved it by taking the tokenization step out of the model:
def tokenize(sentences, tokenizer):
    input_ids, input_masks, input_segments = [],[],[]
    for sentence in sentences:
        inputs = tokenizer.encode_plus(sentence, add_special_tokens=True, max_length=128, pad_to_max_length=True, return_attention_mask=True, return_token_type_ids=True)
        input_ids.append(inputs['input_ids'])
        input_masks.append(inputs['attention_mask'])
        input_segments.append(inputs['token_type_ids'])
    return np.asarray(input_ids, dtype='int32'), np.asarray(input_masks, dtype='int32'), np.asarray(input_segments, dtype='int32')
The model takes two inputs, which are the first two values returned by the tokenize function.
def build_classifier_model():
    input_ids_in = tf.keras.layers.Input(shape=(128,), name='input_token', dtype='int32')
    input_masks_in = tf.keras.layers.Input(shape=(128,), name='masked_token', dtype='int32')
    embedding_layer = bert(input_ids_in, attention_mask=input_masks_in)[0]
    ...
    model = tf.keras.Model(inputs=[input_ids_in, input_masks_in], outputs = X)
    for layer in model.layers[:3]:
        layer.trainable = False
    return model
I'd still like to know if someone has a solution that integrates the tokenization step inside the model-building context, so that a user of the model can simply feed phrases to it to get a prediction or to train the model.
text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples).
Solution to the above error:
Just use text_input = 'text'
instead of
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
It looks like this is not TensorFlow compatible.
https://huggingface.co/dbmdz/bert-base-italian-xxl-cased#model-weights
Currently only PyTorch-Transformers compatible weights are available. If you need access to TensorFlow checkpoints, please raise an issue!
But remember that some things are easier if you don't use Keras's functional model API. That is what the "got <class 'keras.engine.keras_tensor.KerasTensor'>" error is complaining about.
Try passing a tf.Tensor to see if that works.
What happens when you try:
text_input = tf.constant('text')
Try writing your model as a subclass of tf.keras.Model.
Yeah, my first answer was wrong.
The problem is that TensorFlow has two types of tensors: eager tensors, which have a value, and "symbolic tensors" or "graph tensors", which don't have a value and are just used to build up a calculation.
Your tokenize_tensor function expects an eager tensor. Only eager tensors have a .numpy() method.
def tokenize_tensor(tensor):
    t = tensor.numpy()
    t = np.array([str(s, 'utf-8') for s in t])
    return tokenizer(t.tolist(), return_tensors='tf', add_special_tokens=True, max_length=110, padding='max_length', truncation=True)
But a Keras Input is a symbolic tensor.
text_input = tf.keras.layers.Input(shape=(1,), dtype=tf.string, name='text')
encoder_inputs = tf.keras.layers.Lambda(tokenize_tensor, name='tokenize')(text_input)
To fix this, you can use tf.py_function. It works in graph mode, and will call the wrapped function with eager tensors when the graph is executed, instead of passing it the graph-tensors while the graph is being constructed.
def py_func_tokenize_tensor(tensor):
    return tf.py_function(tokenize_tensor, [tensor])
...
encoder_inputs = tf.keras.layers.Lambda(py_func_tokenize_tensor, name='tokenize')(text_input)
I found the question "Use `sentence-transformers` inside of a keras model" and this great article, https://www.philschmid.de/tensorflow-sentence-transformers, which explain how to do what you're trying to achieve.
The first one uses the py_function approach; the second uses tf.keras.Model to wrap everything into model classes.
Hope this helps anyone arriving here in the future.
This is how to use tf.py_function correctly to create a model that takes a string as input:
model_name = "dbmdz/bert-base-italian-xxl-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
bert = TFBertModel.from_pretrained(model_name)
def build_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
    def encode_text(text):
        inputs = [tf.compat.as_str(x) for x in text.numpy().tolist()]
        tokenized = tokenizer(
            inputs,
            return_tensors='tf',
            add_special_tokens=True,
            max_length=110,
            padding='max_length',
            truncation=True)
        return tokenized['input_ids'], tokenized['attention_mask']
    input_ids, attention_mask = tf.py_function(encode_text, inp=[text_input], Tout=[tf.int32, tf.int32])
    input_ids = tf.ensure_shape(input_ids, [None, 110])
    attention_mask = tf.ensure_shape(attention_mask, [None, 110])
    outputs = bert(input_ids, attention_mask)
    net = outputs['last_hidden_state']
    # Some other layers, this part is not important
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True))(net)
    x = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1, name='classifier'))(x)
    return tf.keras.Model(inputs=text_input, outputs=x)
I use last_hidden_state instead of pooler_output because that's where the outputs for each token in the sequence are located. (See the discussion here on the difference between last_hidden_state and pooler_output.) We usually use last_hidden_state when doing token-level classification (e.g. named entity recognition).
Using pooler_output would be even simpler, e.g.:
net = outputs['pooler_output']
x = tf.keras.layers.Dense(1, name='classifier')(net)
return tf.keras.Model(inputs=text_input, outputs=x)
pooler_output can be used in simpler classification problems (like irony detection), but of course it's still possible to use last_hidden_state to create more powerful models. (When you use bert(input_ids_in, attention_mask=input_masks_in)[0] in your solution, it actually returns last_hidden_state.)
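To make the difference in shapes concrete: in BERT, pooler_output is the hidden state of the first ([CLS]) token passed through a dense layer with a tanh activation, so it has one vector per sequence rather than one per token. Here is a rough NumPy sketch of that relationship. The random weights, the hidden size of 768 and all variable names are assumptions for illustration, not values read from the actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, seq_len, hidden_size = 2, 110, 768  # assumed sizes

# Stand-in for outputs['last_hidden_state']: one vector per token.
last_hidden_state = rng.normal(size=(batch_size, seq_len, hidden_size))

# Stand-ins for the pooler's dense-layer weights (random here, learned in BERT).
pooler_kernel = rng.normal(scale=0.02, size=(hidden_size, hidden_size))
pooler_bias = np.zeros(hidden_size)

# The pooler looks only at the first ([CLS]) token of every sequence.
cls_vectors = last_hidden_state[:, 0, :]
pooler_output = np.tanh(cls_vectors @ pooler_kernel + pooler_bias)

print(last_hidden_state.shape)  # (2, 110, 768)
print(pooler_output.shape)      # (2, 768)
```

This is why a Dense layer can sit directly on pooler_output, while last_hidden_state needs a sequence-aware layer (or pooling) before a per-sentence prediction.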
Making sure the model works:
model = build_model()
my_phrase = "Ciao, come va?"
model(tf.constant([my_phrase]))
>>> <tf.Tensor: shape=(1, 110, 1), dtype=float32, numpy=...>,
Making sure the HuggingFace part of the model is trainable:
model.summary(show_trainable=True)
I am trying to convert a tf2.keras model to tflite, but get the following error:
ValueError: Invalid input size: expected 2 items got 1 items.
My network is Siamese: it has 2 inputs that are both fed into the same backbone:
input_shape = (image_size, image_size, 3)
left_input = tf.keras.layers.Input(shape=input_shape, name='left_input')
right_input = tf.keras.layers.Input(shape=input_shape, name='right_input')
# define base model:
general_input = tf.keras.layers.Input(shape=input_shape)
x = build_mobilenet(inputs=general_input) # builds a standard mobilenet model
backbone_model = tf.keras.Model(general_input, x)
# run both examples:
left_features = backbone_model(left_input)
right_features = backbone_model(right_input)
output = tf.keras.layers.Subtract(name='diff')([left_features, right_features])
# continue run some more actions over the output tensor...
During training, my dataset object returns a dictionary of inputs and the label: {'left_input': im_left, 'right_input': im_right}, label
When trying to quantize the model, I have a representative dataset object that returns only the inputs (without the label): return {'left_input': left, 'right_input': right}.
The tflite code used for quantization:
data_generator = DataProvider(num_images=10)
model = tf.keras.models.load_model(float32_model_path, compile=False)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
converter.representative_dataset = data_generator
tflite_model = converter.convert()
The error occurs when calling converter.convert(). Does anyone understand what could be the issue?
Thanks!
I have this code for building a semantic search engine using the pre-trained Universal Sentence Encoder from TensorFlow Hub. I am not able to convert it to tflite. I have saved the model to my directory.
Importing the model:
module_path ="/content/drive/My Drive/4"
%time model = hub.load(module_path)
#print ("module %s loaded" % module_url)
# Create a function for calling the model
def embed(input):
    return model(input)
Training the model on data:
## training the model
Model_USE= embed(data)
Saving the model:
exported = tf.train.Checkpoint(v=tf.Variable(Model_USE))
exported.f = tf.function(
    lambda x: exported.v * x,
    input_signature=[tf.TensorSpec(shape=None, dtype=tf.float32)])
export_dir = "/content/drive/My Drive/"
tf.saved_model.save(exported,export_dir)
Saving works fine, but when I convert to tflite it gives an error.
Conversion code:
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                       tf.lite.OpsSet.SELECT_TF_OPS]
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
Error:
as_list() is not defined on an unknown TensorShape.
First, you need to add a data generator that provides representative inputs for the converter, like this:
def representative_data_gen():
    for input_value in dataset.take(100):
        yield [input_value]
The input value must have shape (1, your_input_shape), as if it had a batch size of 1. It must be yielded as a list; this is mandatory.
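As a small illustration of that shape requirement, here is a sketch of a generator that adds the leading batch dimension and wraps each sample in a list. The array `calibration_samples` and its shape are hypothetical stand-ins for your real data, and NumPy is used only so that the shapes can be inspected without running a conversion.

```python
import numpy as np

# Hypothetical stand-in for the real calibration data: 5 samples of shape (8,).
calibration_samples = np.random.default_rng(0).random((5, 8)).astype(np.float32)


def representative_data_gen():
    """Yield one calibration sample at a time, each wrapped in a list."""
    for sample in calibration_samples:
        # Add the leading batch dimension: (8,) -> (1, 8).
        yield [sample[np.newaxis, ...]]


first = next(representative_data_gen())
print(type(first), first[0].shape, first[0].dtype)
```

The generator function itself (not the result of calling it) is what gets assigned to converter.representative_dataset.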
You should also declare which type of optimization you want, for example:
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
Nevertheless, I have also run into problems with the different converter options depending on the network structure, which I do not know in this case. So, for a clean run of the converter, I would just do:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.experimental_new_converter = True
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tfmodel = converter.convert()
The converter.experimental_new_converter = True setting addresses problems when converting RNNs, as in https://github.com/tensorflow/tensorflow/issues/34813
EDIT:
As explained here: ValueError: None is only supported in the 1st dimension. Tensor 'flatbuffer_data' has invalid shape '[None, None, 1, 512]'. TFLite only allows the first dimension of your data, that is, the batch, to be None. All other dimensions must be fixed. Try padding them with, for example, tf.keras.preprocessing.sequence.pad_sequences.
Then mask your sequences in the network as described in: tensorflow.org/guide/keras/masking_and_padding with Embedding or Masking layers.
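To show what fixing the non-batch dimension means in practice, here is a hand-rolled NumPy sketch that pads or truncates variable-length sequences to one length. The helper `pad_to_fixed_length` is hypothetical and only stands in for pad_sequences; note that it pads at the end, whereas pad_sequences pads (and truncates) at the start by default.

```python
import numpy as np


def pad_to_fixed_length(sequences, max_len, pad_value=0):
    """Right-pad (or truncate) each sequence so every row has `max_len` items."""
    padded = np.full((len(sequences), max_len), pad_value, dtype=np.int32)
    for row, seq in enumerate(sequences):
        trimmed = seq[:max_len]
        padded[row, :len(trimmed)] = trimmed
    return padded


token_ids = [[5, 8, 2], [7], [1, 2, 3, 4, 9, 6]]
fixed = pad_to_fixed_length(token_ids, max_len=4)
print(fixed)
# [[5 8 2 0]
#  [7 0 0 0]
#  [1 2 3 4]]
```

After this step every example has shape (4,), so only the batch dimension is left undetermined.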
I was wondering how to visualize embeddings from a preloaded network in Tensorboard. I'm using FaceNet to create the embeddings for faces, and I have already created the sprite.png and labels.tsv files. As for loading the network and setting up Tensorboard, this is what I have done so far:
1. Load the embedding layer
meta_file, ckpt_file = facenet.get_model_filenames(MODEL_DIR)
with tf.Graph().as_default():
    with tf.Session().as_default() as sess:
        # load the network
        model_dir_exp = os.path.expanduser(MODEL_DIR)
        saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file))
        saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file))
        # setup the lambda function needed to get the embeddings
        images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
        embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
        phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
        find_embeddings = lambda img : sess.run(embeddings, feed_dict = {images_placeholder : img, phase_train_placeholder : False})
2. Find the embeddings
# Pre-allocate the array; it must stay a NumPy array for the slice assignment below.
face_embeddings = np.zeros((n_images, 128))
for i in range(n_batches):
    start = i * batch_size
    end = min((i + 1) * batch_size, n_images)
    # Get the embeddings
    face_embeddings[start:end, :] = find_embeddings(face_images[start:end])
3. Setup Tensorboard
from tensorflow.contrib.tensorboard.plugins import projector
embedding = tf.Variable(tf.zeros([33, 128]), name = "embedding")
config = projector.ProjectorConfig()
embedding_config = config.embeddings.add()
embedding_config.tensor_name = embedding.name
embedding_config.metadata_path = os.path.join(MODEL_DIR, 'labels.tsv')
embedding_config.sprite.image_path = os.path.join(MODEL_DIR,'sprite.png')
embedding_config.sprite.single_image_dim.extend([160, 160])
writer = tf.summary.FileWriter(MODEL_DIR)
projector.visualize_embeddings(writer, config)
However, when I load this in Tensorboard, it says that it can't find the data. I've looked at the FAQ, and when I run find MODEL_DIR | grep tfevents nothing shows up, so I'm guessing that this is the problem. I've looked at the MNIST Tutorial, and it seems that they save checkpoints during the training step, which I don't have because I'm using a pre-trained model. Any ideas as to how I would make Tensorboard show the embeddings in this case?
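One possible route that sidesteps checkpoints altogether, offered only as a sketch: the standalone Embedding Projector can load embeddings from a tab-separated vectors file plus a metadata file. The helper `write_projector_tsv`, the file names and the sample values below are hypothetical; in practice the vectors would be the rows of face_embeddings and the labels would match labels.tsv.

```python
import csv
import os
import tempfile


def write_projector_tsv(vectors, labels, out_dir):
    """Write one tab-separated row per vector and one label per line."""
    vectors_path = os.path.join(out_dir, "vectors.tsv")
    metadata_path = os.path.join(out_dir, "metadata.tsv")
    with open(vectors_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        for vec in vectors:
            writer.writerow([f"{v:.6f}" for v in vec])
    # A single-column metadata file carries no header row.
    with open(metadata_path, "w", newline="") as f:
        for label in labels:
            f.write(f"{label}\n")
    return vectors_path, metadata_path


# Made-up 2-dimensional vectors; real FaceNet embeddings here would have 128 columns.
out_dir = tempfile.mkdtemp()
vec_path, meta_path = write_projector_tsv(
    [[0.1, 0.2], [0.3, 0.4]], ["alice", "bob"], out_dir)
print(vec_path, meta_path)
```

This does not replace the Tensorboard setup above; it is just a way to inspect the vectors while the checkpoint question is open.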