I have trained an object detection model using the TensorFlow Object Detection API by following the steps provided in this official tutorial. By the end of the whole process, as described in the exporting step, I have my model saved in the following format.
my_model/
├─ checkpoint/
├─ saved_model/
└─ pipeline.config
My question is, once the model has been saved to such a format, how can I load it and use it to make detections?
I am able to successfully do that with the training checkpoints by using the code below. And it is after that point (where I load the checkpoint that generated the best result) that I export the model.
# Load pipeline config and build a detection model
configs = config_util.get_configs_from_pipeline_file(PATH_TO_PIPELINE_CONFIG)
model_config = configs['model']
detection_model = model_builder.build(model_config=model_config, is_training=False)
# Restore checkpoint
ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
ckpt.restore(PATH_TO_CKPT).expect_partial()
However, in production, I am not looking to use those checkpoints. I am looking to load the model from the exported format.
I have tried the following command to load the exported model, but I have had no luck. It returns no errors and I can use the model variable below to make detections, but the output (bounding boxes, classes, scores) is incorrect, which leads me to believe there are some steps missing in the loading process.
model = tf.saved_model.load(path_to_exported_model)
Any tips?
Ok, as it turns out, the code is correct. I ran a test with another model (which is also an EfficientDet) and the code worked. It seems something went wrong when the original model was exported, which I am still trying to figure out.
To those looking for an answer, here's the full code for loading and using a saved model.
# Loading saved model.
model = tf.saved_model.load(path_to_exported_model)
# Pre-processing image.
image = tf.image.decode_image(open(path_to_image, 'rb').read(), channels=3)
image = tf.expand_dims(image, 0)
image = tf.image.resize(image, (size_expected_by_model, size_expected_by_model))
# The exported model expects a tf.uint8 tensor; after tf.image.resize the values are tf.float32
# but still in [0, 255], so cast back instead of rescaling.
image = tf.cast(image, tf.uint8)
# Executing object detection.
detections = model(image)
# Formatting returned detections.
num_detections = int(detections.pop('num_detections'))
detections = {key: value[0, :num_detections].numpy()
              for key, value in detections.items()}
detections['num_detections'] = num_detections
detections['detection_classes'] = detections['detection_classes'].astype(np.int64)
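If you also want to draw the results on the image, here is a minimal sketch using the Object Detection API's visualization utilities (this assumes you have a category_index built from your label map, as in the answers below; the score threshold is just an example):
from object_detection.utils import visualization_utils as viz_utils

image_np = image[0].numpy().copy()  # the uint8 batch built above
# Draws boxes, class names and scores in place on image_np.
viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np,
    detections['detection_boxes'],
    detections['detection_classes'],
    detections['detection_scores'],
    category_index,
    use_normalized_coordinates=True,
    min_score_thresh=0.5)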
Check this link: Abdul Rehman has a few Python scripts to run detection with saved_models, for inference on both images and videos. I use the code extensively to check detection with saved_models from the TF2 Model Zoo, as well as with models trained on custom datasets.
https://github.com/abdelrahman-gaber/tf2-object-detection-api-tutorial
Looking at what @Matheus Correia posted, I slightly modified his answer to suit what I was doing (in 2022) in Google Colab.
category_index = your_generated_label_map
# e.g. category_index = {1: {'id': 1, 'name': 'tomato'}, 2: {'id': 2, 'name': 'egg'}, 3: {'id': 3, 'name': 'potato'}, 4: {'id': 4, 'name': 'broccoli'}, 5: {'id': 5, 'name': 'beef'}, 6: {'id': 6, 'name': 'chicken'}}
# set your own threshold here
Threshold = 0.5
def ExtractBBoxes(bboxes, bclasses, bscores, im_width, im_height):
    bbox = []
    class_labels = []
    for idx in range(len(bboxes)):
        if bscores[idx] >= Threshold:
            y_min = int(bboxes[idx][0] * im_height)
            x_min = int(bboxes[idx][1] * im_width)
            y_max = int(bboxes[idx][2] * im_height)
            x_max = int(bboxes[idx][3] * im_width)
            class_label = category_index[int(bclasses[idx])]['name']
            class_labels.append(class_label)
            bbox.append([x_min, y_min, x_max, y_max, class_label, float(bscores[idx])])
    return (bbox, class_labels)
# @Matheus Correia's code, slightly modified
# Loading saved model.
detect_fn = tf.saved_model.load("--saved model folder's path--")
# model = tf.saved_model.load("--saved model folder's path--")
# Pre-processing image.
image = tf.image.decode_image(open(IMAGE_PATH, 'rb').read(), channels=3)
image = tf.image.resize(image, (height, width))  # tf.image.resize takes (new_height, new_width)
im_height, im_width, _ = image.shape
# Model expects a batched tf.uint8 tensor; tf.image.resize returns tf.float32, so cast back before batching.
input_tensor = np.expand_dims(tf.cast(image, tf.uint8).numpy(), 0)
detections = detect_fn(input_tensor)
bboxes = detections['detection_boxes'][0].numpy()
bclasses = detections['detection_classes'][0].numpy().astype(np.int32)
bscores = detections['detection_scores'][0].numpy()
det_boxes, class_labels = ExtractBBoxes(bboxes, bclasses, bscores, im_width, im_height)
print(class_labels)
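If you want to sanity-check the boxes visually, a quick sketch with OpenCV (assuming the same IMAGE_PATH and the det_boxes returned above, which are already in pixel coordinates of the resized image):
import cv2

img = cv2.imread(IMAGE_PATH)
img = cv2.resize(img, (im_width, im_height))  # match the size the boxes were computed for
for x_min, y_min, x_max, y_max, label, score in det_boxes:
    cv2.rectangle(img, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
    cv2.putText(img, f"{label}: {score:.2f}", (x_min, max(y_min - 5, 0)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
cv2.imwrite('detections.jpg', img)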
Hope this helps.
Related
I am trying to convert and quantize a model trained with the Object Detection API v2 so that it runs on a Coral Dev Board.
It seems like there is still a big problem with exporting Object Detection models to TFLite, but I hope someone has some advice for me.
My converter looks like the following, and I am trying to convert "SSD MobileNet v2 320x320" from the TF2 Model Zoo:
def convertModel(input_dir, output_dir, pipeline_config="", checkpoint: int = -1, quantization=False):
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    files = os.listdir(input_dir)
    if pipeline_config == "":
        pipeline_config = [pipe for pipe in files if pipe.endswith(".config")][0]
    pipeline_config_path = os.path.join(input_dir, pipeline_config)

    # Find latest or given checkpoint
    checkpoint_file = ""
    checkpointDir = os.path.join(input_dir, 'checkpoint')
    for chck in sorted(os.listdir(checkpointDir)):
        if chck.endswith(".index"):
            checkpoint_file = chck[:-6]
            # Stop search when the requested was found
            if chck.endswith(str(checkpoint)):
                break
    print("#####################################")
    print(checkpoint_file)
    print("#####################################")
    # ckeckpint_file = [chck for chck in files if chck.endswith(f"{checkpoint}.meta")][0]
    trained_checkpoint_prefix = os.path.join(checkpointDir, checkpoint_file)

    configs = config_util.get_configs_from_pipeline_file(pipeline_config_path)
    detection_model = model_builder.build(configs['model'], is_training=False)
    ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
    ckpt.restore(trained_checkpoint_prefix).expect_partial()

    class MyModel(tf.keras.Model):
        def __init__(self, model):
            super(MyModel, self).__init__()
            self.model = model
            self.seq = tf.keras.Sequential([
                tf.keras.Input([300, 300, 3], 1),
            ])

        def call(self, x):
            x = self.seq(x)
            images, shapes = self.model.preprocess(x)
            prediction_dict = self.model.predict(images, shapes)
            detections = self.model.postprocess(prediction_dict, shapes)
            return detections

    km = MyModel(detection_model)
    y = km.predict(np.random.random((1, 300, 300, 3)).astype(np.float32))

    converter = tf.lite.TFLiteConverter.from_keras_model(km)
    if quantization:
        converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
        converter.target_spec.supported_ops = [tf.lite.OpsSet.SELECT_TF_OPS, tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.representative_dataset = _genDataset
    else:
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
    converter.experimental_new_converter = True
    converter.allow_custom_ops = True
    tflite_model = converter.convert()
    open(os.path.join(output_dir, 'model.tflite'), 'wb').write(tflite_model)
My data generator loads about 100 images downloaded from the COCO dataset to generate sample inputs:
def _genDataset():
    sampleDir = os.path.join("Dataset", "Coco")
    for i in os.listdir(sampleDir):
        image = cv2.imread(os.path.join(sampleDir, i))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = cv2.resize(image, (300, 300))
        image = image.astype("float")
        image = np.expand_dims(image, axis=1)
        image = image.reshape(1, 300, 300, 3)
        yield [image.astype("float32")]
I tried to run the code with TF 2.2.0, which returned
RuntimeError: Max and min for dynamic tensors should be recorded during calibration
According to what I found, updating to TF 2.3.0 should help, but it then returns
<unknown>:0: error: failed while converting: 'main': Ops that can be supported by the flex runtime (enabled via setting the -emit-select-tf-ops flag):
tf.Size {device = ""}
I also tested tf-nightly (2.4.0), which again returns
RuntimeError: Max and min for dynamic tensors should be recorded during calibration
Right now this tf.Size operator seems to be the reason why I can't convert the model, because when I allow custom operations I can convert it to TFLite.
Sadly that is not a solution for me, because the Coral converter and my interpreter can't use a model with a missing custom op.
Does someone know if there is a possibility to remove this op in postprocessing or just ignore it during conversion?
Just converting it to TFLite without quantization, using tf.lite.OpsSet.TFLITE_BUILTINS, works without problems.
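For what it's worth, the Object Detection API also provides an export_tflite_graph_tf2.py script that produces a TFLite-friendly SavedModel for SSD models, which can then be passed to TFLiteConverter.from_saved_model. A rough sketch of that alternative route (paths are placeholders, and this is not a direct fix for the tf.Size issue):
# python object_detection/export_tflite_graph_tf2.py \
#     --pipeline_config_path my_model/pipeline.config \
#     --trained_checkpoint_dir my_model/checkpoint \
#     --output_directory my_model/tflite_export

converter = tf.lite.TFLiteConverter.from_saved_model('my_model/tflite_export/saved_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = _genDataset  # inputs may need scaling to the model's expected range
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()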
I have a dataset of tfrecords that I'm trying to parse.
I am using this code to parse it:
image_size = [224,224]
def read_tfrecord(tf_record):
    features = {
        "filename": tf.io.FixedLenFeature([], tf.string),  # tf.string means bytestring
        "fun": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.VarLenFeature(tf.int64),
    }
    tf_record = tf.parse_single_example(tf_record, features)
    filename = tf.image.decode_jpeg(tf_record['filename'], channels=3)
    filename = tf.cast(filename, tf.float32) / 255.0  # convert image to floats in [0, 1] range
    filename = tf.reshape(filename, [*image_size, 3])  # explicit size will be needed for TPU
    label = tf.cast(tf_record['label'], tf.float32)
    return filename, label
def load_dataset(filenames):
    option_no_order = tf.data.Options()
    option_no_order.experimental_deterministic = False
    dataset = tf.data.Dataset.from_tensor_slices(filenames)
    dataset = dataset.with_options(option_no_order)
    # dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=16)
    dataset = dataset.interleave(tf.data.TFRecordDataset, cycle_length=32, num_parallel_calls=AUTO)  # faster
    dataset = dataset.map(read_tfrecord, num_parallel_calls=AUTO)
    return dataset
train_data=load_dataset(train_filenames)
val_data=load_dataset(val_filenames)
test_data=load_dataset(test_filenames)
After running this code I get:
train_data
<DatasetV1Adapter shapes: ((224, 224, 3), (?,)), types: (tf.float32, tf.float32)>
I was trying to see the images in the dataset with:
def display_9_images_from_dataset(dataset):
    subplot = 331
    plt.figure(figsize=(13, 13))
    images, labels = dataset_to_numpy_util(dataset, 9)
    for i, image in enumerate(images):
        title = CLASSES[np.argmax(labels[i], axis=-1)]
        subplot = display_one_flower(image, title, subplot)
        if i >= 8:
            break
    plt.tight_layout()
    plt.subplots_adjust(wspace=0.1, hspace=0.1)
    plt.show()
def dataset_to_numpy_util(dataset, N):
    dataset = dataset.batch(N)
    if tf.executing_eagerly():
        # In eager mode, iterate over the Dataset directly.
        for images, labels in dataset:
            numpy_images = images.numpy()
            numpy_labels = labels.numpy()
            break
    else:
        # In non-eager mode, get the TF node that yields
        # the next item and run it in a tf.Session.
        get_next_item = dataset.make_one_shot_iterator().get_next()
        with tf.Session() as ses:
            numpy_images, numpy_labels = ses.run(get_next_item)
    return numpy_images, numpy_labels
display_9_images_from_dataset(train_data)
But I get the error:
InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got unknown format starting with 'B2.jpg'
[[{{node DecodeJpeg}}]]
[[IteratorGetNext_3]]
I'm a bit confused, first because it says the file is in jpg format while it asks for jpeg, which to my understanding are the same thing.
And also because I'm not sure how to view the images or even know if I parsed it correctly.
Extensions ".jpg" and ".jpeg" are different in terms of the validation check done by the API which is consuming it.
tf.image.decode_jpeg takes images with ".jpeg" extensions.
Try renaming your .jpg images with .jpeg extensions and it should start working.
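If you want to check what is actually stored in the records (and whether you parsed them correctly), here is a minimal sketch for inspecting one raw example, assuming eager execution (TF 2.x, or tf.enable_eager_execution() in TF 1.x) and the same train_filenames list as above:
import tensorflow as tf

raw_dataset = tf.data.TFRecordDataset(train_filenames)
for raw_record in raw_dataset.take(1):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    # Print each feature key and the beginning of its value.
    for key, feature in example.features.feature.items():
        print(key, str(feature)[:100])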
I'm trying to prepare my custom Keras model for deployment with TensorFlow Serving, but I'm running into issues with preprocessing my images.
When I train my model I use the following functions to preprocess my images:
def process_image_from_tf_example(self, image_str_tensor, n_channels=3):
    image = tf.image.decode_image(image_str_tensor)
    image.set_shape([256, 256, n_channels])
    image = tf.cast(image, tf.float32) / 255.0
    return image

def read_and_decode(self, serialized):
    parsed_example = tf.parse_single_example(serialized=serialized, features=self.features)
    input_image = self.process_image_from_tf_example(parsed_example["image_raw"], 3)
    ground_truth_image = self.process_image_from_tf_example(parsed_example["gt_image_raw"], 1)
    return input_image, ground_truth_image
My images are PNGs saved locally, and when I write them to the .tfrecord files I use
tf.gfile.GFile(str(image_path), 'rb').read()
This works, I'm able to train my model and use it for local predictions.
Now I want to deploy my model to be used with Tensorflow Serving. My serving_input_receiver_fn function looks like this:
def serving_input_receiver_fn(self):
    input_ph = tf.placeholder(dtype=tf.string, shape=[None], name='image_bytes')
    images_tensor = tf.map_fn(self.process_image_from_tf_example, input_ph, back_prop=False, dtype=tf.float32)
    return tf.estimator.export.ServingInputReceiver({'input_1': images_tensor}, {'image_bytes': input_ph})
where process_image_from_tf_example is the same function as above, but I get the following error:
InvalidArgumentError (see above for traceback): assertion failed: [Unable to decode bytes as JPEG, PNG, GIF, or BMP]
Reading here, it looks like this error is due to the fact that I'm not using
tf.gfile.GFile(str(image_path), 'rb').read()
as with my training/test files, but I can't use it because I need to send encoded bytes formatted as
{"image_bytes": {'b64': base64.b64encode(image).decode()}}
as requested by TF Serving.
Examples online send JPEG encoded bytes and preprocess the image starting with
tf.image.decode_jpeg(image_buffer, channels=3)
but if I use a different preprocessing function in my serving_input_receiver_fn (different from the one used for training) that starts with
tf.image.decode_png(image_buffer, channels=3)
i get the following error:
InvalidArgumentError (see above for traceback): Expected image (JPEG, PNG, or GIF), got unknown format starting with 'AAAAAAAAAAAAAAAA'
(the same happens with decode_jpeg, by the way)
What am I doing wrong? Do you need more code from me to answer? Thanks a lot!
Edit!!
Changed the title because it was not clear enough
OK I solved it.
image was a numpy array, but I had to do the following:
buffer = cv2.imencode('.jpg', image)[1].tostring()
bytes_image = base64.b64encode(buffer).decode('ascii')
{"image_bytes": {"b64": bytes_image}}
Also, my preprocessing and serving_input_receiver_fn functions changed:
def process_image_from_buffer(self, image_buffer):
    image = tf.image.decode_jpeg(image_buffer, channels=3)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    image = tf.expand_dims(image, 0)
    image = tf.image.resize_bilinear(image, [256, 256], align_corners=False)
    image = tf.squeeze(image, [0])
    image = tf.cast(image, tf.float32) / 255.0
    return image

def serving_input_receiver_fn(self):
    input_ph = tf.placeholder(dtype=tf.string, shape=[None])
    images_tensor = tf.map_fn(self.process_image_from_buffer, input_ph, back_prop=False, dtype=tf.float32)
    return tf.estimator.export.ServingInputReceiver({'input_1': images_tensor}, {'image_bytes': input_ph})
process_image_from_buffer is different than process_image_from_tf_example used above for training.
I also removed name='image_bytes' from input_ph above.
Hope it's clear enough to help someone else.
Excellent guide partially used for solving it
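For completeness, this is roughly how the payload described above can be posted to TensorFlow Serving's REST API (a sketch; the model name, host, and port are assumptions):
import json
import requests

payload = {"instances": [{"image_bytes": {"b64": bytes_image}}]}
response = requests.post("http://localhost:8501/v1/models/my_model:predict",  # hypothetical endpoint
                         data=json.dumps(payload))
print(response.json())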
I want to use TensorFlow's optimize_for_inference.py script on a frozen model from the model zoo: ssd_mobilenet_v1_coco.
How do I find/determine the names of the input and output nodes of the model?
[Hi-res version of the graph generated by TensorBoard]
This question might help: Given a tensor flow model graph, how to find the input node and output node names (for me it did not)
I think you can do it using the following code. I downloaded the ssd_mobilenet_v1_coco frozen model from here and was able to get the input and output names as shown below.
!pip install tensorflow==1.15.5
import tensorflow as tf
tf.__version__ # TF1.15.5
gf = tf.GraphDef()
m_file = open('/content/frozen_inference_graph.pb','rb')
gf.ParseFromString(m_file.read())
with open('somefile.txt', 'a') as the_file:
    for n in gf.node:
        the_file.write(n.name + '\n')

file = open('somefile.txt', 'r')
data = file.readlines()
print("output name = ")
print(data[len(data)-1])
print("Input name = ")
file.seek(0)
print(file.readline())
Output is
output name =
detection_classes
Input name =
image_tensor
Please check the gist here.
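As a side note, if you are working with the SavedModel format instead of a frozen graph, the saved_model_cli tool that ships with TensorFlow can list the signature inputs and outputs directly (the path here is a placeholder):
!saved_model_cli show --dir /content/my_model/saved_model --tag_set serve --signature_def serving_default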
All models saved using the TensorFlow Object Detection API have image_tensor as the input node name.
The object detection model has 4 outputs:
num_detections: the number of detections for a given image
detection_classes: the predicted class index for each detection (look it up in the label map the model was trained on)
detection_boxes: the predicted (ymin, xmin, ymax, xmax) coordinates for each detection
detection_scores: the confidence score for each detection; low-scoring detections are usually filtered out with a threshold
Code for saved_model inference:
from io import BytesIO

import numpy as np
import tensorflow as tf
from PIL import Image

def load_image_into_numpy_array(path):
    'Converts an image into a numpy array'
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    im_width, im_height = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

# Load saved_model
model = tf.saved_model.load('custom_model/saved_model')

# Convert image into numpy array
numpy_image = load_image_into_numpy_array('Image_path')

# Expand dimensions
input_tensor = np.expand_dims(numpy_image, 0)

# Send image to the model
model_output = model(input_tensor)

# Use the output nodes to get the predictions
num_detections = int(model_output.pop('num_detections'))
detections = {key: value[0, :num_detections].numpy()
              for key, value in model_output.items()}
detections['num_detections'] = num_detections
detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

boxes = detections['detection_boxes']
scores = detections['detection_scores']
pred_class = detections['detection_classes']
You can just do model.summary() to see all the layer names (and also their types); the name is the first column.
I was wondering how to visualize, in TensorBoard, embeddings that come from a pre-trained network. I'm using FaceNet to create the embeddings for faces, and I have already created the sprite.png and labels.tsv files. As for loading the network and setting it up for TensorBoard, this is what I have done so far:
1. Load the embedding layer
meta_file, ckpt_file = facenet.get_model_filenames(MODEL_DIR)

with tf.Graph().as_default():
    with tf.Session().as_default() as sess:
        # load the network
        model_dir_exp = os.path.expanduser(MODEL_DIR)
        saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file))
        saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file))

        # set up the lambda function needed to get the embeddings
        images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
        embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
        phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
        find_embeddings = lambda img: sess.run(embeddings, feed_dict={images_placeholder: img, phase_train_placeholder: False})
2. Find the embeddings
face_embeddings = np.zeros((n_images, 128))
for i in range(n_batches):
    start = i * batch_size
    end = min((i + 1) * batch_size, n_images)
    # Get the embeddings
    face_embeddings[start:end, :] = find_embeddings(face_images[start:end])
3. Setup Tensorboard
from tensorflow.contrib.tensorboard.plugins import projector
embedding = tf.Variable(tf.zeros([33, 128]), name = "embedding")
config = projector.ProjectorConfig()
embedding_config = config.embeddings.add()
embedding_config.tensor_name = embedding.name
embedding_config.metadata_path = os.path.join(MODEL_DIR, 'labels.tsv')
embedding_config.sprite.image_path = os.path.join(MODEL_DIR,'sprite.png')
embedding_config.sprite.single_image_dim.extend([160, 160])
writer = tf.summary.FileWriter(MODEL_DIR)
projector.visualize_embeddings(writer, config)
However, when I load this in TensorBoard it says that it can't find the data. I've looked at the FAQ, and when I run find MODEL_DIR | grep tfevents nothing shows up, so I'm guessing that this is the problem. I've looked at the MNIST tutorial, and it seems that they save checkpoints during the training step, whereas I don't have any because I'm using a pre-trained model. Any ideas as to how I can make TensorBoard show the embeddings in this case?
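For what it's worth, the TensorBoard projector loads embedding values from a TensorFlow checkpoint in the log directory (an events file alone is not enough), so a likely missing step is assigning the computed embeddings to the embedding variable and saving a checkpoint. A minimal sketch under that assumption (TF 1.x, matching the code above; the variable's shape must match face_embeddings):
# Assign the computed FaceNet embeddings to the `embedding` variable from step 3
# and save a checkpoint in MODEL_DIR, which is what the projector actually reads.
with tf.Session() as sess:
    sess.run(embedding.assign(face_embeddings.astype(np.float32)))
    saver = tf.train.Saver([embedding])
    saver.save(sess, os.path.join(MODEL_DIR, "embedding.ckpt"))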