Setting
As mentioned in the title, I have a problem with my custom loss function when trying to load the saved model. My loss looks as follows:
from keras import backend as K

def weighted_cross_entropy(weights):
    # Per-class weights, stored as a backend variable.
    weights = K.variable(weights)

    def loss(y_true, y_pred):
        # Clip predictions so that log(0) never occurs.
        y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
        loss = y_true * K.log(y_pred) * weights
        loss = -K.sum(loss, -1)
        return loss

    return loss

weighted_loss = weighted_cross_entropy([0.1, 0.9])
During training, I used weighted_loss as the loss function and everything worked well. When training finished, I saved the model as an .h5 file with the standard model.save function from the Keras API.
Problem
When I try to load the model via
model = load_model(path, custom_objects={"weighted_loss": weighted_loss})
I get a ValueError telling me that the loss is unknown.
Error
The error message looks as follows:
File "...\predict.py", line 29, in my_script
"weighted_loss": weighted_loss})
File "...\Continuum\anaconda3\envs\processing\lib\site-packages\keras\engine\saving.py", line 419, in load_model
model = _deserialize_model(f, custom_objects, compile)
File "...\Continuum\anaconda3\envs\processing\lib\site-packages\keras\engine\saving.py", line 312, in _deserialize_model
sample_weight_mode=sample_weight_mode)
File "...\Continuum\anaconda3\envs\processing\lib\site-packages\keras\engine\training.py", line 139, in compile
loss_function = losses.get(loss)
File "...\Continuum\anaconda3\envs\processing\lib\site-packages\keras\losses.py", line 133, in get
return deserialize(identifier)
File "...\Continuum\anaconda3\envs\processing\lib\site-packages\keras\losses.py", line 114, in deserialize
printable_module_name='loss function')
File "...\Continuum\anaconda3\envs\processing\lib\site-packages\keras\utils\generic_utils.py", line 165, in deserialize_keras_object
':' + function_name)
ValueError: Unknown loss function:loss
Questions
How can I fix this problem? Could the cause be my wrapped loss definition, so that Keras doesn't know how to handle the weights variable?
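Keras looks the loss up by the name of the function object, and the function returned by your wrapper is named loss (i.e. def loss(y_true, y_pred):), as the error message "Unknown loss function:loss" shows. Therefore, when loading the model back, you need to specify 'loss' as its name: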
model = load_model(path, custom_objects={'loss': weighted_loss})
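To see why the key has to be 'loss', it may help to check what name the wrapper actually hands back. The sketch below is plain Python with no Keras involved: the loss body is a stand-in built on math.log, and lookup is a hypothetical helper that only imitates a name-based search in custom_objects. It is not Keras' real deserialization code.

```python
import math

def weighted_cross_entropy(weights):
    # Stand-in for the Keras version: same wrapper shape, plain-Python math.
    def loss(y_true, y_pred):
        return -sum(t * math.log(p) * w for t, p, w in zip(y_true, y_pred, weights))
    return loss

weighted_loss = weighted_cross_entropy([0.1, 0.9])

# The variable is called weighted_loss, but the function object keeps the
# name it was defined with inside the wrapper.
saved_name = weighted_loss.__name__
print(saved_name)  # prints: loss

def lookup(name, custom_objects):
    # Hypothetical helper: imitates a lookup by name, nothing more.
    if name not in custom_objects:
        raise ValueError("Unknown loss function:" + name)
    return custom_objects[name]

try:
    lookup(saved_name, {"weighted_loss": weighted_loss})
except ValueError as exc:
    print(exc)  # the key does not match the function's own name

found = lookup(saved_name, {"loss": weighted_loss})
```

If you would rather keep the key "weighted_loss", renaming the inner function (or setting loss.__name__ before returning it) should have the same effect, assuming the name is all that is stored.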
For full examples demonstrating saving and loading Keras models with custom loss functions or models, please have a look at the following GitHub gist files:
Custom loss function defined using a wrapper:
https://gist.github.com/ashkan-abbasi66/a81fe4c4d588e2c187180d5bae734fde
Custom loss function defined by subclassing:
https://gist.github.com/ashkan-abbasi66/327efe2dffcf9788847d26de934ef7bd
Custom model:
https://gist.github.com/ashkan-abbasi66/d5a525d33600b220fa7b095f7762cb5b
Note:
I tested the above examples on Python 3.8 with TensorFlow 2.5.
I've been trying to save a YOLO v3 model and then load it back from an h5 file.
When saving, I use a ModelCheckpoint callback with the parameter save_weights_only set to False in order to save the whole model.
However, when I tried to recover the same model using the Keras load_model function, I initially got a "yolo_head function not found" error.
I then tried to pass the function to load_model through the custom objects dictionary, as in:
{"yolo_head":yolo_head}
Now the error becomes "TypeError: list indices must be integers or slices, not list", raised inside the loss function (yolo_loss, line 444) when the model is loaded.
Apparently, the compiled code of the loss function is copied into the h5 file.
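To my understanding, this is what happens to the function of a Lambda layer: its bytecode is stored, but the module-level names it refers to are not. The sketch below reproduces that effect with the standard marshal and types modules only. The names helper, uses_helper and rebuild are hypothetical and stand in for yolo_head, yolo_loss and the loader; this is not Keras' actual loading code.

```python
import marshal
import types

def helper(x):
    # Stand-in for a module-level function such as yolo_head.
    return x * 2

def uses_helper(x):
    # Stand-in for a function that calls another module-level function.
    return helper(x) + 1

# Only the code object is serialized; the globals it refers to are not.
payload = marshal.dumps(uses_helper.__code__)

def rebuild(payload, namespace):
    # Recreate a function from stored bytecode inside the given namespace.
    return types.FunctionType(marshal.loads(payload), namespace)

broken = rebuild(payload, {})
try:
    broken(3)
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__  # helper is missing from the namespace

# Supplying the missing name plays the role of the custom objects dictionary.
fixed = rebuild(payload, {"helper": helper})
result = fixed(3)
```

Under that assumption, every function the stored code calls has to be supplied again at load time, which would explain why yolo_head, tf and box_iou all end up in the dictionary.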
My question is this:
Is there a better or simpler YOLO loss function that I can use, one that does not refer to other functions or that can be easily reloaded?
Thanks in advance,
EDIT 1: Additional Code Snippets,
Keras Checkpoint Callback definition:
checkpoint = ModelCheckpoint(
    os.path.join(log_dir, "checkpoint.h5"),
    monitor="val_loss",
    save_weights_only=False,
    save_best_only=True,
    period=1,
)
Checkpoint added to model training:
history = model.fit_generator(
    data_generator_wrapper(
        lines[:num_train], batch_size, input_shape, anchors, num_classes
    ),
    steps_per_epoch=max(1, num_train // batch_size),
    validation_data=data_generator_wrapper(
        lines[num_train:], batch_size, input_shape, anchors, num_classes
    ),
    validation_steps=max(1, num_val // batch_size),
    epochs=epoch1,
    initial_epoch=0,
    callbacks=[logging, checkpoint],
)
Trying to load the same file 'checkpoint.h5' after pre-training ended:
weights_path = os.path.join(log_dir, "checkpoint.h5")
model = load_model(
    weights_path,
    {"yolo_head": yolo_head, "tf": tf, "box_iou": box_iou, "<lambda>": lambda y_true, y_pred: y_pred},
)
Here is the error stack trace:
File "2_Training/Train_YOLO.py", line 206, in <module>
model = load_model(weights_path, {"yolo_head":yolo_head, "tf":tf,
"box_iou":box_iou,'<lambda>': lambda y_true, y_pred: y_pred})
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/engine/saving.py", line 419, in load_model
model = _deserialize_model(f, custom_objects, compile)
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/engine/saving.py", line 225, in _deserialize_model
model = model_from_config(model_config, custom_objects=custom_objects)
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/engine/saving.py", line 458, in model_from_config
return deserialize(config, custom_objects=custom_objects)
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/layers/__init__.py", line 55, in deserialize
printable_module_name='layer')
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/utils/generic_utils.py", line 145, in deserialize_keras_object
list(custom_objects.items())))
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/engine/network.py", line 1032, in from_config
process_node(layer, node_data)
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/engine/network.py", line 991, in process_node
layer(unpack_singleton(input_tensors), **kwargs)
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/engine/base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "/Users/nkwedi/.pyenv/versions/3.7.5/lib/python3.7/site-packages/keras/layers/core.py", line 687, in call
return self.function(inputs, **arguments)
File "/Users/nkwedi/Documents/MyProjects/Eroscope/EyeDetectionYOLO/2_Training/src/keras_yolo3/yolo3/model.py", line 444, in yolo_loss
anchors[anchor_mask[l]],
TypeError: list indices must be integers or slices, not list
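The last frame gives a hint about the TypeError. One plausible reading, which is an assumption and not something the trace proves, is that anchors was a NumPy array when the model was built and came back as a plain Python list once the layer arguments had been stored in the h5 file. Indexing a plain list with another list raises exactly this error. The sketch below uses pure Python; the anchor values are illustrative and select_anchors is a hypothetical helper, not part of keras_yolo3.

```python
# Illustrative values only; the real anchors come from the training config.
anchors = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
           [59, 119], [116, 90], [156, 198], [373, 326]]
anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]

# A NumPy array accepts a list of indices; a plain list does not.
try:
    anchors[anchor_mask[0]]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # prints: list indices must be integers or slices, not list

def select_anchors(anchors, mask):
    # Hypothetical workaround: pick the rows one index at a time.
    return [anchors[i] for i in mask]

selected = select_anchors(anchors, anchor_mask[0])
```

If that reading is right, converting anchors back to an array at the top of yolo_loss (for example with numpy.asarray) would be another way around it.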
The solution for me was to use a cloud-based training platform like Google Colab.
Here's a link to a working Colab notebook with GPU enabled:
YOLO v3 Google Collab Tutorial
I've trained my YOLOv3 network with darknet to recognize some objects in an image, and it works fine.
I want to use the weight file in an iOS app, so following some tutorials I obtained a Keras h5 model from the darknet weight file, and I checked that the h5 model also works fine.
As the last step, I tried to use coremltools to convert the h5 model into a Core ML model usable in Xcode.
This is where the problem is. The conversion is performed with this little Python script:
import coremltools
....
coreml_model = coremltools.converters.keras.convert('yolorcgz.h5', input_names='image', class_labels=output_labels, image_input_names='image', input_name_shape_dict={'image': [1, 416, 416, 3]})
coreml_model.input_description['image'] = 'Takes a photo'
coreml_model.output_description['output'] = 'Prediction of obj in the photo'
coreml_model.author = 'SW Team'
coreml_model.license = 'Public Domain'
coreml_model.short_description = "YOLOv3 network trained for obj recognition"
coreml_model.save('yolorcgz.mlmodel')
When I run the script, I always get this error:
Traceback (most recent call last):
File "coreml.py", line 9, in <module>
coreml_model = coremltools.converters.keras.convert('yolorcgz.h5',input_name_shape_dict={'input1': [1, 416, 416, 3]})
File "/usr/local/lib/python2.7/dist-packages/coremltools/converters/keras/_keras_converter.py", line 760, in convert
custom_conversion_functions=custom_conversion_functions)
File "/usr/local/lib/python2.7/dist-packages/coremltools/converters/keras/_keras_converter.py", line 556, in convertToSpec
custom_objects=custom_objects)
File "/usr/local/lib/python2.7/dist-packages/coremltools/converters/keras/_keras2_converter.py", line 305, in _convert
raise ValueError(errMsg)
ValueError: Invalid input shape for image.
Please provide a finite height (H), width (W) & channel value (C) using input_name_shape_dict arg with key = 'image' and value = [None,H,W,C]
Converted .mlmodel can be modified to have flexible input shape using coremltools.models.neural_network.flexible_shape_utils
Any ideas on what could be going wrong?
Thanks a lot
I was having this same issue. I fixed it by changing input_name_shape_dict={'image': [1, 416, 416, 3]} to input_name_shape_dict={'image': [None, 416, 416, 3]}.
I am using Keras 2.3.1 and TensorFlow 1.14.0, and this worked for me.
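If several inputs need the same change, the replacement of the batch dimension can be done in one place. This is only a sketch: with_unspecified_batch is a hypothetical helper, not part of coremltools, and it simply swaps the first entry of each shape for None, which is the [None,H,W,C] form the error message asks for.

```python
def with_unspecified_batch(shape_dict):
    # Replace the leading (batch) dimension of every shape with None.
    return {name: [None] + list(shape[1:]) for name, shape in shape_dict.items()}

shapes = with_unspecified_batch({'image': [1, 416, 416, 3]})
print(shapes)  # prints: {'image': [None, 416, 416, 3]}

# The result would then be passed as input_name_shape_dict=shapes in the
# coremltools.converters.keras.convert call from the question.
```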
I am trying to convert a TensorFlow SavedModel to a TFLite model. I am getting the following error from the TensorFlow C++ code during conversion:-
tensorflow/contrib/lite/toco/tooling_util.cc:981] Check failed: name.substr(colon_pos + 1).find_first_not_of("0123456789") == string::npos (1 vs. 18446744073709551615)Array name must only have digits after colon\n'
Here is the source code :-
def convert_to_tflite_util(model_dir, output_dir):
    tag_constants = set(["train"])
    converter = tf.contrib.lite.TocoConverter.from_saved_model(model_dir, tag_set=tag_constants)
    # converter = tf.contrib.lite.TocoConverter.from_saved_model(model_dir)
    tflite_model = converter.convert()
Logs :-
2018-11-09 11:44:11.410085: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:tensorflow:Restoring parameters from /Users/deosingh/code/mldev/sensei-on-device/converters/deeplasso_converter/model/variables/variables
INFO:tensorflow:The given SavedModel MetaGraphDef contains SignatureDefs with the following keys: {'serving_default'}
INFO:tensorflow:input tensors info:
INFO:tensorflow:Tensor's key in saved_model's tensor_map: input
INFO:tensorflow: tensor name: data:0, shape: (-1, 320, 320, 4), type: DT_FLOAT
INFO:tensorflow:output tensors info:
INFO:tensorflow:Tensor's key in saved_model's tensor_map: output
INFO:tensorflow: tensor name: prob:0, shape: (1, 320, 320, 2), type: DT_FLOAT
INFO:tensorflow:Restoring parameters from /Users/deosingh/code/mldev/model/variables/variables
INFO:tensorflow:Froze 387 variables.
INFO:tensorflow:Converted 387 variables to const ops.
Traceback (most recent call last):
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1664, in <module>
main()
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/Users/deosingh/code/mldev/src/tf_to_tflite.py", line 112, in <module>
main()
File "/Users/deosingh/code/mldev/src/tf_to_tflite.py", line 108, in main
convert_to_tflite_util(args.src_dir, args.dest_dir)
File "/Users/deosingh/code/mldev/src/tf_to_tflite.py", line 91, in convert_to_tflite_util
tflite_model = converter.convert()
File "/Users/deosingh/code/mldev/venv/lib/python3.6/site-packages/tensorflow/contrib/lite/python/lite.py", line 439, in convert
**converter_kwargs)
File "/Users/deosingh/code/mldev/venv/lib/python3.6/site-packages/tensorflow/contrib/lite/python/convert.py", line 309, in toco_convert_impl
input_data.SerializeToString())
File "/Users/deosingh/code/mldev/venv/lib/python3.6/site-packages/tensorflow/contrib/lite/python/convert.py", line 109, in toco_convert_protos
(stdout, stderr))
RuntimeError: TOCO failed see console for info.
b'2018-11-09 11:44:18.328373: F tensorflow/contrib/lite/toco/tooling_util.cc:981] Check failed: name.substr(colon_pos + 1).find_first_not_of("0123456789") == string::npos (1 vs. 18446744073709551615)Array name must only have digits after colon\n'
None
Link to tensorflow source which is throwing this error.
The original model was in Caffe and was then converted to TensorFlow using MMdnn. The converter is complaining that a name is not in the expected format, i.e. it doesn't expect _ after : as per the TF source. Please let me know how to resolve or debug this error. Since the TF source that is throwing this error is in C++ and I am invoking it through Python, I am not able to debug it through PyCharm on macOS.
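Since the C++ check cannot be stepped through from PyCharm, one option is to re-create it in Python and run it over the graph's tensor names to find the offenders. The sketch below mirrors only the condition quoted in the error message; splitting at the first colon is an assumption about the surrounding C++ code, and the name "conv1:0_weights" is made up for illustration.

```python
def first_non_digit_after_colon(name):
    # Mirrors: name.substr(colon_pos + 1).find_first_not_of("0123456789")
    # Returns the offending index within the suffix, or None if the name passes.
    colon_pos = name.find(":")  # assumption: the check splits at the first colon
    if colon_pos == -1:
        return None
    suffix = name[colon_pos + 1:]
    for index, char in enumerate(suffix):
        if char not in "0123456789":
            return index
    return None

def find_bad_names(names):
    # Keep only the names that would trip the check.
    return [n for n in names if first_non_digit_after_colon(n) is not None]

# "data:0" and "prob:0" appear in the logs; "conv1:0_weights" is hypothetical.
bad = find_bad_names(["data:0", "prob:0", "conv1:0_weights"])
print(bad)  # prints: ['conv1:0_weights']
```

The "1 vs. 18446744073709551615" part of the message would be consistent with a first non-digit at index 1 after the colon, as in the hypothetical name above, though that is an inference from the message alone.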