Multiple outputs in Keras gives value error - tensorflow

I am implementing a modification of the U-net for semantic segmentation.
I have two outputs from the network :
model = Model(input=inputs, output= [conv10, dense3])
model.compile(optimizer=Adam(lr=1e-5), loss=common_loss, metrics=[common_loss])
where common loss is defined as :
def common_loss(y_true, y_pred):
segmentation_loss = categorical_crossentropy(y_true[0], y_pred[0])
classifiction_loss = categorical_crossentropy(y_true[1], y_pred[1])
return segmentation_loss + alpha * classifiction_loss
When I run this I get an value error as:
File "y-net.py", line 138, in <module>
train_and_predict()
File "y-net.py", line 133, in train_and_predict
callbacks=[model_checkpoint], validation_data=(X_val, [y_img_val, y_class_val]))
File "/home/gpu_users/meetshah/miniconda2/envs/check/lib/python2.7/site-packages/keras/engine/training.py", line 1124, in fit
callback_metrics=callback_metrics)
File "/home/gpu_users/meetshah/miniconda2/envs/check/lib/python2.7/site-packages/keras/engine/training.py", line 848, in _fit_loop
callbacks.on_batch_end(batch_index, batch_logs)
File "/home/gpu_users/meetshah/miniconda2/envs/check/lib/python2.7/site-packages/keras/callbacks.py", line 63, in on_batch_end
callback.on_batch_end(batch, logs)
File "/home/gpu_users/meetshah/miniconda2/envs/check/lib/python2.7/site-packages/keras/callbacks.py", line 191, in on_batch_end
self.progbar.update(self.seen, self.log_values)
File "/home/gpu_users/meetshah/miniconda2/envs/check/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 147, in update
if abs(avg) > 1e-3:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
My implementation and the entire trace can be found here :
https://gist.github.com/meetshah1995/19d54270e8d1b20f814e6c1495facc6a

You can see how to implement multiple metrics with multiple outputs here: https://github.com/EdwardTyantov/ultrasound-nerve-segmentation/blob/master/u_model.py.
model.compile(optimizer=optimizer,
loss={'main_output': dice_coef_loss, 'aux_output': 'binary_crossentropy'},
metrics={'main_output': dice_coef, 'aux_output': 'acc'},
loss_weights={'main_output': 1., 'aux_output': 0.5})
I am not sure, if combined output metrics are supported yet.

Related

Argument must be a string or a number, not 'ExponentialDecay'

I am on Tensorflow 2.4.0, and tried to perform Exponential decay on the learning rate as follows:
learning_rate_scheduler = tf.keras.optimizers.schedules.ExponentialDecay(initial_learning_rate=0.1, decay_steps=1000, decay_rate=0.97, staircase=False)
and start the learning rate of my optimizer with such decay method:
optimizer_to_use = Adam(learning_rate=learning_rate_scheduler)
the model is compiled as follows
model.compile(loss=metrics.contrastive_loss, optimizer=optimizer_to_use, metrics=[accuracy])
The train goes well until the third epoch, where the following error is showed:
File "train_contrastive_siamese_network_inception.py", line 163, in run_experiment
history = model.fit([pairTrain[:, 0], pairTrain[:, 1]], labelTrain[:], validation_data=([pairTest[:, 0], pairTest[:, 1]], labelTest[:]), batch_size=config.BATCH_SIZE, epochs=config.EPOCHS, callbacks=callbacks)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1145, in fit
callbacks.on_epoch_end(epoch, epoch_logs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/callbacks.py", line 432, in on_epoch_end
callback.on_epoch_end(epoch, numpy_logs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/callbacks.py", line 2542, in on_epoch_end
old_lr = float(K.get_value(self.model.optimizer.lr))
TypeError: float() argument must be a string or a number, not 'ExponentialDecay'
I checked this issue was even raised in the official keras Forum, but no success even there. Plus, the documentation clearly states that:
A LearningRateSchedule instance can be passed in as the learning_rate argument of any optimizer.
What could be the issue?

Calling `Model.predict` in graph mode is not supported when the `Model` instance was constructed with eager mode enabled

So I just followed someone project and make it to here when I got this error:
[2020-10-12 15:33:21,128] ERROR in app: Exception on /predict/ [POST]
Traceback (most recent call last):
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\flask\app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\flask\app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\flask\app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\flask\_compat.py", line 39, in reraise
raise value
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\flask\app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\flask\app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "D:\Ngoding Python\Skripsi\deploy\app.py", line 70, in predict
out = model.predict(img)
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\training.py", line 130, in _method_wrapper
return method(self, *args, **kwargs)
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1562, in predict
version_utils.disallow_legacy_graph('Model', 'predict')
File "c:\users\mr777\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\utils\version_utils.py", line 122, in disallow_legacy_graph
raise ValueError(error_msg)
ValueError: Calling `Model.predict` in graph mode is not supported when the `Model` instance was constructed with eager mode enabled. Please construct your `Model` instance in graph mode or call `Model.predict` with eager mode enabled.
Here's the code I wrote:
with graph.as_default():
# perform the prediction
out = model.predict(img)
print(out)
print(class_names[np.argmax(out)])
# convert the response to a string
response = class_names[np.argmax(out)]
return str(response)
any idea with this? because I found the same question here
The answer is simple, just load your model inside the graph just like this:
with graph.as_default():
json_file = open('models/model.json','r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
#load weights into new model
loaded_model.load_weights("models/model.h5")
print("Loaded Model from disk")
#compile and evaluate loaded model
loaded_model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
# perform the prediction
out = loaded_model.predict(img)
print(out)
print(class_names[np.argmax(out)])
# convert the response to a string
response = class_names[np.argmax(out)]
return str(response)
#Ilham: Try to wrap the call method in a tf.function, right after defining your network. Something like this:
model = Sequential()
model.call = tf.function(model.call)
I had an issue similar to yours. I solved it just by adding that second line of code.
See the following link for more details: https://www.tensorflow.org/guide/intro_to_graphs

TPUEstimator error -- AttributeError: module 'tensorflow.contrib.tpu.python.ops.tpu_ops' has no attribute 'cross_replica_sum'

I have written a tensorflow code using the TPUEstimator, but I am having problems running it in use_tpu=False mode. I would like to run it on my local computer to make sure that all the operations are TPU-compatible. The code works fine with the normal Estimator. Here is my master code:
import logging
from tensorflow.contrib.tpu.python.tpu import tpu_config, tpu_estimator, tpu_optimizer
from tensorflow.contrib.cluster_resolver import TPUClusterResolver
from capser_7_model_fn import *
from capser_7_input_fn import *
import subprocess
from absl import flags
flags.DEFINE_bool(
'use_tpu', False,
'Use TPUs rather than plain CPUs')
tf.flags.DEFINE_string(
"tpu", default='$TPU_NAME',
help="The Cloud TPU to use for training. This should be either the name "
"used when creating the Cloud TPU, or a grpc://ip.address.of.tpu:8470 "
"url.")
tf.flags.DEFINE_string("model_dir", LOGDIR, "Estimator model_dir")
flags.DEFINE_integer(
'save_checkpoints_secs', 1000,
'Interval (in seconds) at which the model data '
'should be checkpointed. Set to 0 to disable.')
flags.DEFINE_integer(
'save_summary_steps', 100,
'Number of steps which must have run before showing summaries.')
tf.flags.DEFINE_integer("iterations", 1000,
"Number of iterations per TPU training loop.")
tf.flags.DEFINE_integer("num_shards", 8, "Number of shards (TPU chips).")
tf.flags.DEFINE_integer("batch_size", 1024,
"Mini-batch size for the training. Note that this "
"is the global batch size and not the per-shard batch.")
FLAGS = tf.flags.FLAGS
if FLAGS.use_tpu:
my_project_name = subprocess.check_output(['gcloud', 'config', 'get-value', 'project'])
my_zone = subprocess.check_output(['gcloud', 'config', 'get-value', 'compute/zone'])
cluster_resolver = TPUClusterResolver(
tpu=[FLAGS.tpu],
zone=my_zone,
project=my_project_name)
master = TPUClusterResolver(tpu=[os.environ['TPU_NAME']]).get_master()
else:
master = ''
my_tpu_run_config = tpu_config.RunConfig(
master=master,
model_dir=FLAGS.model_dir,
save_checkpoints_secs=FLAGS.save_checkpoints_secs,
save_summary_steps=FLAGS.save_summary_steps,
session_config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True),
tpu_config=tpu_config.TPUConfig(iterations_per_loop=FLAGS.iterations, num_shards=FLAGS.num_shards),
)
# create estimator for model (the model is described in capser_7_model_fn)
capser = tpu_estimator.TPUEstimator(model_fn=model_fn_tpu,
config=my_tpu_run_config,
use_tpu=FLAGS.use_tpu,
train_batch_size=batch_size,
params={'model_batch_size': batch_size_per_shard})
# train model
logging.getLogger().setLevel(logging.INFO) # to show info about training progress
capser.train(input_fn=train_input_fn_tpu, steps=n_steps)
I have a capsule network defined in model_fn_tpu, which returns the TPUEstimator spec. The optimizer is a standard AdamOptimizer. I have made all the changes explained here https://www.tensorflow.org/guide/using_tpu#optimizer to make my code compatible with TPUEstimator. I get the following error:
Traceback (most recent call last):
File "C:/Users/doerig/PycharmProjects/capser/TPU_playground.py", line 85, in <module>
capser.train(input_fn=train_input_fn_tpu, steps=n_steps)
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 363, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 843, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 856, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 831, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2016, in _model_fn
features, labels, is_export_mode=is_export_mode)
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 1121, in call_without_tpu
return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 1317, in _call_model_fn
estimator_spec = self._model_fn(features=features, **kwargs)
File "C:\Users\doerig\PycharmProjects\capser\capser_7_model_fn.py", line 101, in model_fn_tpu
**output_decoder_deconv_params)
File "C:\Users\doerig\PycharmProjects\capser\capser_model.py", line 341, in capser_model
loss_training_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step(), name="training_op")
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\training\optimizer.py", line 424, in minimize
name=name)
File "C:\Users\doerig\AppData\Local\Continuum\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_optimizer.py", line 113, in apply_gradients
summed_grads_and_vars.append((tpu_ops.cross_replica_sum(grad), var))
AttributeError: module 'tensorflow.contrib.tpu.python.ops.tpu_ops' has no attribute 'cross_replica_sum'
Any ideas to solve this problem? Thank you in advance!
I suspect this is either a bug in the version of TensorFlow you are using + Windows, or else an issue with your build of TensorFlow.
For example, when I chase down the file tensorflow\contrib\tpu\python\tpu\tpu_optimizer.py in the TF 1.4 branch, I see that tpu_ops is imported as:
from tensorflow.contrib.tpu.python.ops import tpu_ops
and if you chase that to the relevant file, you see:
if platform.system() != "Windows":
# pylint: disable=wildcard-import,unused-import,g-import-not-at-top
from tensorflow.contrib.tpu.ops.gen_tpu_ops import *
from tensorflow.contrib.util import loader
from tensorflow.python.platform import resource_loader
# pylint: enable=wildcard-import,unused-import,g-import-not-at-top
_tpu_ops = loader.load_op_library(
resource_loader.get_path_to_datafile("_tpu_ops.so"))
else:
# We have already built the appropriate libraries into the binary via CMake
# if we have built contrib, so we don't need this
pass
Following up with the other TF branches that existed at the time of this posting, we see similar comments in 1.5, in 1.6, in 1.7, in 1.8, and in 1.9.
I strongly suspect this would not occur under Linux, but I might test this later and edit this answer.

tensorflow - ValueError: The shape for decoder/while/Merge_12:0 is not an invariant for the loop

I use tf.contrib.seq2seq.dynamic_decode for decoder training
prediction, final_decoder_state, _ = dynamic_decode(
custom_decoder
)
with custom decoder
custom_decoder = CustomDecoder(decoder_cell, helper, decoder_init_state)
and helper
helper = CustomTrainingHelper(batch_size, targets, stop_targets,
num_outs, outputs_per_step, 1.0, False)
And dynamic_decoder raises error
Traceback (most recent call last):
File "E:/tasks/text_to_speech/tts/tf_seq2seq.py", line 95, in <module>
custom_decoder
File "C:\Users\User\Anaconda3\lib\site-packages\tensorflow\contrib\seq2seq\python\ops\decoder.py", line 304, in dynamic_decode
swap_memory=swap_memory)
File "C:\Users\User\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3224, in while_loop
result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "C:\Users\User\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2956, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "C:\Users\User\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2930, in _BuildLoop
next_vars.append(_AddNextAndBackEdge(m, v))
File "C:\Users\User\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 688, in _AddNextAndBackEdge
_EnforceShapeInvariant(m, v)
File "C:\Users\User\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 632, in _EnforceShapeInvariant
(merge_var.name, m_shape, n_shape))
ValueError: The shape for decoder/while/Merge_12:0 is not an invariant for the loop. It enters the loop with shape (10, 1), but has shape (?, 1) after one iteration. Provide shape invariants using either the `shape_invariants` argument of tf.while_loop or set_shape() on the loop variables.
batch_size is equal to 10. As I understand the issue is in tf.while_loop and batch_size. In what way it is possible to fix this error? Thanks in advance.
Your provided too little information to say anything specific. Please follow (https://stackoverflow.com/help/mcve) in the future.
In general, this error is telling you the following. By default TensorFlow checks that the variables passed from one iteration of the while loop to the next one don't change shape. In your case, the decoder/while/Merge_12:0 tensor originally had a shape of (10, 1) but after one iteration it became (?, 1) meaning that tensorflow can no longer infer the size of the first dimension.
If you know that the first dimension is really 10, you can use Tensor.set_shape to tell this to TensorFlow.

Computing Edit Distance (feed_dict error)

I've written some code in Tensorflow to compute the edit-distance between one string and a set of strings. I can't figure out the error.
import tensorflow as tf
sess = tf.Session()
# Create input data
test_string = ['foo']
ref_strings = ['food', 'bar']
def create_sparse_vec(word_list):
num_words = len(word_list)
indices = [[xi, 0, yi] for xi,x in enumerate(word_list) for yi,y in enumerate(x)]
chars = list(''.join(word_list))
return(tf.SparseTensor(indices, chars, [num_words,1,1]))
test_string_sparse = create_sparse_vec(test_string*len(ref_strings))
ref_string_sparse = create_sparse_vec(ref_strings)
sess.run(tf.edit_distance(test_string_sparse, ref_string_sparse, normalize=True))
This code works and when run, it produces the output:
array([[ 0.25],
[ 1. ]], dtype=float32)
But when I attempt to do this by feeding the sparse tensors in through sparse placeholders, I get an error.
test_input = tf.sparse_placeholder(dtype=tf.string)
ref_input = tf.sparse_placeholder(dtype=tf.string)
edit_distances = tf.edit_distance(test_input, ref_input, normalize=True)
feed_dict = {test_input: test_string_sparse,
ref_input: ref_string_sparse}
sess.run(edit_distances, feed_dict=feed_dict)
Here is the error traceback:
Traceback (most recent call last):
File "<ipython-input-29-4e06de0b7af3>", line 1, in <module>
sess.run(edit_distances, feed_dict=feed_dict)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 372, in run
run_metadata_ptr)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 597, in _run
for subfeed, subfeed_val in _feed_fn(feed, feed_val):
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 558, in _feed_fn
return feed_fn(feed, feed_val)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 268, in <lambda>
[feed.indices, feed.values, feed.shape], feed_val)),
TypeError: zip argument #2 must support iteration
Any idea what is going on here?
TL;DR: For the return type of create_sparse_vec(), use tf.SparseTensorValue instead of tf.SparseTensor.
The problem here comes from the return type of create_sparse_vec(), which is tf.SparseTensor, and is not understood as a feed value in the call to sess.run().
When you feed a (dense) tf.Tensor, the expected value type is a NumPy array (or certain objects that can be converted to an array). When you feed a tf.SparseTensor, the expected value type is a tf.SparseTensorValue, which is similar to a tf.SparseTensor but its indices, values, and shape properties are NumPy arrays (or certain objects that can be converted to arrays, like the lists in your example.
The following code should work:
def create_sparse_vec(word_list):
num_words = len(word_list)
indices = [[xi, 0, yi] for xi,x in enumerate(word_list) for yi,y in enumerate(x)]
chars = list(''.join(word_list))
return tf.SparseTensorValue(indices, chars, [num_words,1,1])