Keras callbacks Tensorboard multi-output Error

Keras callbacks Tensorboard multi-output Error - tensorflow

I can't figure out what is causing this error for multi-output Keras Model when using callbacks.TensorBoard.
tbCallBack = keras.callbacks.TensorBoard(log_dir = logdir, histogram_freq = 1, write_graph = 1, write_images = 0, write_grads = 1)
###No errror when not using callbacks
regr.fit( Ax_train, [Ay_train_p, Ay_train_s], validation_data= (Ax_test, [Ay_test_p, Ay_test_s]), epochs = 500, batch_size = 10, verbose = 1)
###No errror when not using validation_data
regr.fit( Ax_train, [Ay_train_p, Ay_train_s], epochs = 500, batch_size = 10, verbose = 1, callbacks=[tbCallBack])
###Error Occurred
regr.fit( Ax_train, [Ay_train_p, Ay_train_s], validation_data= (Ax_test, [Ay_test_p, Ay_test_s]), epochs = 500, batch_size = 10, verbose = 1, callbacks=[tbCallBack])
Error
Epoch 1/500
1280/1663 [======================>.......] - ETA: 0s - loss: 1.6230 - output_power_loss: 0.9627 - output_slack_loss: 0.66032017-07-11 03:17:27.964542: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1] has negative dimensions
2017-07-11 03:17:27.964589: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1] has negative dimensions
[[Node: output_slack_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
2017-07-11 03:17:27.970690: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1] has negative dimensions
2017-07-11 03:17:27.970735: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1] has negative dimensions
[[Node: output_power_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
2017-07-11 03:17:27.972004: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1] has negative dimensions
2017-07-11 03:17:27.972026: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1] has negative dimensions
[[Node: output_power_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Traceback (most recent call last):
File "tf_keras.py", line 183, in <module>
regr.fit( Ax_train, [Ay_train_p, Ay_train_s], validation_data= (Ax_test, [Ay_test_p, Ay_test_s]), epochs = 500, batch_size = 10, verbose = 1, callbacks=[tbCallBack])
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/keras/engine/training.py", line 1507, in fit
initial_epoch=initial_epoch)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/keras/engine/training.py", line 1176, in _fit_loop
callbacks.on_epoch_end(epoch, epoch_logs)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/keras/callbacks.py", line 77, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/keras/callbacks.py", line 768, in on_epoch_end
result = self.sess.run([self.merged], feed_dict=feed_dict)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions
[[Node: output_power_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op u'output_power_sample_weights', defined at:
File "tf_keras.py", line 180, in <module>
regr = nn_model()
File "tf_keras.py", line 177, in nn_model
model.compile(optimizer = 'adam', loss ={'output_power': 'mean_squared_error', 'output_slack': 'binary_crossentropy'})
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/keras/engine/training.py", line 870, in compile
name=name + '_sample_weights'))
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 431, in placeholder
x = tf.placeholder(dtype, shape=shape, name=name)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder
return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder
name=name)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Shape [-1] has negative dimensions
[[Node: output_power_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
What does "Shape [-1] has negative dimensions" means? I had also tried with each output with callbacks.Tensorboard and there wasn't any error occurred. Also search with "Node: output_power_sample_weights" but no results.
regr.summary()
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
main_input (InputLayer) (None, 5) 0
____________________________________________________________________________________________________
Hidden (Dense) (None, 5) 30 main_input[0][0]
____________________________________________________________________________________________________
output_power (Dense) (None, 1) 6 Hidden[0][0]
____________________________________________________________________________________________________
output_slack (Dense) (None, 1) 6 Hidden[0][0]
====================================================================================================
Total params: 42
Trainable params: 42
Non-trainable params: 0

The "Shape [-1] has negative dimensions" error message is a (misleading) error printed when you don't feed a tf.placeholder() that has a partially-defined shape. This error message has been replaced with a more informative message in the nightly version of TensorFlow.
The error and the stack trace suggests that this Session.run() call is failing (source):
File "/home/roy/keras_tf/local/lib/python2.7/site-packages/keras/callbacks.py", line 768, in on_epoch_end
result = self.sess.run([self.merged], feed_dict=feed_dict)
Looking at the implementation, it appears that this is only called when the TensorBoard callback is installed, and self.validation_data is not None. From the implementation, it looks like this callback is incompatible with models that depend on placeholders that are not one of (i) model.inputs, (ii) model.targets, or (iii) model.sample_weights. It might be best to open an issue on the Keras GitHub issue tracker about this.

Related

tensorflow.python.framework.errors_impl.InvalidArgumentError: Default MaxPoolingOp only supports NHWC on device type CPU

when i use keras VGG16 pre-trained model to get bottleneck feature in tf.dataset.map function, i got this problem
vgg16 = VGG16(include_top=False, input_shape=input_shape)
vgg16.trainable = False
def _parse_function(example_proto):
features = {'img': tf.FixedLenFeature((), tf.string, default_value=""),
'label': tf.FixedLenFeature((), tf.int64, default_value=0)}
parsed_features = tf.parse_single_example(example_proto, features)
img = tf.decode_raw(parsed_features['img'], tf.uint8)
img = tf.reshape(img, (224, 224, 3))
img = tf.cast(img, tf.float32) / 255.0
img = tf.expand_dims(img, 0)
img = vgg16(img)
img = tf.squeeze(img, [0])
label = parsed_features['label']
return img, label
ds.map(_parse_function)
the error print is
2019-06-19 13:47:19.378807: E tensorflow/core/common_runtime/executor.cc:624] Executor failed to create kernel. Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU
[[{{node vgg16/block1_pool/MaxPool}}]]
2019-06-19 13:47:19.379204: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at iterator_ops.cc:660 : Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU
[[{{node vgg16/block1_pool/MaxPool}}]]
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Default MaxPoolingOp only supports NHWC on device type CPU
[[{{node vgg16/block1_pool/MaxPool}}]]
[[{{node MakeIterator}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/tf/image_cls/train.py", line 103, in
steps_per_epoch=steps_per_epoch,
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 776, in fit
shuffle=shuffle)
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 2200, in _standardize_user_data
K.get_session().run(x.initializer)
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/opt/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Default MaxPoolingOp only supports NHWC on device type CPU
[[{{node vgg16/block1_pool/MaxPool}}]]
[[node MakeIterator (defined at /home/tf/image_cls/train.py:103) ]]

Custom metric: Using scikit learn's AucRoc Calculator with tf.keras

I'm training a multilabel classifier using tf.keras and horovod that has 14 classes. AucRoc is used as the metric to evaluate the performance of the classifier. I want to be able to use scikit learn's AucRoc calculator as mentioned here: How to compute Receiving Operating Characteristic (ROC) and AUC in keras?. If I feed the tensors as is for the following function:
def sci_auc_roc(y_true, y_pred):
return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)
I get an error that looks like this:
/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:265: UserWarning: The output shape of `ResNet50(include_top=False)` has been changed since Keras 2.2.0.
warnings.warn('The output shape of `ResNet50(include_top=False)` '
Traceback (most recent call last):
File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
main()
File "official_resnet_tf_1.12.0_auc.py", line 420, in main
model = chexnet_model(FLAGS)
File "official_resnet_tf_1.12.0_auc.py", line 375, in chexnet_model
metrics=[tf_auc_roc,sci_auc_roc])
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py", line 474, in _method_wrapper
method(self, *args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 648, in compile
sample_weights=self.sample_weights)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 313, in _handle_metrics
output, output_mask))
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 270, in _handle_per_output_metrics
y_true, y_pred, weights=weights, mask=mask)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py", line 598, in weighted
score_array = fn(y_true, y_pred)
File "official_resnet_tf_1.12.0_auc.py", line 327, in sci_auc_roc
return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 349, in roc_auc_score
y_type = type_of_target(y_true)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/sklearn/utils/multiclass.py", line 243, in type_of_target
'got %r' % y)
ValueError: Expected array-like (array or non-string sequence), got <tf.Tensor 'dense_target:0' shape=(?, ?) dtype=float32>
I'm trying to convert tf tensors into a numpy array and then feed them to the roc_auc_score method like so:
def sci_auc_roc(y_true, y_pred):
with tf.Session() as sess:
y_true, y_pred = sess.run([y_true, y_pred])
return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)
I get the following error:
warnings.warn('The output shape of `ResNet50(include_top=False)` '
Traceback (most recent call last):
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
[[{{node input_1}} = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
main()
File "official_resnet_tf_1.12.0_auc.py", line 420, in main
model = chexnet_model(FLAGS)
File "official_resnet_tf_1.12.0_auc.py", line 375, in chexnet_model
metrics=[tf_auc_roc,sci_auc_roc])
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py", line 474, in _method_wrapper
method(self, *args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 648, in compile
sample_weights=self.sample_weights)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 313, in _handle_metrics
output, output_mask))
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 270, in _handle_per_output_metrics
y_true, y_pred, weights=weights, mask=mask)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py", line 598, in weighted
score_array = fn(y_true, y_pred)
File "official_resnet_tf_1.12.0_auc.py", line 324, in sci_auc_roc
y_true, y_pred = sess.run([y_true, y_pred])
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
[[node input_1 (defined at /mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:214) = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'input_1', defined at:
File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
main()
File "official_resnet_tf_1.12.0_auc.py", line 420, in main
model = chexnet_model(FLAGS)
File "official_resnet_tf_1.12.0_auc.py", line 339, in chexnet_model
input_shape=(FLAGS.image_size, FLAGS.image_size, 3))
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/applications/__init__.py", line 70, in wrapper
return base_fun(*args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/applications/resnet50.py", line 32, in ResNet50
return resnet50.ResNet50(*args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py", line 214, in ResNet50
img_input = layers.Input(shape=input_shape)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_layer.py", line 229, in Input
input_tensor=tensor)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_layer.py", line 112, in __init__
name=self.name)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1747, in placeholder
return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 5206, in placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
[[node input_1 (defined at /mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:214) = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[52342,1],0]
Exit code: 1
--------------------------------------------------------------------------
I've also tried tensorflow's https://www.tensorflow.org/api_docs/python/tf/metrics/auc like so:
def tf_auc_roc(y_true, y_pred):
auc = tf.metrics.auc(y_true, y_pred)[1]
K.get_session().run(tf.local_variables_initializer())
return auc
It works just fine. However, it gives me a single number for aucroc. I wonder what that number represents, is it an average aucroc value for all the 14 classes? or max aucscores of all the classes? or how does it get to a single number?
1216/1216 [==============================] - 413s 340ms/step - loss: 0.1513 - tf_auc_roc: 0.7944 - val_loss: 0.2212 - val_tf_auc_roc: 0.8074
Epoch 2/15
582/1216 [=============>................] - ETA: 3:16 - loss: 0.1459 - tf_auc_roc: 0.8053
1) How do I fix the error with roc_auc_score?
2) What does that single number represent?

I think that the result of a metric should be a single tensor value that represents the average of the results as described here in the Keras documentation (which I find is the better documentation than that from TensorFlow).
You could instead use a custom callback to achieve your desired result, most probably you would want to write to disc the result on_epoch_end

how to see tensor value of a layer output in keras

I have a Seq2Seq model. I am interested to print out the matrix value of the output of the encoder per iteration.
So for example as the dimension of the matrix in the encoder is (?,20) and the epoch =5 and in each epoch, there are 10 iteration,
I would like to see 10 matrix of the dimension (?,20) per epoch.
I have gone to several links as here but it still does not print out the value matrix.
With this code as mentioned in the aboved link:
import keras.backend as K
k_value = K.print_tensor(encoded)
print(k_value)
I got:
Tensor("Print:0", shape=(?, 20), dtype=float32)
Is there any straightforward way of showing the tensor value of each layer in Keras?
Update 1
by trying this code: K_value = K.eval(encoded) it raises this error:
Traceback (most recent call last):
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
return fn(*args)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input' with dtype float and shape [?,45,50]
[[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,45,50], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: encoder_lstm/add_16/_25 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_460_encoder_lstm/add_16", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sgnbx/Downloads/projects/LSTM_autoencoder/justfun.py", line 121, in <module>
k_value = K.eval(encoded)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 671, in eval
return to_dense(x).eval(session=get_session())
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 680, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 4951, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 877, in run
run_metadata_ptr)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
run_metadata)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input' with dtype float and shape [?,45,50]
[[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,45,50], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: encoder_lstm/add_16/_25 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_460_encoder_lstm/add_16", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'input', defined at:
File "/home/sgnbx/Downloads/projects/LSTM_autoencoder/justfun.py", line 113, in <module>
inputs = Input(shape=(SEQUENCE_LEN, EMBED_SIZE), name="input")
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/keras/engine/input_layer.py", line 177, in Input
input_tensor=tensor)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/keras/engine/input_layer.py", line 86, in __init__
name=self.name)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 515, in placeholder
x = tf.placeholder(dtype, shape=shape, name=name)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1735, in placeholder
return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 4925, in placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
op_def=op_def)
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input' with dtype float and shape [?,45,50]
[[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,45,50], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: encoder_lstm/add_16/_25 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_460_encoder_lstm/add_16", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7fd900525c50>>
Traceback (most recent call last):
File "/home/sgnbx/anaconda3/envs/py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 686, in __del__
TypeError: 'NoneType' object is not callable
Process finished with exit code 1

Very simple way to print a tensor :
from keras import backend as K
k_value = K.eval(tensor)
print(k_value)
UPDATE 1
Create a callback to print at the end of each epoch :
class callback(Callback):
def __init__(self, model, X_train):
self.model = model
self.x = X_train
def on_train_begin(self, logs={}):
return
def on_train_end(self, logs={}):
return
def on_epoch_begin(self, epoch, logs={}):
return
def on_epoch_end(self, epoch, logs={}):
inp = model.input # input placeholder
outputs = model.layers[N].output # get output of N's layer
functors = K.function([inp, K.learning_phase()], [outputs])
layer_outs = functors([self.x, 1.])
print('\r OUTPUT TENSOR : %s' % layer_outs)
return
def on_batch_begin(self, batch, logs={}):
return
def on_batch_end(self, batch, logs={}):
return
Call this function in your fit() method like that :
callbacks=[callback(model = model, X_train = X_train)])
Inspired from Keras, How to get the output of each layer?
Hope this will finally help you !

Keras Multi GPU example gives ResourceExhaustedError

So I try to use multiple GPUs with Keras. When I run training_utils.py with the example program (given as comments inside the training_utils.py code), I end up with ResourceExhaustedError. nvidia-smi tells me that barely one of the four GPUs are working. Using one GPU works fine for other programs.
TensorFlow 1.3.0
Keras 2.0.8
Ubuntu 16.04
CUDA/cuDNN 8.0/6.0
Question: Anyone have any idea whats going on here?
Console output:
(...)
2017-10-26 14:39:02.086838: W tensorflow/core/common_runtime/bfc_allocator.cc:277] ***************************************************************************************************x
2017-10-26 14:39:02.086857: W tensorflow/core/framework/op_kernel.cc:1192] Resource exhausted: OOM when allocating tensor with shape[128,55,55,256]
Traceback (most recent call last):
File "test.py", line 27, in
parallel_model.fit(x, y, epochs=20, batch_size=256)
File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/engine/training.py", line 1631, in fit
validation_steps=validation_steps)
File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/engine/training.py", line 1213, in _fit_loop
outs = f(ins_batch)
File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2331, in call
**self.session_kwargs)
File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[128,55,55,256]
[[Node: replica_1/xception/block3_sepconv2/separable_conv2d = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:1"](replica_1/xception/block3_sepconv2/separable_conv2d/depthwise, block3_sepconv2/pointwise_kernel/read/_2103)]]
[[Node: training/RMSprop/gradients/replica_0/xception/block10_sepconv2/separable_conv2d_grad/Conv2DBackpropFilter/_4511 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_25380_training/RMSprop/gradients/replica_0/xception/block10_sepconv2/separable_conv2d_grad/Conv2DBackpropFilter", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]
Caused by op u'replica_1/xception/block3_sepconv2/separable_conv2d',
defined at: File "test.py", line 19, in
parallel_model = multi_gpu_model(model, gpus=2) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/utils/training_utils.py",
line 143, in multi_gpu_model
outputs = model(inputs) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/engine/topology.py",
line 603, in call
output = self.call(inputs, **kwargs) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/engine/topology.py",
line 2061, in call
output_tensors, _, _ = self.run_internal_graph(inputs, masks) File
"/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/engine/topology.py",
line 2212, in run_internal_graph
output_tensors = _to_list(layer.call(computed_tensor, **kwargs)) File
"/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/layers/convolutional.py",
line 1221, in call
dilation_rate=self.dilation_rate) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py",
line 3279, in separable_conv2d
data_format=tf_data_format) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/nn_impl.py",
line 497, in separable_conv2d
name=name) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py",
line 397, in conv2d
data_format=data_format, name=name) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py",
line 767, in apply_op
op_def=op_def) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py",
line 2630, in create_op
original_op=self._default_original_op, op_def=op_def) File "/home/kyb/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py",
line 1204, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
ResourceExhaustedError (see above for traceback): OOM when allocating
tensor with shape[128,55,55,256] [[Node:
replica_1/xception/block3_sepconv2/separable_conv2d =
Conv2D[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1,
1, 1], use_cudnn_on_gpu=true,
_device="/job:localhost/replica:0/task:0/gpu:1"](replica_1/xception/block3_sepconv2/separable_conv2d/depthwise,
block3_sepconv2/pointwise_kernel/read/_2103)]] [[Node:
training/RMSprop/gradients/replica_0/xception/block10_sepconv2/separable_conv2d_grad/Conv2DBackpropFilter/_4511
= _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0",
send_device="/job:localhost/replica:0/task:0/gpu:0",
send_device_incarnation=1,
tensor_name="edge_25380_training/RMSprop/gradients/replica_0/xception/block10_sepconv2/separable_conv2d_grad/Conv2DBackpropFilter",
tensor_type=DT_FLOAT,
_device="/job:localhost/replica:0/task:0/cpu:0"]]
EDIT (Added example code):
import tensorflow as tf
from keras.applications import Xception
from keras.utils import multi_gpu_model
import numpy as np
num_samples = 1000
height = 224
width = 224
num_classes = 100
with tf.device('/cpu:0'):
model = Xception(weights=None,
input_shape=(height, width, 3),
classes=num_classes)
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(loss='categorical_crossentropy',
optimizer='rmsprop')
x = np.random.random((num_samples, height, width, 3))
y = np.random.random((num_samples, num_classes))
parallel_model.fit(x, y, epochs=20, batch_size=128)

When encountering OOM/ResourceExhaustedError on GPU I believe changing (Reducing) batch size is the right option to try at first.
For different GPU you may need different batch size based on the GPU
memory you have.
Recently I faced the similar type of problem, tweaked a lot to do the different type of experiment.
Here is the link to the question (also some tricks are included).
However, while reducing the size of the batch you may find that your training gets slower.

Using keras with tensorflow "You must feed a value for placeholder tensor 'input_1' with dtype float"

I get the unexpected error "You must feed a value for placeholder tensor 'input_1' with dtype float" when training the discriminator of a GAN
here the error:
W tensorflow/core/framework/op_kernel.cc:975] Invalid argument: You must feed a value for placeholder tensor 'input_1' with dtype float
[[Node: input_1 = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
W tensorflow/core/framework/op_kernel.cc:975] Invalid argument: You must feed a value for placeholder tensor 'input_1' with dtype float
[[Node: input_1 = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Traceback (most recent call last):
File "new_model.py", line 204, in <module>
main()
File "new_model.py", line 201, in main
train(nb_epoch=10, BATCH_SIZE=5)
File "new_model.py", line 176, in train
d_loss = discriminator.train_on_batch(image_to_dis, label_to_dis)
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 766, in train_on_batch
class_weight=class_weight)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1320, in train_on_batch
outputs = self.train_function(ins)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1943, in __call__
feed_dict=feed_dict)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1' with dtype float
[[Node: input_1 = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
[[Node: moments_4/sufficient_statistics/Shape/_217 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1267_moments_4/sufficient_statistics/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op u'input_1', defined at:
File "new_model.py", line 204, in <module>
main()
File "new_model.py", line 201, in main
train(nb_epoch=10, BATCH_SIZE=5)
File "new_model.py", line 134, in train
transformer0 = transform_model()
File "new_model.py", line 22, in transform_model
inputs = Input(shape=( 128, 128, 3))
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1198, in Input
input_tensor=tensor)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1116, in __init__
name=self.name)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 321, in placeholder
x = tf.placeholder(dtype, shape=shape, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1587, in placeholder
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 2043, in _placeholder
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1' with dtype float
[[Node: input_1 = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
[[Node: moments_4/sufficient_statistics/Shape/_217 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1267_moments_4/sufficient_statistics/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
it seems the error happens at
d_loss = discriminator.train_on_batch(image_to_dis, label_to_dis)
I'm sure image_to_dis and label_to_dis fit the input of dicriminator
however, the error message here
Caused by op u'input_1', defined at:
File "new_model.py", line 204, in <module>
main()
File "new_model.py", line 201, in main
train(nb_epoch=10, BATCH_SIZE=5)
File "new_model.py", line 134, in train
transformer0 = transform_model()
File "new_model.py", line 22, in transform_model
inputs = Input(shape=( 128, 128, 3))
it says the error is caused by the input tensor of 'transformer'(it is the generator in this GAN).
my code contains something like 'transformer_with_discriminator = discriminator(transformer)', but the discriminator is compiled without the transformer. I think training the discriminator has nothing to do with the input of 'transformer0'
the whole script is a little long, may I put the link of my model here?
https://github.com/wkcw/keras-face-attribute/blob/master/model%26train.py
image_to_dis.dtype and label_to_dis.dtype are both float32, and I've tried to convert label_to_dis.dtype to int
I really have no idea about this......

It comes from the batchnormalization. You can see here : https://stackoverflow.com/a/42470757/7137636 how to fix this issue.
If you need more info, ask in comments :)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Keras callbacks Tensorboard multi-output Error - tensorflow

Related

tensorflow.python.framework.errors_impl.InvalidArgumentError: Default MaxPoolingOp only supports NHWC on device type CPU

Custom metric: Using scikit learn's AucRoc Calculator with tf.keras

how to see tensor value of a layer output in keras

Keras Multi GPU example gives ResourceExhaustedError

Using keras with tensorflow "You must feed a value for placeholder tensor 'input_1' with dtype float"

Categories

Resources