Multi-label classification error based on tf.slim (TensorFlow)

When building the TFRecord files, I encoded the labels as N-hot vectors, so I commented out the following code:
labels = slim.one_hot_encoding(
    labels, dataset.num_classes - FLAGS.labels_offset)
But it crashes during training:
Traceback (most recent call last):
File "../train_image_classifier_multilabel.py", line 620, in <module>
tf.app.run()
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "../train_image_classifier_multilabel.py", line 519, in main
clones = model_deploy.create_clones(deploy_config, clone_fn, [batch_queue])
File "/root/Projects/scene-tf/slim/deployment/model_deploy.py", line 193, in create_clones
outputs = model_fn(*args, **kwargs)
File "../train_image_classifier_multilabel.py", line 513, in clone_fn
label_smoothing=FLAGS.label_smoothing, weights=1.0)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/losses/losses_impl.py", line 676, in sigmoid_cross_entropy
logits.get_shape().assert_is_compatible_with(multi_class_labels.get_shape())
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 764, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (32, 63) and (32,) are incompatible
The problem is in the call to sigmoid_cross_entropy:
tf.losses.sigmoid_cross_entropy(
    logits=logits, multi_class_labels=labels,
    label_smoothing=FLAGS.label_smoothing, weights=1.0)
Obviously, the shape of logits is (32, 63) while labels is (32,).
It seems that simply commenting out one_hot_encoding was the wrong thing to do, but I don't know how to fix it.
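For reference, tf.losses.sigmoid_cross_entropy requires multi_class_labels to have exactly the same shape as logits. A minimal standalone sketch of a shape-compatible call (the batch size of 32 and the 63 classes are taken from the error message; the random tensors are only stand-ins for the real pipeline):
import tensorflow as tf

batch_size, num_classes = 32, 63
logits = tf.random_normal([batch_size, num_classes])
# The labels must already be a dense N-hot float tensor of shape [batch_size, num_classes].
labels = tf.cast(
    tf.random_uniform([batch_size, num_classes], maxval=2, dtype=tf.int32),
    tf.float32)
loss = tf.losses.sigmoid_cross_entropy(
    multi_class_labels=labels, logits=logits, weights=1.0)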

Related

Custom metric: Using scikit learn's AucRoc Calculator with tf.keras

I'm training a 14-class multilabel classifier using tf.keras and Horovod. AUC-ROC is used as the metric to evaluate the performance of the classifier, and I want to be able to use scikit-learn's AUC-ROC calculator as mentioned here: How to compute Receiving Operating Characteristic (ROC) and AUC in keras?. If I feed the tensors as-is to the following function:
def sci_auc_roc(y_true, y_pred):
    return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)
I get an error that looks like this:
/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:265: UserWarning: The output shape of `ResNet50(include_top=False)` has been changed since Keras 2.2.0.
warnings.warn('The output shape of `ResNet50(include_top=False)` '
Traceback (most recent call last):
File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
main()
File "official_resnet_tf_1.12.0_auc.py", line 420, in main
model = chexnet_model(FLAGS)
File "official_resnet_tf_1.12.0_auc.py", line 375, in chexnet_model
metrics=[tf_auc_roc,sci_auc_roc])
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py", line 474, in _method_wrapper
method(self, *args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 648, in compile
sample_weights=self.sample_weights)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 313, in _handle_metrics
output, output_mask))
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 270, in _handle_per_output_metrics
y_true, y_pred, weights=weights, mask=mask)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py", line 598, in weighted
score_array = fn(y_true, y_pred)
File "official_resnet_tf_1.12.0_auc.py", line 327, in sci_auc_roc
return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 349, in roc_auc_score
y_type = type_of_target(y_true)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/sklearn/utils/multiclass.py", line 243, in type_of_target
'got %r' % y)
ValueError: Expected array-like (array or non-string sequence), got <tf.Tensor 'dense_target:0' shape=(?, ?) dtype=float32>
I'm trying to convert the tf tensors into NumPy arrays and then feed them to the roc_auc_score method, like so:
def sci_auc_roc(y_true, y_pred):
    with tf.Session() as sess:
        y_true, y_pred = sess.run([y_true, y_pred])
    return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)
I get the following error:
warnings.warn('The output shape of `ResNet50(include_top=False)` '
Traceback (most recent call last):
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
[[{{node input_1}} = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
main()
File "official_resnet_tf_1.12.0_auc.py", line 420, in main
model = chexnet_model(FLAGS)
File "official_resnet_tf_1.12.0_auc.py", line 375, in chexnet_model
metrics=[tf_auc_roc,sci_auc_roc])
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py", line 474, in _method_wrapper
method(self, *args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 648, in compile
sample_weights=self.sample_weights)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 313, in _handle_metrics
output, output_mask))
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 270, in _handle_per_output_metrics
y_true, y_pred, weights=weights, mask=mask)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py", line 598, in weighted
score_array = fn(y_true, y_pred)
File "official_resnet_tf_1.12.0_auc.py", line 324, in sci_auc_roc
y_true, y_pred = sess.run([y_true, y_pred])
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
[[node input_1 (defined at /mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:214) = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'input_1', defined at:
File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
main()
File "official_resnet_tf_1.12.0_auc.py", line 420, in main
model = chexnet_model(FLAGS)
File "official_resnet_tf_1.12.0_auc.py", line 339, in chexnet_model
input_shape=(FLAGS.image_size, FLAGS.image_size, 3))
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/applications/__init__.py", line 70, in wrapper
return base_fun(*args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/applications/resnet50.py", line 32, in ResNet50
return resnet50.ResNet50(*args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py", line 214, in ResNet50
img_input = layers.Input(shape=input_shape)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_layer.py", line 229, in Input
input_tensor=tensor)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_layer.py", line 112, in __init__
name=self.name)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1747, in placeholder
return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 5206, in placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
[[node input_1 (defined at /mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:214) = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[52342,1],0]
Exit code: 1
--------------------------------------------------------------------------
I've also tried TensorFlow's tf.metrics.auc (https://www.tensorflow.org/api_docs/python/tf/metrics/auc), like so:
def tf_auc_roc(y_true, y_pred):
    auc = tf.metrics.auc(y_true, y_pred)[1]
    K.get_session().run(tf.local_variables_initializer())
    return auc
It works just fine, but it gives me a single number for the AUC-ROC. I wonder what that number represents: is it the average AUC-ROC over all 14 classes, the maximum over the classes, or how does it arrive at a single number?
1216/1216 [==============================] - 413s 340ms/step - loss: 0.1513 - tf_auc_roc: 0.7944 - val_loss: 0.2212 - val_tf_auc_roc: 0.8074
Epoch 2/15
582/1216 [=============>................] - ETA: 3:16 - loss: 0.1459 - tf_auc_roc: 0.8053
1) How do I fix the error with roc_auc_score?
2) What does that single number represent?
I think that the result of a metric should be a single tensor value that represents the average of the results, as described here in the Keras documentation (which I find better than the TensorFlow documentation).
You could instead use a custom callback to achieve your desired result; most probably you will want to compute the score and write it to disk in on_epoch_end.
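A minimal sketch of such a callback, assuming the validation set is available as NumPy arrays (x_val and y_val are illustrative names, not from the question):
from sklearn.metrics import roc_auc_score
from tensorflow.keras.callbacks import Callback

class RocAucCallback(Callback):
    def __init__(self, x_val, y_val):
        super(RocAucCallback, self).__init__()
        self.x_val = x_val
        self.y_val = y_val

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # Score the held-out set with scikit-learn; average='macro' averages the
        # per-class AUCs, while average=None would return all 14 of them.
        y_pred = self.model.predict(self.x_val)
        logs['val_roc_auc'] = roc_auc_score(self.y_val, y_pred, average='macro')
        print('epoch %d  val_roc_auc %.4f' % (epoch + 1, logs['val_roc_auc']))

The callback is then passed to model.fit(..., callbacks=[RocAucCallback(x_val, y_val)]), which keeps the scikit-learn call outside the TensorFlow graph entirely.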

Error while adding Conv1D layer

I am trying to train on data containing sequences of 43 records of 3-dimensional vectors. While trying to add this Conv1D layer:
model = Sequential()
model.add(Conv1D(input_shape=(43, 3),
                 filters=16,
                 kernel_size=4,
                 padding='same'))  # This is line 24 of bcl_model_builder.py
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.5))
I am getting the following error, and I have no clue what went wrong here:
Traceback (most recent call last):
File "/home/shx/programs/pycharm-community-2017.1.3/helpers/pydev/pydevd.py", line 1585, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/shx/programs/pycharm-community-2017.1.3/helpers/pydev/pydevd.py", line 1015, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/shx/programs/pycharm-community-2017.1.3/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/shx/PycharmProjects/FBS/bcl/bcl_train_model.py", line 34, in <module>
model = mb.build_model()
File "/home/shx/PycharmProjects/FBS/bcl/bcl_model_builder.py", line 24, in build_model
padding='same'))
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/keras/models.py", line 442, in add
layer(x)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/keras/engine/topology.py", line 602, in __call__
output = self.call(inputs, **kwargs)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/keras/layers/convolutional.py", line 156, in call
dilation_rate=self.dilation_rate[0])
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 3124, in conv1d
data_format=tf_data_format)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 672, in convolution
op=op)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 338, in with_space_to_batch
return op(input, num_spatial_dims, padding)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 664, in op
name=name)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 116, in _non_atrous_convolution
name=scope)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 2013, in conv1d
data_format=data_format)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 397, in conv2d
data_format=data_format, name=name)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 589, in apply_op
param_name=input_name)
File "/home/shx/pyenvs/finainpy_env1/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 60, in _SatisfiesTypeConstraint
", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'input' has DataType float64 not in list of allowed values: float16, float32
After thinking about the float64 in the error log, I realized that Keras itself was configured that way on my machine, in its keras.json config:
{
    "floatx": "float64",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}
I just had to update the floatx property to float32.
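A quick sketch of the fix (editing the keras.json file is what worked here; setting it programmatically with keras.backend.set_floatx is an equivalent alternative, not something from the original answer):
# Option 1: edit ~/.keras/keras.json and set
#     "floatx": "float32"

# Option 2: set the default float type in code, before any layers are built
from keras import backend as K
K.set_floatx('float32')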

Tensorboard exception with summary.image of shape [-1, 125, 128, 1] of MFCCs

Following this guide, I'm converting a tensor [batch_size, 16000, 1] to an MFCC using the method described in the link:
def gen_spectrogram(wav, sr=16000):
    # A 1024-point STFT with frames of 64 ms and 75% overlap.
    stfts = tf.contrib.signal.stft(wav, frame_length=1024, frame_step=256, fft_length=1024)
    spectrograms = tf.abs(stfts)
    # Warp the linear scale spectrograms into the mel-scale.
    num_spectrogram_bins = stfts.shape[-1].value
    lower_edge_hertz, upper_edge_hertz, num_mel_bins = 80.0, 7600.0, 80
    linear_to_mel_weight_matrix = tf.contrib.signal.linear_to_mel_weight_matrix(
        num_mel_bins, num_spectrogram_bins,
        sr, lower_edge_hertz, upper_edge_hertz)
    mel_spectrograms = tf.tensordot(spectrograms, linear_to_mel_weight_matrix, 1)
    mel_spectrograms.set_shape(
        spectrograms.shape[:-1].concatenate(
            linear_to_mel_weight_matrix.shape[-1:]))
    # Compute a stabilized log to get log-magnitude mel-scale spectrograms.
    log_mel_spectrograms = tf.log(mel_spectrograms + 1e-6)
    # Compute MFCCs from log_mel_spectrograms and take the first 13.
    return tf.contrib.signal.mfccs_from_log_mel_spectrograms(log_mel_spectrograms)[..., :13]
I then reshape the output of that to [batch_size, 125, 128, 1]. If I send that to tf.layers.conv2d, things seem to work fine. However, if I try to pass it to tf.summary.image, I get the following error:
print(spec)
# => Tensor("spectrogram/Reshape:0", shape=(?, 125, 128, 1), dtype=float32)
tf.summary.image('spec', spec)
Caused by op u'spectrogram/stft/rfft', defined at:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/rsilveira/rnd/ml-engine/trainer/flatv1.py", line 103, in <module>
runner.run(model_fn)
File "trainer/runner.py", line 88, in run
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "/Library/Python/2.7/site-packages/tensorflow/python/estimator/training.py", line 432, in train_and_evaluate
executor.run_local()
File "/Library/Python/2.7/site-packages/tensorflow/python/estimator/training.py", line 611, in run_local
hooks=train_hooks)
File "/Library/Python/2.7/site-packages/tensorflow/python/estimator/estimator.py", line 302, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/Library/Python/2.7/site-packages/tensorflow/python/estimator/estimator.py", line 711, in _train_model
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/Library/Python/2.7/site-packages/tensorflow/python/estimator/estimator.py", line 694, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/Users/rsilveira/rnd/ml-engine/trainer/flatv1.py", line 53, in model_fn
spec = gen_spectrogram(x)
File "/Users/rsilveira/rnd/ml-engine/trainer/flatv1.py", line 22, in gen_spectrogram
step,
File "/Library/Python/2.7/site-packages/tensorflow/contrib/signal/python/ops/spectral_ops.py", line 91, in stft
return spectral_ops.rfft(framed_signals, [fft_length])
File "/Library/Python/2.7/site-packages/tensorflow/python/ops/spectral_ops.py", line 136, in _rfft
return fft_fn(input_tensor, fft_length, name)
File "/Library/Python/2.7/site-packages/tensorflow/python/ops/gen_spectral_ops.py", line 619, in rfft
"RFFT", input=input, fft_length=fft_length, name=name)
File "/Library/Python/2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/Library/Python/2.7/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/Library/Python/2.7/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Input dimension 4 must have length of at least 512 but got: 320
Not sure where to start troubleshooting this. What am I missing here?
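One generic way to start, sketched under the assumption that wav and spec are the tensors from the code above (this is only a shape-checking aid, not a fix): print the static shapes feeding the STFT and the image summary, since tf.contrib.signal.stft operates on the innermost axis of its input and tf.summary.image expects a 4-D [batch, height, width, channels] tensor.
print(wav.get_shape())    # the innermost axis is the one the STFT frames
stfts = tf.contrib.signal.stft(wav, frame_length=1024, frame_step=256,
                               fft_length=1024)
print(stfts.get_shape())  # [..., frames, fft_unique_bins]
print(spec.get_shape())   # should be (?, 125, 128, 1) before the summary
tf.summary.image('spec', spec, max_outputs=3)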

Visualizing the inputs to convolutional neural networks using Keras with the TensorFlow backend

There is a detailed post about this on the Keras blog.
But when I run the code I get the following error:
Using TensorFlow backend.
Traceback (most recent call last):
File "visulaize_cifar.py", line 24, in <module>
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/keras/models.py", line 332, in add
output_tensor = layer(self.outputs[0])
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 572, in __call__
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 635, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 166, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/keras/layers/pooling.py", line 160, in call
dim_ordering=self.dim_ordering)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/keras/layers/pooling.py", line 210, in _pooling_function
pool_mode='max')
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2866, in pool2d
x = tf.nn.max_pool(x, pool_size, strides, padding=padding)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 1617, in max_pool
name=name)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1598, in _max_pool
data_format=data_format, name=name)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2242, in create_op
set_shapes_for_outputs(ret)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1617, in set_shapes_for_outputs
shapes = shape_func(op)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1568, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 675, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'MaxPool_1' (op: 'MaxPool') with input shapes: [1,1,64,128].
This error goes away when I set dim_ordering='th'.
But since I am using the TensorFlow backend, the dimension ordering should be dim_ordering='tf'.
Even after setting dim_ordering to 'th', I get an error while loading weights from vgg16_weights.h5, as follows:
Traceback (most recent call last):
File "visulaize_cifar.py", line 67, in <module>
model.layers[k].set_weights(weights)
File "/home/dude_perf3ct/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 985, in set_weights
'provided weight shape ' + str(w.shape))
ValueError: Layer weight shape (3, 3, 128, 64) not compatible with provided weight shape (64, 3, 3, 3).
As detailed in this post about 'th' and 'tf', the above error implies that the layer weights are in 'tf' ordering (but I set it to 'th' to avoid the first error) while the provided weights are in 'th' ordering.
What is going wrong?
The answer to this question was pretty simple. Since I was using TensorFlow as the backend, I inserted the lines
if K.backend() == 'tensorflow':
    K.set_image_dim_ordering("th")
after from keras import backend as K.
This is because vgg16_weights.h5 is in 'th' format, and so is the data from cifar10.load_data().

Adding to vocab during Tensorflow seq2seq run

I am using the TensorFlow seq2seq tutorial to play with machine translation. Say I have trained the model for some time and determine that I want to supplement the original vocab with new words to enhance the quality of the model. Is there a way to pause training, add words to the vocabulary, and then resume training from the most recent checkpoint? I attempted to do so, but when I began training again I got this error:
Traceback (most recent call last):
File "execute.py", line 405, in <module>
train()
File "execute.py", line 127, in train
model = create_model(sess, False)
File "execute.py", line 108, in create_model
model.saver.restore(session, ckpt.model_checkpoint_path)
File "/home/jrthom18/.local/lib/python2.7/site- packages/tensorflow/python/training/saver.py", line 1388, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [384633] rhs shape= [384617]
[[Node: save/Assign_82 = Assign[T=DT_FLOAT, _class=["loc:#proj_b"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](proj_b, save/RestoreV2_82)]]
Caused by op u'save/Assign_82', defined at:
File "execute.py", line 405, in <module>
train()
File "execute.py", line 127, in train
model = create_model(sess, False)
File "execute.py", line 99, in create_model
model = seq2seq_model.Seq2SeqModel( gConfig['enc_vocab_size'], gConfig['dec_vocab_size'], _buckets, gConfig['layer_size'], gConfig['num_layers'], gConfig['max_gradient_norm'], gConfig['batch_size'], gConfig['learning_rate'], gConfig['learning_rate_decay_factor'], forward_only=forward_only)
File "/home/jrthom18/data/3x256_bs32/easy_seq2seq/seq2seq_model.py", line 166, in __init__
self.saver = tf.train.Saver(tf.global_variables(), keep_checkpoint_every_n_hours=2.0)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1000, in __init__
self.build()
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1030, in build
restore_sequentially=self._restore_sequentially)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 624, in build
restore_sequentially, reshape)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 373, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 130, in restore
self.op.get_shape().is_fully_defined())
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
use_locking=use_locking, name=name)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/jrthom18/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [384633] rhs shape= [384617]
[[Node: save/Assign_82 = Assign[T=DT_FLOAT, _class=["loc:#proj_b"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](proj_b, save/RestoreV2_82)]]
Obviously the new vocab is larger and so the tensor sizes do not match. Is there some way around this?
You cannot update your vocab once it is set, but you can always use a shared wordpiece model. It will help you to directly copy out-of-vocab words from the source to the target output.
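As one possible illustration of the shared wordpiece/subword idea (this assumes the sentencepiece library and a hypothetical corpus.txt; it is not part of the seq2seq tutorial): train a fixed-size subword vocabulary once, so later words are segmented into already-known pieces and the checkpoint shapes never have to change.
import sentencepiece as spm

# Train a fixed-size subword vocabulary once, up front.
spm.SentencePieceTrainer.Train(
    '--input=corpus.txt --model_prefix=wp --vocab_size=32000')

sp = spm.SentencePieceProcessor()
sp.Load('wp.model')

# New or rare words decompose into known pieces instead of mapping to UNK.
print(sp.EncodeAsPieces('some previously unseen word'))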