Unimplemented Error: TensorArray has size zero - tensorflow

I am getting an error when training a sequence-to-sequence model in TensorFlow. The model is a video captioning system. I have encoded the video frames as sequence features of a SequenceExample proto. After I prefetch the feature containing the list of JPEG-encoded strings, I decode them using the following call:
video = tf.map_fn(lambda x: tf.image.decode_jpeg(x, channels=3), encoded_video, dtype=tf.uint8)
The graph builds, but during training I get the following error, which is caused by this line. The error says that the TensorArray has size zero, although here it should not be empty. Any help is appreciated:
tensorflow.python.framework.errors_impl.UnimplementedError: TensorArray has size zero, but element shape [?,?,3] is not fully defined. Currently only static shapes are supported when packing zero-size TensorArrays.
[[Node: input_fn/decode/map/TensorArrayStack/TensorArrayGatherV3 = TensorArrayGatherV3[_class=["loc:#input_fn/decode/map/TensorArray_1"], dtype=DT_UINT8, element_shape=[?,?,3], _device="/job:localhost/replica:0/task:0/cpu:0"](input_fn/decode/map/TensorArray_1, input_fn/decode/map/TensorArrayStack/range, input_fn/decode/map/while/Exit_1/_479)]]
Caused by op u'input_fn/decode/map/TensorArrayStack/TensorArrayGatherV3', defined at:
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/ASLNet/seq2seq/bin/train.py", line 277, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/ubuntu/ASLNet/seq2seq/bin/train.py", line 272, in main
schedule=FLAGS.schedule)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 111, in run
return _execute_schedule(experiment, schedule)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 46, in _execute_schedule
return task()
File "seq2seq/contrib/experiment.py", line 104, in continuous_train_and_eval
monitors=self._train_monitors)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.py", line 281, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 430, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 925, in _train_model
features, labels = input_fn()
File "seq2seq/training/utils.py", line 274, in input_fn
frame_format="jpeg")
File "seq2seq/training/utils.py", line 365, in process_video
video = tf.map_fn(lambda x: tf.image.decode_jpeg(x, channels=3), encoded_video, dtype=tf.uint8)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/functional_ops.py", line 390, in map_fn
results_flat = [r.stack() for r in r_a]
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/tensor_array_ops.py", line 301, in stack
return self.gather(math_ops.range(0, self.size()), name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/tensor_array_ops.py", line 328, in gather
element_shape=element_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2244, in _tensor_array_gather_v3
element_shape=element_shape, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
self._traceback = _extract_stack()
UnimplementedError (see above for traceback): TensorArray has size zero, but element shape [?,?,3] is not fully defined. Currently only static shapes are supported when packing zero-size TensorArrays.
[[Node: input_fn/decode/map/TensorArrayStack/TensorArrayGatherV3 = TensorArrayGatherV3[_class=["loc:#input_fn/decode/map/TensorArray_1"], dtype=DT_UINT8, element_shape=[?,?,3], _device="/job:localhost/replica:0/task:0/cpu:0"](input_fn/decode/map/TensorArray_1, input_fn/decode/map/TensorArrayStack/range, input_fn/decode/map/while/Exit_1/_479)]]

Fixed. I followed the suggestion from the question "tensorflow map_fn TensorArray has inconsistent shapes" and implemented the following:
with tf.name_scope("decode", values=[encoded_video]):
    input_jpeg_strings = tf.TensorArray(tf.string, video_length)
    input_jpeg_strings = input_jpeg_strings.unstack(encoded_video)
    init_array = tf.TensorArray(tf.float32, size=video_length)
    def cond(i, ta):
        return tf.less(i, video_length)
    def body(i, ta):
        image = input_jpeg_strings.read(i)
        image = tf.image.decode_jpeg(image, 3, name='decode_image')
        image = tf.image.convert_image_dtype(image, dtype=tf.float32)
        assert (resize_height > 0) == (resize_width > 0)
        image = tf.image.resize_images(image, size=[resize_height, resize_width], method=tf.image.ResizeMethod.BILINEAR)
        return i + 1, ta.write(i, image)
    _, input_image = tf.while_loop(cond, body, [0, init_array])

Related

How to finalize a model in Keras

I have a straightforward model developed in Keras:
....
model = Model(input, output)
model.compile(optimizer='adam', loss='categorical_crossentropy')
graph = tf.compat.v1.get_default_graph()
graph.finalize()
history = model.fit(X, y, epochs=30)
Since I'm dealing with memory leaks, it seemed like a good idea to finalize the graph to guard against them. But when I do, fit raises RuntimeError: Graph is finalized and cannot be modified:
Traceback (most recent call last):
File "./train.py", line 43, in <module>
history = model.fit(X, y, epochs=30)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 780, in fit
steps_name='steps_per_epoch')
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py", line 157, in model_iteration
f = _make_execution_function(model, mode)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py", line 532, in _make_execution_function
return model._make_execution_function(mode)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 2276, in _make_execution_function
self._make_train_function()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 2212, in _make_train_function
if not isinstance(K.symbolic_learning_phase(), int):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py", line 299, in symbolic_learning_phase
False, shape=(), name='keras_learning_phase')
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 2159, in placeholder_with_default
return gen_array_ops.placeholder_with_default(input, shape, name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 6406, in placeholder_with_default
"PlaceholderWithDefault", input=input, shape=shape, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 527, in _apply_op_helper
preferred_dtype=default_dtype)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1224, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py", line 305, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py", line 246, in constant
allow_broadcast=True)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py", line 290, in _constant_impl
name=name).outputs[0]
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3588, in create_op
self._check_not_finalized()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3225, in _check_not_finalized
raise RuntimeError("Graph is finalized and cannot be modified.")
RuntimeError: Graph is finalized and cannot be modified.
There's nothing custom in this model; all the layers are from the Keras library. I'm using TensorFlow 1.14 and the Keras that comes with it (tensorflow.keras).
My question is, what are my options here? How can I pinpoint what is modifying the graph? Or am I finalizing the graph the wrong way?
[UPDATE]
To make sure the problem is not in my setup or model, I followed the example in the TensorFlow docs (via its Colab link) and only added the two lines that finalize the graph just before calling fit. I got the same error. So my question stands: how do you finalize a model in Keras?
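One reading of the traceback: fit goes through _make_execution_function and _make_train_function and only then creates the keras_learning_phase placeholder, so part of the training graph seems to be built lazily inside fit, after graph.finalize() has already run. The toy classes below are hypothetical stand-ins (plain Python, not TensorFlow or Keras) that reproduce only this ordering problem; the names ToyGraph, ToyModel and make_train_function are made up for illustration.

```python
class ToyGraph:
    """Minimal stand-in for a graph that can be frozen."""

    def __init__(self):
        self.ops = []
        self.finalized = False

    def add_op(self, name):
        # Mirrors the check seen at the bottom of the traceback.
        if self.finalized:
            raise RuntimeError("Graph is finalized and cannot be modified.")
        self.ops.append(name)

    def finalize(self):
        self.finalized = True


class ToyModel:
    """Stand-in for a model that builds its train function lazily."""

    def __init__(self, graph):
        self.graph = graph
        self.train_function = None

    def compile(self):
        self.graph.add_op("loss")

    def make_train_function(self):
        # The first call adds ops; later calls reuse the cached function.
        if self.train_function is None:
            self.graph.add_op("keras_learning_phase")
            self.train_function = lambda: "trained"

    def fit(self):
        self.make_train_function()
        return self.train_function()


# Finalizing right after compile() fails, as in the question.
graph = ToyGraph()
model = ToyModel(graph)
model.compile()
graph.finalize()
try:
    model.fit()
except RuntimeError as err:
    print(err)

# Building the train function before finalizing works in this toy.
graph = ToyGraph()
model = ToyModel(graph)
model.compile()
model.make_train_function()
graph.finalize()
print(model.fit())
```

Whether forcing the build early is enough in TensorFlow 1.14 is not established here. The traceback only shows that at least one op is created during fit, which is what a graph finalized before fit will reject.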

Save Model for Serving but "ValueError: Both labels and logits must be provided." when trying to export model

I wanted to save a model to do some predictions on specific pictures. Here is my serving function:
def _serving_input_receiver_fn():
    # Note: only handles one image at a time
    feat = tf.placeholder(tf.float32, shape=[None, 120, 50, 1])
    return tf.estimator.export.TensorServingInputReceiver(features=feat, receiver_tensors=feat)
and here is where I export the model:
export_dir_base = os.path.join(FLAGS.model_dir, 'export')
export_dir = estimator.export_savedmodel(
export_dir_base, _serving_input_receiver_fn)
But I get the following error:
ValueError: Both labels and logits must be provided.
I don't understand this error: the serving input function should only create a placeholder, so that I can later feed images through it to get predictions from the saved model.
Here is the whole traceback:
Traceback (most recent call last):
File "/home/cezary/models/official/mnist/mnist_tpu.py", line 222, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/cezary/models/official/mnist/mnist_tpu.py", line 206, in main
export_dir_base, _serving_input_receiver_fn)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 650, in export_savedmodel
mode=model_fn_lib.ModeKeys.PREDICT)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 703, in _export_saved_model_for_mode
strip_default_attrs=strip_default_attrs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 811, in _export_all_saved_models
mode=model_fn_lib.ModeKeys.PREDICT)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1971, in _add_meta_graph_for_mode
mode=mode)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 879, in _add_meta_graph_for_mode
config=self.config)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1992, in _call_model_fn
features, labels, mode, config)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1107, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2203, in _model_fn
features, labels, is_export_mode=is_export_mode)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1131, in call_without_tpu
return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1337, in _call_model_fn
estimator_spec = self._model_fn(features=features, **kwargs)
File "/home/cezary/models/official/mnist/mnist_tpu.py", line 95, in model_fn
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_impl.py", line 156, in sigmoid_cross_entropy_with_logits
labels, logits)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1777, in _ensure_xent_args
raise ValueError("Both labels and logits must be provided.")
ValueError: Both labels and logits must be provided.
Never mind the mnist naming; I reused the structure of that code and didn't rename it.
Thanks for any help!
(I can't comment with a brand new account.) I was able to replicate your error by setting features and receiver_tensors to the same value, but I don't think that your _serving_input_receiver_fn is implemented correctly. Can you follow the example here?
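The traceback in the question points at something else worth checking: export_savedmodel calls model_fn with mode=ModeKeys.PREDICT, and line 95 of model_fn still calls tf.nn.sigmoid_cross_entropy_with_logits, which raises when labels is None. A common pattern is to return the predictions before any loss code runs. The sketch below is a framework-free stand-in, assuming plain Python lists instead of tensors, string mode names, and a made-up squared-error loss; in a real model_fn the early return would be an estimator spec carrying only predictions (for example tf.estimator.EstimatorSpec(mode, predictions=...)).

```python
def model_fn(features, labels, mode):
    """Toy model_fn: 'features' and 'labels' are plain lists of floats."""
    # Stand-in for the network: every mode needs the logits.
    logits = [2.0 * x for x in features]

    # Export and prediction pass labels=None, so return before any loss code.
    if mode == "predict":
        return {"predictions": logits}

    if labels is None:
        raise ValueError("Both labels and logits must be provided.")

    # Stand-in loss (mean squared error), only reached when labels exist.
    loss = sum((p - y) ** 2 for p, y in zip(logits, labels)) / len(labels)
    return {"predictions": logits, "loss": loss}


# Prediction works without labels; training still requires them.
print(model_fn([1.0, 2.0], None, "predict"))
print(model_fn([1.0, 2.0], [2.0, 4.0], "train"))
```

With the early return in place, the serving input function only has to supply features, which matches what the question expected the export to do.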

no kernel image is available for execution on the device

I am training Mask R-CNN. With tf-1.2 it trains, but with tf-1.5 training fails.
The error is as follows:
Caused by op u'pyramid_1/AssignGTBoxes/Where_6', defined at:
File "/home/zhouzd2/letrain/applications/letrain.py", line 349, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "/home/zhouzd2/letrain/applications/letrain.py", line 346, in main
LeTrain().model_train(user_mode)
File "/home/zhouzd2/letrain/platform/base_train.py", line 1228, in model_train
cluster=self.cluster_spec)
File "/home/zhouzd2/letrain/platform/deployment/model_deploy.py", line 226, in create_clones
outputs, feed_ops,verify_model_loss = model_fn(*args, **kwargs)
File "/home/zhouzd2/letrain/platform/base_train.py", line 1195, in clone_fn
model_loss, end_points, feed_ops = network_fn(data_direct, data_batch, int_network_fn)
File "/home/zhouzd2/letrain/applications/letrain.py", line 214, in get_loss
FLAGS.batch_size)
File "/home/zhouzd2/letrain/applications/fmrcnn/get_fmrcnn_loss.py", line 23, in model_fn
loss_weights=[0.2, 0.2, 1.0, 0.2, 1.0])
File "/home/zhouzd2/letrain/applications/fmrcnn/libs/nets/pyramid_network.py", line 580, in build
is_training=is_training, gt_boxes=gt_boxes)
File "/home/zhouzd2/letrain/applications/fmrcnn/libs/nets/pyramid_network.py", line 263, in build_heads
assign_boxes(rois, [rois, batch_inds], [2, 3, 4, 5])
File "/home/zhouzd2/letrain/applications/fmrcnn/libs/layers/wrapper.py", line 173, in assign_boxes
inds = tf.where(tf.equal(assigned_layers, l))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 2538, in where
return gen_array_ops.where(condition=condition, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 6087, in where
"Where", input=condition, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InternalError (see above for traceback): WhereOp: Could not launch cub::DeviceReduce::Sum to count number of true / nonzero indices. temp_storage_bytes: 1, status: no kernel image is available for execution on the device
[[Node: pyramid_1/AssignGTBoxes/Where_6 = Where[T=DT_BOOL, _device="/job:worker/replica:0/task:0/device:GPU:0"](pyramid_1/AssignGTBoxes/Equal_6_S9493)]]
[[Node: pyramid_1/AssignGTBoxes/Reshape_8_G1028 = _Recv[client_terminated=false, recv_device="/job:worker/replica:0/task:0/device:CPU:0", send_device="/job:worker/replica:0/task:0/device:GPU:0", send_device_incarnation=5407481677180697062, tensor_name="edge_1349_pyramid_1/AssignGTBoxes/Reshape_8", tensor_type=DT_INT64, _device="/job:worker/replica:0/task:0/device:CPU:0"]()]]
The graph loads without problems; the error is only reported in sess.run().
Does anyone know how to solve this problem? Or does anyone know what function can replace tf.where?
Thank you!
If you are using Visual Studio:
Right click on the project > Properties > Cuda C/C++ > Device
and add the following to the Code Generation field
compute_30,sm_30;compute_35,sm_35;compute_37,sm_37;compute_50,sm_50;compute_52,sm_52;compute_60,sm_60;compute_61,sm_61;compute_70,sm_70;compute_75,sm_75;
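"No kernel image is available" generally means the binary contains no GPU code compiled for the compute capability of the device it runs on, which is why the list above enumerates one compute_XX,sm_XX pair per architecture. As a small sketch, the helper below builds such a list from compute capability strings; the function name code_generation_field is made up for illustration, and which capabilities to include depends on your GPU and CUDA toolkit.

```python
def code_generation_field(capabilities):
    """Build a 'compute_XX,sm_XX;' list from capabilities such as '6.1'."""
    parts = []
    for capability in capabilities:
        digits = capability.replace(".", "")  # '6.1' -> '61'
        parts.append("compute_{0},sm_{0}".format(digits))
    # The field in the answer ends with a trailing semicolon, so keep it.
    return ";".join(parts) + ";"


print(code_generation_field(["6.1", "7.0", "7.5"]))
```

Listing only the architectures you actually deploy on keeps build time and binary size down compared with enumerating every capability.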

Error in 'ValidationMonitor' when passing 'metrics' parameter

I'm using the following code to log accuracy as the validation measure (TensorFlow 0.10):
validation_metrics = {"accuracy": tf.contrib.metrics.streaming_accuracy}
validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
    input_fn=input_fn_eval,
    every_n_steps=FLAGS.eval_every,
    # metrics=validation_metrics,
    early_stopping_rounds=500,
    early_stopping_metric="loss",
    early_stopping_metric_minimize=True)
When running, every 'every_n_steps' steps I see lines like the following in the output:
INFO:tensorflow:Validation (step 1000): loss = 1.04875, global_step = 900
The problem is that when I uncomment the 'metrics=validation_metrics' parameter in the code above, I get the following error in the validation phase:
INFO:tensorflow:Error reported to Coordinator: <type 'exceptions.TypeError'>, Input 'y' of 'Equal' Op has type int64 that does not match type float32 of argument 'x'.
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
[[Node: read_batch_features_train/file_name_queue/file_name_queue_EnqueueMany = QueueEnqueueMany[Tcomponents=[DT_STRING], _class=["loc:#read_batch_features_train/file_name_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_train/file_name_queue, read_batch_features_train/file_name_queue/RandomShuffle)]]
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
[[Node: read_batch_features_train/random_shuffle_queue_EnqueueMany = QueueEnqueueMany[Tcomponents=[DT_STRING, DT_STRING], _class=["loc:#read_batch_features_train/random_shuffle_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_train/random_shuffle_queue, read_batch_features_train/read/ReaderReadUpTo, read_batch_features_train/read/ReaderReadUpTo:1)]]
Traceback (most recent call last):
File "udc_train.py", line 74, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "udc_train.py", line 70, in main
estimator.fit(input_fn=input_fn_train, steps=None, monitors=[validation_monitor])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 240, in fit
max_steps=max_steps)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 578, in _train_model
max_steps=max_steps)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 280, in _supervised_train
None)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/supervised_session.py", line 270, in run
run_metadata=run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/recoverable_session.py", line 54, in run
run_metadata=run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/coordinated_session.py", line 70, in run
self._coord.join(self._coordinated_threads_to_join)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 357, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/coordinated_session.py", line 66, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 107, in run
induce_stop = monitor.step_end(monitors_step, monitor_outputs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 396, in step_end
return self.every_n_step_end(step, output)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 687, in every_n_step_end
steps=self.eval_steps, metrics=self.metrics, name=self.name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 356, in evaluate
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 630, in _evaluate_model
eval_dict = self._get_eval_ops(features, targets, metrics)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 877, in _get_eval_ops
result[name] = metric(predictions, targets)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/metrics/python/ops/metric_ops.py", line 432, in streaming_accuracy
is_correct = math_ops.to_float(math_ops.equal(predictions, labels))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 708, in equal
result = _op_def_lib.apply_op("Equal", x=x, y=y, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 468, in apply_op
inferred_from[input_arg.type_attr]))
TypeError: Input 'y' of 'Equal' Op has type int64 that does not match type float32 of argument 'x'.
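This looks like a type mismatch between your input_fn and your estimator: according to the traceback, streaming_accuracy compares predictions of type float32 with labels of type int64, and the Equal op requires both to have the same type.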

Tensorflow: Tried SVHN by editing label 0-9, still not working

I am trying to classify the SVHN data set by following this tutorial:
https://www.tensorflow.org/versions/0.6.0/tutorials/deep_cnn/index.html
I am using the train_32x32.mat file. In order to use it with the CNN code mentioned above, I converted this .mat file to several .bin files using this simple code:
import numpy as np
import scipy.io
from array import array
read_input = scipy.io.loadmat('data/train_32x32.mat')
j=0
output_file = open('data_batch_%d.bin' % j, 'ab')
for i in range(0, 64000):
    # create new bin file
    if i>0 and i % 12800 == 0:
        output_file.close()
        j=j+1
        output_file = open('data_batch_%d.bin' % j, 'ab')
    # Write to bin file
    if read_input['y'][i] == 10:
        read_input['y'][i] = 0
    read_input['y'][i].astype('uint8').tofile(output_file)
    read_input['X'][:,:,:,i].astype('uint32').tofile(output_file)
output_file.close()
But when I tried to classify SVHN using these customized .bin files, I got stuck on the error "Invalid argument: Indices are not valid (out of bounds)" shown below:
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 4
W tensorflow/core/common_runtime/executor.cc:1027] 0x1a53160 Compute status: Invalid argument: Indices are not valid (out of bounds). Shape: dim { size: 128 } dim { size: 10 }
[[Node: SparseToDense = SparseToDense[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](concat, SparseToDense/output_shape, SparseToDense/sparse_values, SparseToDense/default_value)]]
Traceback (most recent call last):
File "cifar10_train.py", line 138, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/default/_app.py", line 11, in run
sys.exit(main(sys.argv))
File "cifar10_train.py", line 134, in main
train()
File "cifar10_train.py", line 104, in train
_, loss_value = sess.run([train_op, loss])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 345, in run
results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 419, in _do_run
e.code)
tensorflow.python.framework.errors.InvalidArgumentError: Indices are not valid (out of bounds). Shape: dim { size: 128 } dim { size: 10 }
[[Node: SparseToDense = SparseToDense[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](concat, SparseToDense/output_shape, SparseToDense/sparse_values, SparseToDense/default_value)]]
Caused by op u'SparseToDense', defined at:
File "cifar10_train.py", line 138, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/default/_app.py", line 11, in run
sys.exit(main(sys.argv))
File "cifar10_train.py", line 134, in main
train()
File "cifar10_train.py", line 76, in train
loss = cifar10.loss(logits, labels)
File "/home/sarah/Documents/SVHN/cifar10.py", line 364, in loss
dense_labels = tf.sparse_to_dense(concated,[FLAGS.batch_size, NUM_CLASSES],1.0, 0.0)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_sparse_ops.py", line 153, in sparse_to_dense
default_value=default_value, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 988, in __init__
self._traceback = _extract_stack()
I found a similar issue on Stack Overflow, "TensorFlow CIFAR10 Example". But even after changing the labels, it still doesn't work.
Please let me know if I did something wrong or misunderstood the logic.
Thanks
Sarah
Something was wrong with my installed version of TensorFlow (it might be a bug). Upgrading to a newer version solved the issue.
Thanks
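Independently of the upgrade, one thing worth checking in the conversion script from the question is the record size. The CIFAR-10 binary format stores each record as 1 label byte followed by 32*32*3 = 3072 uint8 pixel bytes, 3073 bytes in total, whereas writing pixels with astype('uint32') produces 12289 bytes per record. A reader that expects the CIFAR-10 layout would then take pixel bytes for labels, which could produce out-of-range indices like the ones in the error. The sketch below assumes the tutorial's reader expects that layout; the function name count_records and the constants are made up for illustration.

```python
import os

LABEL_BYTES = 1
IMAGE_BYTES = 32 * 32 * 3                 # one uint8 per channel value
RECORD_BYTES = LABEL_BYTES + IMAGE_BYTES  # 3073 bytes per record


def count_records(path, num_classes=10):
    """Return the number of records, or raise if the layout looks wrong."""
    size = os.path.getsize(path)
    if size % RECORD_BYTES != 0:
        raise ValueError("%s: size %d is not a multiple of %d bytes"
                         % (path, size, RECORD_BYTES))
    count = size // RECORD_BYTES
    with open(path, "rb") as handle:
        for index in range(count):
            record = bytearray(handle.read(RECORD_BYTES))
            label = record[0]  # the first byte of each record is the label
            if label >= num_classes:
                raise ValueError("%s: record %d has label %d"
                                 % (path, index, label))
    return count
```

Running count_records('data_batch_0.bin') on a converted file gives a quick signal before training. Pixel order is a separate question that this check does not cover.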