Related
So as I try to run my code it keeps saying that it can't find an image but it in fact is there. I've tried a lot of things but nothing works. Can someone please help me? This is code is trying to read images from the nist dataset and then i am trying to train a model. Here's my code and error code:
import os
import cv2
import tensorflow as tf
upper_level_dirs = open("/Users/cam/reader/top_level_dirs")
upper_level_dirs = upper_level_dirs.read().split()
print(upper_level_dirs)
file_names = []
labels = []
for folder in upper_level_dirs:
for filename in os.listdir("./train/" + folder + "/"):
file_names.append("./train/" + folder + "/" + filename)
labels.append(folder)
# Use a custom OpenCV function to read the image, instead of the standard
# TensorFlow `tf.read_file()` operation.
def _read_py_function(file_name, label):
image_string = tf.read_file(filename)
image_decoded = tf.image.decode_image(image_string, channels=3)
image_resized = tf.image.resize_image_with_crop_or_pad(image_decoded, 28, 28)
return image_resized, label
# image_decoded = cv2.imread(file_name.decode(), cv2.IMREAD_GRAYSCALE)
# return image_decoded, label
dataset = tf.data.Dataset.from_tensor_slices((file_names, labels))
dataset = dataset.map(_read_py_function)
sess = tf.InteractiveSession()
dataset = dataset.batch(32)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()
for _ in range(100):
sess.run(iterator.initializer)
while True:
try:
sess.run(next_element)
except tf.errors.OutOfRangeError:
break
Errors:
2018-03-27 12:59:58.001455: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-03-27 13:00:01.574038: W tensorflow/core/framework/op_kernel.cc:1278] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: train_7a_02017.png; No such file or directory
2018-03-27 13:00:01.576585: W tensorflow/core/framework/op_kernel.cc:1278] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: train_7a_02017.png; No such file or directory
2018-03-27 13:00:01.578373: W tensorflow/core/framework/op_kernel.cc:1278] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: train_7a_02017.png; No such file or directory
Traceback (most recent call last):
File "/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1330, in _do_call
return fn(*args)
File "/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1315, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1423, in _call_tf_sessionrun
status, run_metadata)
File "/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: train_7a_02017.png; No such file or directory
[[Node: ReadFile = ReadFile[](ReadFile/filename)]]
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,28,28,?], [?]], output_types=[DT_UINT8, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](Iterator)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "split2.py", line 46, in <module>
sess.run(next_element)
File "/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 908, in run
run_metadata_ptr)
File "/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1143, in _run
feed_dict_tensor, options, run_metadata)
File "/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1324, in _do_run
run_metadata)
File "/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1343, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: train_7a_02017.png; No such file or directory
[[Node: ReadFile = ReadFile[](ReadFile/filename)]]
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,28,28,?], [?]], output_types=[DT_UINT8, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](Iterator)]]
I also find that error and my solution is:
Maybe you create the path by windows and the path end with 'CR' when you view the txt with notepad++, which cant read by linux.
You can replace 'CR' with '' by notepad++. Then it works.
I had the same problem and using tf.as_str_any() solved the problem for me.
This because the file is an object and not a string, so you have to decode it (generally UTF-8) to a string.
So your code would look like this:
def _read_py_function(file_name, label):
image_string = tf.as_str_any(file_name)
image_string = tf.read_file(filename)
image_decoded = tf.image.decode_image(image_string, channels=3)
image_resized = tf.image.resize_image_with_crop_or_pad(image_decoded, 28, 28)
return image_resized, label
I first constructed an RBM and tested it on a set of data, it worked well. Then I wrote a DBN with stacked RBM and trained it with the same set of data. The program stopped with the following error when it tried to train the second RBM.
Traceback (most recent call last):
File "D:\Python\DL_DG\analysis\debug\debug_01_ppi.py", line 44, in <module>
ppi_dbn.fit(ppi_in)
File "D:/Python/DL_DG/Model\dbn_test.py", line 95, in fit
rbm.fit(input_data)
File "D:/Python/DL_DG/Model\rbm_test.py", line 295, in fit
self.partial_fit(batch_x, b, e)
File "D:/Python/DL_DG/Model\rbm_test.py", line 188, in partial_fit
feed_dict={self.x: batch_x})
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 895, in run
run_metadata_ptr)
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1321, in _do_run
options, run_metadata)
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input/x' with dtype float and shape [?,128]
[[Node: input/x = Placeholder[dtype=DT_FLOAT, shape=[?,128], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op 'input/x', defined at:
File "<string>", line 1, in <module>
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\idlelib\run.py", line 142, in main
ret = method(*args, **kwargs)
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\idlelib\run.py", line 460, in runcode
exec(code, self.locals)
File "D:\Python\DL_DG\analysis\debug\debug_01_ppi.py", line 42, in <module>
learning_rate_rbm=[0.001,0.01],rbm_gauss_visible=True)
File "D:/Python/DL_DG/Model\dbn_test.py", line 52, in __init__
sample_gauss_visible=self.sample_gauss_visible, sigma=self.sigma))
File "D:/Python/DL_DG/Model\rbm_test.py", line 358, in __init__
xavier_const,err_function,use_tqdm,tqdm)
File "D:/Python/DL_DG/Model\rbm_test.py", line 46, in __init__
self.x = tf.placeholder(tf.float32, [None, self.n_visible],name='x')
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1548, in placeholder
return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 2094, in _placeholder
name=name)
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Users\pil562\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input/x' with dtype float and shape [?,128]
[[Node: input/x = Placeholder[dtype=DT_FLOAT, shape=[?,128], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
The error occurs at the following function:
def partial_fit(self, batch_x, k, j):
print(batch_x.dtype, batch_x.shape)
summary, _ = self.sess.run([self.merged, self.update_weights + self.update_deltas],
feed_dict={self.x: batch_x})
self.train_writer.add_summary(summary, k*self.batch_size+j)
I output the type and shape of batch_x. The shape is the same during the whole training process. The type is float64 when training the first rbm, and float32 when training the second rbm. That's where it stopped and throw out the error.
The DBN worked well when I didn't compute the summary and just used the following code:
self.sess.run(self.update_weights + self.update_deltas,feed_dict={self.x: batch_x})
It also worked well if I only train a single RBM (with or without the summary).
The batch_x used to train the second RBM is probabilities of the hidden layer in the first RBM.
Could somebody help me solve this problem? I'm not sure if the float64 is the problem.
I guess it's hard for anyone to solve the problem only with the two pieces of code I give. lol. The full code is too long to post here.
I save the output of the first RBM and use it as input to train another RBM. It works well. Thus, I think the problem is not the type or shape of the feeded batch_x, but the structure of the DBN, or the way I collected summaries.
Hope my situation can help others with similar problems.
I'm following these tutorials:
https://www.youtube.com/watch?v=wuo4JdG3SvU&list=PL9Hr9sNUjfsmEu1ZniY0XpHSzl5uihcXZ
and prettytensor is introduced in tutorial 4.
Following the tutorial, i wrote this code to run a small neural network:
import tensorflow as tf
# Use PrettyTensor to simplify Neural Network construction.
import prettytensor as pt
from tensorflow.examples.tutorials.mnist import input_data
data = input_data.read_data_sets('../data/MNIST/', one_hot=True)
# We know that MNIST images are 28 pixels in each dimension.
img_size = 28
# Images are stored in one-dimensional arrays of this length.
img_size_flat = img_size * img_size
# Tuple with height and width of images used to reshape arrays.
img_shape = (img_size, img_size)
# Number of colour channels for the images: 1 channel for gray-scale.
num_channels = 1
# Number of classes, one class for each of 10 digits.
num_classes = 10
# the placeholders
x = tf.placeholder(tf.float32, shape=[None, img_size_flat], name='x')
x_image = tf.reshape(x, [-1, img_size, img_size, num_channels])
y_true = tf.placeholder(tf.float32, shape=[None, 10], name='y_true')
# use prettyTensor to build the model
# this will give us the predictions and the loss functions
x_pretty = pt.wrap(x_image)
with pt.defaults_scope(activation_fn=tf.nn.relu):
y_pred, loss = x_pretty.\
conv2d(kernel=5, depth=16, name='layer_conv1').\
max_pool(kernel=2, stride=2).\
conv2d(kernel=5, depth=36, name='layer_conv2').\
max_pool(kernel=2, stride=2).\
flatten().\
fully_connected(size=128, name='layer_fc1').\
softmax_classifier(class_count=10, labels=y_true)
# the model optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)
# the model testing
correct_prediction = tf.equal(tf.argmax(y_pred,1), tf.argmax(y_true,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# start the session
session = tf.InteractiveSession()
# Start the training
tf.global_variables_initializer().run(session = session)
train_batch_size = 64
for i in range(1000):
print("training batch ",i)
x_batch, y_true_batch = data.train.next_batch(train_batch_size)
session.run(optimizer, feed_dict={x:x_batch, y_true:y_true_batch})
When i tried to run it, I got the following error:
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value layer_conv1/bias
[[Node: layer_conv1/bias/read = Identity[T=DT_FLOAT, _class=["loc:#layer_conv1/bias"], _device="/job:localhost/replica:0/task:0/cpu:0"](layer_conv1/bias)]]
Caused by op u'layer_conv1/bias/read', defined at:
File "/home/gal/Documents/Workspace/EclipseWorkspace/Melanoma Classification!/tutorial4/tutorial4Test.py", line 31, in <module>
the full error trace:
Traceback (most recent call last):
File "/home/gal/Documents/Workspace/EclipseWorkspace/Melanoma Classification!/tutorial4/tutorial4Test.py", line 55, in <module>
session.run(optimizer, feed_dict={x:x_batch, y_true:y_true_batch})
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value layer_conv1/bias
[[Node: layer_conv1/bias/read = Identity[T=DT_FLOAT, _class=["loc:#layer_conv1/bias"], _device="/job:localhost/replica:0/task:0/cpu:0"](layer_conv1/bias)]]
Caused by op u'layer_conv1/bias/read', defined at:
File "/home/gal/Documents/Workspace/EclipseWorkspace/Melanoma Classification!/tutorial4/tutorial4Test.py", line 31, in <module>
conv2d(kernel=5, depth=16, name='layer_conv1').\
File "/home/gal/anaconda2/lib/python2.7/site-packages/prettytensor/pretty_tensor_class.py", line 1981, in method
result = func(non_seq_layer, *args, **kwargs)
File "/home/gal/anaconda2/lib/python2.7/site-packages/prettytensor/pretty_tensor_image_methods.py", line 163, in __call__
y += self.variable('bias', [size[-1]], bias_init, dt=dtype)
File "/home/gal/anaconda2/lib/python2.7/site-packages/prettytensor/pretty_tensor_class.py", line 1695, in variable
collections=variable_collections)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1024, in get_variable
custom_getter=custom_getter)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 850, in get_variable
custom_getter=custom_getter)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 346, in get_variable
validate_shape=validate_shape)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 331, in _true_getter
caching_device=caching_device, validate_shape=validate_shape)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 677, in _get_single_variable
expected_shape=shape)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 224, in __init__
expected_shape=expected_shape)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 370, in _init_from_args
self._snapshot = array_ops.identity(self._variable, name="read")
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1424, in identity
result = _op_def_lib.apply_op("Identity", input=input, name=name)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/gal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value layer_conv1/bias
[[Node: layer_conv1/bias/read = Identity[T=DT_FLOAT, _class=["loc:#layer_conv1/bias"], _device="/job:localhost/replica:0/task:0/cpu:0"](layer_conv1/bias)]]
So my question is, How can i solve this error?
This problem is caused by a bug in the 0.12rc0 release candidate of TensorFlow, and the fact that Pretty Tensor uses a deprecated TensorFlow API (for which I've opened an issue).
Until this bug is fixed, the best workaround I can think of is a hack. Add the following line at the top of your program, after import tensorflow as tf:
tf.GraphKeys.VARIABLES = tf.GraphKeys.GLOBAL_VARIABLES
After successfully running all examples from slim walkthrough notebook, I wanted to freeze the graph. In order to do that, I ran the following (copy from original notebook):
import os
from datasets import flowers
from nets import inception
from preprocessing import inception_preprocessing
slim = tf.contrib.slim
image_size = inception.inception_v1.default_image_size
def get_init_fn():
"""Returns a function run by the chief worker to warm-start the training."""
checkpoint_exclude_scopes=["InceptionV1/Logits", "InceptionV1/AuxLogits"]
exclusions = [scope.strip() for scope in checkpoint_exclude_scopes]
variables_to_restore = []
for var in slim.get_model_variables():
excluded = False
for exclusion in exclusions:
if var.op.name.startswith(exclusion):
excluded = True
break
if not excluded:
variables_to_restore.append(var)
return slim.assign_from_checkpoint_fn(
os.path.join(checkpoints_dir, 'inception_v1.ckpt'),
variables_to_restore)
train_dir = '/tmp/inception_finetuned/'
with tf.Graph().as_default():
tf.logging.set_verbosity(tf.logging.INFO)
dataset = flowers.get_split('train', flowers_data_dir)
images, _, labels = load_batch(dataset, height=image_size, width=image_size)
# Create the model, use the default arg scope to configure the batch norm parameters.
with slim.arg_scope(inception.inception_v1_arg_scope()):
logits, _ = inception.inception_v1(images, num_classes=dataset.num_classes, is_training=True)
# Specify the loss function:
one_hot_labels = slim.one_hot_encoding(labels, dataset.num_classes)
slim.losses.softmax_cross_entropy(logits, one_hot_labels)
total_loss = slim.losses.get_total_loss()
# Create some summaries to visualize the training process:
tf.scalar_summary('losses/Total Loss', total_loss)
# Specify the optimizer and create the train op:
optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
train_op = slim.learning.create_train_op(total_loss, optimizer)
# Run the training:
final_loss = slim.learning.train(
train_op,
logdir=train_dir,
init_fn=get_init_fn(),
number_of_steps=2)
print('Finished training. Last batch loss %f' % final_loss)
The code above produced the following files in /tmp/inception_finetuned folder:
checkpoint
model.ckpt-0.meta
events.out.tfevents.1478081437.Nikos-MacBook-Pro.local
model.ckpt-2 graph.pbtxt
model.ckpt-2.meta model.ckpt-0
Then, in order to freeze the graph, I ran the following command:
bazel-bin/tensorflow/python/tools/freeze_graph --input_graph=/tmp/inception_finetuned/graph.pbtxt --input_checkpoint=/tmp/inception_finetuned/model.ckpt-2 --output_graph=/tmp/freeze.pb --output_node_names=InceptionV1/Logits/Predictions/Softmax
The command, however, produced the following error:
W tensorflow/core/framework/op_kernel.cc:968] Failed precondition: Attempting to use uninitialized value InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1
[[Node: _send_InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1_0 = _Send[T=DT_FLOAT, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=6007788667487390928, tensor_name="InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1:0", _device="/job:localhost/replica:0/task:0/cpu:0"](InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1)]]
...
W tensorflow/core/framework/op_kernel.cc:968] Failed precondition: Attempting to use uninitialized value InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1
[[Node: _send_InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1_0 = _Send[T=DT_FLOAT, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=6007788667487390928, tensor_name="InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1:0", _device="/job:localhost/replica:0/task:0/cpu:0"](InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1)]]
Traceback (most recent call last):
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/tools/freeze_graph.py", line 135, in <module>
tf.app.run()
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 32, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/tools/freeze_graph.py", line 132, in main
FLAGS.output_graph, FLAGS.clear_devices, FLAGS.initializer_nodes)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/tools/freeze_graph.py", line 121, in freeze_graph
sess, input_graph_def, output_node_names.split(","))
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/framework/graph_util.py", line 226, in convert_variables_to_constants
returned_variables = sess.run(variable_names)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 915, in _run
feed_dict_string, options, run_metadata)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 965, in _do_run
target_list, options, run_metadata)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 985, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1
[[Node: _send_InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1_0 = _Send[T=DT_FLOAT, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=6007788667487390928, tensor_name="InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1:0", _device="/job:localhost/replica:0/task:0/cpu:0"](InceptionV1/Logits/Conv2d_0c_1x1/biases/Adam_1)]]
Then I tried to use different optimizer:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
and got the following error:
W tensorflow/core/framework/op_kernel.cc:968] Failed precondition: Attempting to use uninitialized value global_step
[[Node: _send_global_step_0 = _Send[T=DT_INT64, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=8900174487477528080, tensor_name="global_step:0", _device="/job:localhost/replica:0/task:0/cpu:0"](global_step)]]
...
[[Node: _send_global_step_0 = _Send[T=DT_INT64, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=8900174487477528080, tensor_name="global_step:0", _device="/job:localhost/replica:0/task:0/cpu:0"](global_step)]]
W tensorflow/core/framework/op_kernel.cc:968] Failed precondition: Attempting to use uninitialized value global_step
[[Node: _send_global_step_0 = _Send[T=DT_INT64, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=8900174487477528080, tensor_name="global_step:0", _device="/job:localhost/replica:0/task:0/cpu:0"](global_step)]]
Traceback (most recent call last):
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/tools/freeze_graph.py", line 135, in <module>
tf.app.run()
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 32, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/tools/freeze_graph.py", line 132, in main
FLAGS.output_graph, FLAGS.clear_devices, FLAGS.initializer_nodes)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/tools/freeze_graph.py", line 121, in freeze_graph
sess, input_graph_def, output_node_names.split(","))
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/framework/graph_util.py", line 226, in convert_variables_to_constants
returned_variables = sess.run(variable_names)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 915, in _run
feed_dict_string, options, run_metadata)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 965, in _do_run
target_list, options, run_metadata)
File "/Users/nikogamulin/workspace/tensorflow/bazel-bin/tensorflow/python/tools/freeze_graph.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 985, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value global_step
[[Node: _send_global_step_0 = _Send[T=DT_INT64, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=8900174487477528080, tensor_name="global_step:0", _device="/job:localhost/replica:0/task:0/cpu:0"](global_step)]]
Similarly, if I retrain the model running the following command:
python train_image_classifier.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=flowers \
--dataset_split_name=train \
--model_name=inception_v1 \
--checkpoint_path=${CHECKPOINT_PATH} \
--checkpoint_exclude_scopes=InceptionV1/Logits,InceptionV1/AuxLogits/Logits \
--trainable_scopes=InceptionV1/Logits,InceptionV1/AuxLogits/Logits
and try to freeze the graph, get the errors related to
global_step
Does anyone know why the above errors occur and how to solve them? If anyone managed to freeze inception v1 (tf-slim) graph, I would be thankful for any suggestions that might solve the issue.
I am trying to use the RMSPropOptimizer for minimizing loss. Here's the part of the code that is relevant:
import tensorflow as tf
#build large convnet...
#...
opt = tf.train.RMSPropOptimizer(learning_rate=0.0025, decay=0.95)
#do stuff to get targets and loss...
#...
grads_and_vars = opt.compute_gradients(loss)
capped_grads_and_vars = [(tf.clip_by_value(g, -1, 1), v) for g, v in grads_and_vars]
opt_op = self.opt.apply_gradients(capped_grads_and_vars)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
while(1):
sess.run(opt_op)
Problem is as soon as I run this I get the following error:
W tensorflow/core/common_runtime/executor.cc:1091] 0x10a0bba40 Compute status: Failed precondition: Attempting to use uninitialized value train/output/bias/RMSProp
[[Node: RMSProp/update_train/output/bias/ApplyRMSProp = ApplyRMSProp[T=DT_FLOAT, use_locking=false, _device="/job:localhost/replica:0/task:0/cpu:0"](train/output/bias, train/output/bias/RMSProp, train/output/bias/RMSProp_1, RMSProp/learning_rate, RMSProp/decay, RMSProp/momentum, RMSProp/epsilon, clip_by_value_9)]]
[[Node: _send_MergeSummary/MergeSummary_0 = _Send[T=DT_STRING, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=-6901001318975381332, tensor_name="MergeSummary/MergeSummary:0", _device="/job:localhost/replica:0/task:0/cpu:0"](MergeSummary/MergeSummary)]]
Traceback (most recent call last):
File "dqn.py", line 213, in <module>
result = sess.run(opt_op)
File "/Users/home/miniconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 385, in run
results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
File "/Users/home/miniconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 461, in _do_run
e.code)
tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value train/output/bias/RMSProp
[[Node: RMSProp/update_train/output/bias/ApplyRMSProp = ApplyRMSProp[T=DT_FLOAT, use_locking=false, _device="/job:localhost/replica:0/task:0/cpu:0"](train/output/bias, train/output/bias/RMSProp, train/output/bias/RMSProp_1, RMSProp/learning_rate, RMSProp/decay, RMSProp/momentum, RMSProp/epsilon, clip_by_value_9)]]
Caused by op u'RMSProp/update_train/output/bias/ApplyRMSProp', defined at:
File "dqn.py", line 159, in qLearnMinibatch
opt_op = self.opt.apply_gradients(capped_grads_and_vars)
File "/Users/home/miniconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 288, in apply_gradients
update_ops.append(self._apply_dense(grad, var))
File "/Users/home/miniconda2/lib/python2.7/site-packages/tensorflow/python/training/rmsprop.py", line 103, in _apply_dense
grad, use_locking=self._use_locking).op
File "/Users/home/miniconda2/lib/python2.7/site-packages/tensorflow/python/training/gen_training_ops.py", line 171, in apply_rms_prop
grad=grad, use_locking=use_locking, name=name)
File "/Users/home/miniconda2/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 659, in apply_op
op_def=op_def)
File "/Users/home/miniconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1904, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Users/home/miniconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1083, in __init__
self._traceback = _extract_stack()
Note that I don't get this error If am using the usual GradientDescentOptimizer. I am initializing my variables as you can see above but I don't know what 'train/output/bias/RMSProp' is because I don't create any such variable. I only have 'train/output/bias/' which does get initialized above.
Thanks!
So for people from the future running into similar trouble, I found this post helpful:
Tensorflow: Using Adam optimizer
Basically, I was running
sess.run(tf.initialize_all_variables())
before I had defined my loss minimization op
loss = tf.square(targets)
#create the gradient descent op
grads_and_vars = opt.compute_gradients(loss)
capped_grads_and_vars = [(tf.clip_by_value(g, -self.clip_delta, self.clip_delta), v) for g, v in grads_and_vars] #gradient capping
self.opt_op = self.opt.apply_gradients(capped_grads_and_vars)
This needs to be done before running the initialization op!