How to use the trained tensorflow network to run inference? - tensorflow

I am new to tensorflow and I hope you can help me.
I have built a tensorflow CNN network and trained it successfully. The training datasets are matlab arrays. Now I would like to use the trained network to run inference. I am not sure how to write the python code for inference.
During training, I saved the mode. I am not sure how to load the model in inference.
My inference data is also a matlab array, same as training data. How can I use it? During training, I used miniPatch from Tensorlayer, should I use miniPatch in inference two?
Below is my inference code: it gave a lot of errors:
print("\n\nPreparing testing data........................")
test_data = sio.loadmat('MyTest.mat')
Z0 = test_data['Real_testing1']
img_num_test = Z0.shape[0]
X_test = np.empty([img_num_test, 128, 128, 1], dtype=float)
X_test[:,:,:,0] = Z0
Y_test = np.column_stack((np.ones([img_num_test, 1], dtype=int),np.zeros([img_num_test, 1], dtype=int)))
print("\tTesting X shape: {0}".format(X_test.shape))
print("\tTesting Y shape: {0}".format(Y_test.shape))
print("\n\Restore the network ...")
save_dir = "checkpoints/";
epoch = 1000
model_name = save_dir + str(epoch) + '_model'
if not os.path.exists(save_dir):
os.makedirs(save_dir)
saver = tf.train.Saver().restore(sess, save_path=model_name)
start_time_begin = time.time()
print("\n\Running network...")
start_time = time.time()
y = model.Scribenet(X_test[0, :, :, :], False, 1.0)
y = sess.run([y], feed_dict=feed_dict)
print(y[0:9])
sess.close()
Below is my training code:
x = tf.placeholder(tf.float32, shape=[None, 128, 128, 1], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, 2], name='y_')
keep_prob = tf.placeholder(tf.float32, name='keep_prob')
is_training = tf.placeholder(tf.bool, name='is_traininng')
net_in = x
net_out = model.MyCNN(net_in, is_training, keep_prob)
y = net_out
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_, name='cost'))
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
y_op = tf.argmax(tf.nn.softmax(y),1)
train_op = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999,
epsilon=1e-08, use_locking=False).minimize(cost)
sess.run(tf.global_variables_initializer())
save_dir = "checkpoints/";
if not os.path.exists(save_dir):
os.makedirs(save_dir)
saver = tf.train.Saver()
print("\n\nStart training the network ...")
start_time_begin = time.time()
for epoch in range(n_epoch):
start_time = time.time()
loss_ep = 0; n_step = 0
for X_train_a, y_train_a in tl.iterate.minibatches(X_train, Y_train,
batch_size, shuffle=True):
feed_dict = {x: X_train_a, y_: y_train_a, is_training: True, keep_prob: train_keep_prob}
loss, _ = sess.run([cost, train_op], feed_dict=feed_dict)
loss_ep += loss
n_step += 1
loss_ep = loss_ep/ n_step
if (epoch+1) % save_freq == 0:
model_name = save_dir + str(epoch+1) + '_model'
saver.save(sess, save_path=model_name)

The main issue seems to be that there's no graph building in your inference code. You either need to save the whole graph (in SavedModel format), or build a graph in your inference code and load your variables via a training checkpoint (probably the easiest to start). As long as the variable names are the same, you can load variables saved from the training graph into the inference graph.
So inference will be your training code but without the y_ placeholder and without the loss/optimizer logic. You can feed a single image (batch size 1) to start, so no need for batching logic either.

Related

tensorflow - linear regression does not give intended computational graph

I am trying to train a very simple linear regression with tensorflow but the loss doesn't decrease and the tensorboard also doesn't look right
### Generate data
w_true = np.array([1.0,2.0])
b_true = 0.5
x_train = np.random.multivariate_normal(mean=[0,0], cov=[[1,0],[0,1]], size=100)
x_test = np.random.multivariate_normal(mean=[0,0], cov=[[3,0],[0,3]], size=100)
y_train = np.dot(x_train,w_true) + b_true
y_test = np.dot(x_test,w_true) + b_true
### Placeholders for data input
x = tf.placeholder(dtype=tf.float32, shape=[None,2], name="x")
y = tf.placeholder(dtype=tf.float32, shape=[None], name="labels")
### Trainable parameters
w = tf.Variable(initial_value=np.random.multivariate_normal([0,0],[[1,0],[0,1]]), dtype=tf.float32,
name="W")
b = tf.Variable(initial_value=np.random.normal(1), dtype=tf.float32,name="B")
### Computational graph
y_pred = tf.tensordot(x,w,1)+b
tf.summary.histogram("weights",w)
tf.summary.histogram("bias",b)
loss = tf.reduce_sum(tf.squared_difference(y,y_pred), name="loss")
tf.summary.scalar("loss", loss)
with tf.name_scope("train"):
train_step = tf.train.GradientDescentOptimizer(0.00001).minimize(loss)
### Training
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# For TensorBoard
writer = tf.summary.FileWriter("path_to_some_folder")
writer.add_graph(sess.graph)
for t in range(1000):
x_batch = x_train[np.random.choice(100, 20)]
y_batch = y_train[np.random.choice(100, 20)]
sess.run(train_step, {x:x_batch,y:y_batch})
print(sess.run(loss, {x:x_train,y:y_train}))
print(sess.run(loss, {x:x_test,y:y_test}))
I have tried different step sizes but the error always stays above 400 on the training and 1000 on the test set. I have tested that tf.tensordot() behaves like I expect. I you would like to see the tensorboard just replace the path_to_some_folder and after training run tensorboard --logdir path_to_some_folder
Thanks very much for the help
Your problem is because of the following two lines,
x_batch = x_train[np.random.choice(100, 20)]
y_batch = y_train[np.random.choice(100, 20)]
In each iteration, np.random.choice(100, 20) returns two different index lists for x_batch and y_batch. Therefore, your x_batch and y_batch will never match. Instead, replace that part with the following code.
BATCH_SIZE= 10
N_COUNT = len(x_train)
for t in range(1000):
for start, end in zip(range(0, N_COUNT, BATCH_SIZE),
range(BATCH_SIZE, N_COUNT + 1,BATCH_SIZE)):
x_batch = x_train[start:end]
y_batch = y_train[start:end]
sess.run(train_step, {x:x_batch,y:y_batch})
Hope this helps.

How to restore saved BiRNN model in tensorflow so that all output neurons correctly bundled to the corresponding output classes

I faced a problem with properly restoring the saved model in tensorflow. I created the Bidirectional RNN model in tensorflow with following code:
batchX_placeholder = tf.placeholder(tf.float32, [None, timesteps, 1],
name="batchX_placeholder")])
batchY_placeholder = tf.placeholder(tf.float32, [None, num_classes],
name="batchY_placeholder")
weights = tf.Variable(np.random.rand(2*STATE_SIZE, num_classes),
dtype=tf.float32, name="weights")
biases = tf.Variable(np.zeros((1, num_classes)), dtype=tf.float32,
name="biases")
logits = BiRNN(batchX_placeholder, weights, biases)
with tf.name_scope("prediction"):
prediction = tf.nn.softmax(logits)
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=logits, labels=batchY_placeholder))
lr = tf.Variable(learning_rate, trainable=False, dtype=tf.float32,
name='lr')
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
train_op = optimizer.minimize(loss_op)
init_op = tf.initialize_all_variables()
saver = tf.train.Saver()
The architecture of BiRNN created with the following function:
def BiRNN(x, weights, biases):
# Unstack to get a list of 'time_steps' tensors of shape (batch_size,
# num_input)
x = tf.unstack(x, time_steps, 1)
# Forward and Backward direction cells
lstm_fw_cell = rnn.BasicLSTMCell(STATE_SIZE, forget_bias=1.0)
lstm_bw_cell = rnn.BasicLSTMCell(STATE_SIZE, forget_bias=1.0)
outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell,
lstm_bw_cell, x, dtype=tf.float32)
# Linear activation, using rnn inner loop last output
return tf.matmul(outputs[-1], weights) + biases
Then I train a model and save it after each 200 steps:
with tf.Session() as sess:
sess.run(init_op)
current_step = 0
for batch_x, batch_y in get_minibatch():
sess.run(train_op, feed_dict={batchX_placeholder: batch_x,
batchY_placeholder: batch_y})
current_step += 1
if current_step % 200 == 0:
saver.save(sess, os.path.join(model_dir, "model")
To run the saved model in inference mode I use saved tensorflow graph in "model.meta" file:
graph = tf.get_default_graph()
saver = tf.train.import_meta_graph(os.path.join(model_dir, "model.meta"))
sess = tf.Session()
saver.restore(sess, tf.train.latest_checkpoint(model_dir)
weights = graph.get_tensor_by_name("weights:0")
biases = graph.get_tensor_by_name("biases:0")
batchX_placeholder = graph.get_tensor_by_name("batchX_placeholder:0")
batchY_placeholder = graph.get_tensor_by_name("batchY_placeholder:0")
logits = BiRNN(batchX_placeholder, weights, biases)
prediction = graph.get_operation_by_name("prediction/Softmax")
argmax_pred = tf.argmax(prediction, 1)
init = tf.global_variables_initializer()
sess.run(init)
for x_seq, y_gt in get_sequence():
_, y_pred = sess.run([prediction, argmax_pred],
feed_dict={batchX_placeholder: [x_seq]],
batchY_placeholder: [[0.0, 0.0]]})
print("Y ground true: " + str(y_gt) + ", Y pred: " + str(y_pred[0]))
And when I run the code in inference mode, I get different results each time I launch it. It seems that output neurons from the softmax layer randomly bundled with different output classes.
So, my question is: How can I save and then correctly restore the model in tensorflow, so that all neurons properly bundled with corresponding output classes?
There is no need to call tf.global_variables_initializer(), I think that is your problem.
I removed some operations: logits, weights and biases since you don't need them, all those are already loaded, use graph.get_tensor_by_name to get them.
For the prediction, get the tensor instead of the operation. (see this answer):
This is the code:
graph = tf.get_default_graph()
saver = tf.train.import_meta_graph(os.path.join(model_dir, "model.meta"))
sess = tf.Session()
saver.restore(sess, tf.train.latest_checkpoint(model_dir))
batchX_placeholder = graph.get_tensor_by_name("batchX_placeholder:0")
batchY_placeholder = graph.get_tensor_by_name("batchY_placeholder:0")
prediction = graph.get_tensor_by_name("prediction/Softmax:0")
argmax_pred = tf.argmax(prediction, 1)
Edit 1: I notice that I wasn't clear on why you got different results.
And when I run the code in inference mode, I get different results
each time I launch it.
Notice that although you used the weights from the loaded model, you are creating the BiRNN again, and the BasicLSTMCell also have weights and other variables that you don't set from your loaded model, hence they need to be initialized (with new random values) resulting in an untrained model again.

Issue exporting trained Tensorflow model parameters to SavedModel format

I have built a system that leverages Google ML Engine to train various text classifiers using a simple flat CNN architecture (borrowed from the excellent WildML post). I've also leveraged heavily the ML Engine trainer template which exists here - specifically using the Tensorflow core functions.
My issue is that while the model trains and learns parameters correctly, I cannot get the serialized export in the binary SavedModel format (i.e. - the .pb files) to maintain the learned weights. I can tell this by using the gcloud predict local API on the model export folder and each time it makes randomized predictions - leading me to believe that while the graph structure is being saved to the proto-buf format, the associated weights in the checkpoint file are not being carried over.
Here's the code for my run function:
def run(...):
# ... code to load and transform train/test data
with train_graph.as_default():
with tf.Session(graph=train_graph).as_default() as session:
# Features and label tensors as read using filename queue
features, labels = model.input_fn(
x_train,
y_train,
num_epochs=num_epochs,
batch_size=train_batch_size
)
# Returns the training graph and global step tensor
tf.logging.info("Train vocab size: {:d}".format(vocab_size))
train_op, global_step_tensor, cnn, train_summaries = model.model_fn(
model.TRAIN,
sequence_length,
num_classes,
label_values,
vocab_size,
embedding_size,
filter_sizes,
num_filters
)
tf.logging.info("Created simple training CNN with ({}) filter types".format(filter_sizes))
# Setup writers
train_summary_op = tf.summary.merge(train_summaries)
train_summary_dir = os.path.join(job_dir, "summaries", "train")
# Generate writer
train_summary_writer = tf.summary.FileWriter(train_summary_dir, session.graph)
# Initialize all variables
session.run(tf.global_variables_initializer())
session.run(tf.local_variables_initializer())
model_dir = os.path.abspath(os.path.join(job_dir, "model"))
if not os.path.exists(model_dir):
os.makedirs(model_dir)
saver = tf.train.Saver()
def train_step(x_batch, y_batch):
"""
A single training step
"""
feed_dict = {
cnn.input_x: x_batch,
cnn.input_y: y_batch,
cnn.dropout_keep_prob: 0.5
}
step, _, loss, accuracy = session.run([global_step_tensor, train_op, cnn.loss, cnn.accuracy],
feed_dict=feed_dict)
time_str = datetime.datetime.now().isoformat()
if step % 10 == 0:
tf.logging.info("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
# Return current step
return step
def eval_step(x_batch, y_batch, train_step, total_steps):
"""
Evaluates model on a dev set
"""
feed_dict = {
cnn.input_x: x_batch,
cnn.input_y: y_batch,
cnn.dropout_keep_prob: 1.0
}
step, loss, accuracy, scores, predictions = session.run([global_step_tensor, cnn.loss, cnn.accuracy, cnn.scores, cnn.predictions],
feed_dict=feed_dict)
# Get metrics
y_actual = np.argmax(y_batch, 1)
model_metrics = precision_recall_fscore_support(y_actual, predictions)
#print(scores)
time_str = datetime.datetime.now().isoformat()
print("\n---- EVAULATION ----")
avg_precision = np.mean(model_metrics[0], axis=0)
avg_recall = np.mean(model_metrics[1], axis=0)
avg_f1 = np.mean(model_metrics[2], axis=0)
print("{}: step {}, loss {:g}, acc {:g}, prec {:g}, rec {:g}, f1 {:g}".format(time_str, step, loss, accuracy, avg_precision, avg_recall, avg_f1))
print("Model metrics: ", model_metrics)
print("---- EVALUATION ----\n")
# Generate batches
batches = data_helpers.batch_iter(
list(zip(features, labels)), train_batch_size, num_epochs)
# Training loop. For each batch...
for batch in batches:
x_batch, y_batch = zip(*batch)
current_step = train_step(x_batch, y_batch)
if current_step % 20 == 0 or current_step == 1:
eval_step(x_eval, y_eval, current_step, total_steps)
# Checkpoint directory. Tensorflow assumes this directory already exists so we need to create it
print(model_dir)
trained_model = saver.save(session, os.path.join(job_dir, 'model') + "/model.ckpt", global_step=current_step)
print(trained_model)
print("Saved final model checkpoint to {}".format(trained_model))
# Only perform this if chief
if is_chief:
build_and_run_exports(trained_model, job_dir,
model.SERVING_INPUT_FUNCTIONS[model.TEXT],
sequence_length, num_classes, label_values,
vocab_size, embedding_size, filter_sizes,
num_filters, vocab_processor)
And my build_and_run_exports function:
def build_and_run_exports(...):
# Check if we export already exists - if so delete
export_dir = os.path.join(job_dir, 'export')
if os.path.exists(export_dir):
print("Export currently exists - going to delete:", export_dir)
shutil.rmtree(export_dir)
# Create exporter
exporter = tf.saved_model.builder.SavedModelBuilder(export_dir)
# Restore prediction graph
prediction_graph = tf.Graph()
with prediction_graph.as_default():
with tf.Session(graph=prediction_graph) as session:
# Get training data
features, inputs_dict = serving_input_fn()
# Setup inputs
inputs_info = {
name: tf.saved_model.utils.build_tensor_info(tensor)
for name, tensor in inputs_dict.iteritems()
}
# Load model
cnn = TextCNN(
sequence_length=sequence_length,
num_classes=num_classes,
vocab_size=vocab_size,
embedding_size=embedding_size,
filter_sizes=list(map(int, filter_sizes.split(","))),
num_filters=num_filters,
input_tensor=features)
# Restore model
saver = tf.train.Saver()
saver.restore(session, latest_checkpoint)
# Setup outputs
outputs = {
'logits': cnn.scores,
'probabilities': cnn.probabilities,
'predicted_indices': cnn.predictions
}
# Create output info
output_info = {
name: tf.saved_model.utils.build_tensor_info(tensor)
for name, tensor in outputs.iteritems()
}
# Setup signature definition
signature_def = tf.saved_model.signature_def_utils.build_signature_def(
inputs=inputs_info,
outputs=output_info,
method_name=sig_constants.PREDICT_METHOD_NAME
)
# Create graph export
exporter.add_meta_graph_and_variables(
session,
tags=[tf.saved_model.tag_constants.SERVING],
signature_def_map={
sig_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
},
legacy_init_op=tf.saved_model.main_op.main_op()
)
# Export model
exporter.save()
And last, but not least, the TextCNN model:
class TextCNN(object):
"""
A CNN for text classification.
Uses an embedding layer, followed by a convolutional, max-pooling and softmax layer.
"""
def __init__(
self, sequence_length, num_classes, vocab_size,
embedding_size, filter_sizes, num_filters, l2_reg_lambda=0.0,
dropout_keep_prob=0.5, input_tensor=None):
# Setup input
if input_tensor != None:
self.input_x = input_tensor
self.dropout_keep_prob = tf.constant(1.0)
else:
self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")
# Placeholders for input, output and dropout
self.input_y = tf.placeholder(tf.int32, [None, num_classes], name="input_y")
# Keeping track of l2 regularization loss (optional)
l2_loss = tf.constant(0.0)
# Embedding layer
with tf.device('/cpu:0'), tf.name_scope("embedding"):
self.W = tf.Variable(
tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
name="W")
self.embedded_chars = tf.nn.embedding_lookup(self.W, self.input_x)
self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)
# Create a convolution + maxpool layer for each filter size
pooled_outputs = []
for i, filter_size in enumerate(filter_sizes):
with tf.name_scope("conv-maxpool-%s" % filter_size):
# Convolution Layer
filter_shape = [filter_size, embedding_size, 1, num_filters]
W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
conv = tf.nn.conv2d(
self.embedded_chars_expanded,
W,
strides=[1, 1, 1, 1],
padding="VALID",
name="conv")
# Apply nonlinearity
h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
# Maxpooling over the outputs
pooled = tf.nn.max_pool(
h,
ksize=[1, sequence_length - filter_size + 1, 1, 1],
strides=[1, 1, 1, 1],
padding='VALID',
name="pool")
pooled_outputs.append(pooled)
# Combine all the pooled features
num_filters_total = num_filters * len(filter_sizes)
self.h_pool = tf.concat(pooled_outputs, 3)
self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])
# Add dropout
with tf.name_scope("dropout"):
self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)
# Final (unnormalized) scores and predictions
with tf.name_scope("output"):
W = tf.get_variable(
"W",
shape=[num_filters_total, num_classes],
initializer=tf.contrib.layers.xavier_initializer())
b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
l2_loss += tf.nn.l2_loss(W)
l2_loss += tf.nn.l2_loss(b)
self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
self.predictions = tf.argmax(self.scores, 1, name="predictions")
# CalculateMean cross-entropy loss
with tf.name_scope("loss"):
losses = tf.nn.softmax_cross_entropy_with_logits(logits=self.scores, labels=self.input_y)
self.loss = tf.reduce_mean(losses) + l2_reg_lambda * l2_loss
with tf.name_scope("probabilities"):
self.probabilities = tf.nn.softmax(logits=self.scores)
# Accuracy
with tf.name_scope("accuracy"):
correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")
I'm hoping I'm just missing something simple in how I'm creating the TF graph / session and restoring stats.
Thank you in advance for your help!
This behavior is caused due to the behavior of tf.saved_model.main_op.main_op() which randomly initializes all of the variables in the graph (code). However, legacy_init_op happens after the variables are restored from the checkpoint (restore happens here followed by legacy_init_op here).
The solution is simply to not re-initialize all of the variables, for example, in your code:
from tensorflow.python.ops import variables
from tensorflow.python.ops import lookup_ops
from tensorflow.python.ops import control_flow_ops
def my_main_op():
init_local = variables.local_variables_initializer()
init_tables = lookup_ops.tables_initializer()
return control_flow_ops.group(init_local, init_tables)
def build_and_run_exports(...):
...
# Create graph export
exporter.add_meta_graph_and_variables(
session,
tags=[tf.saved_model.tag_constants.SERVING],
signature_def_map={
sig_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
},
legacy_init_op=my_main_op()
)
# Export model
exporter.save()

TensorFlow: Why do parameters not update when GradientDescentOptimizer train step is run?

When I run the following code, it prints a constant loss at every training step; I also tried printing the parameters, which also do not change.
I can't seem to figure out why train_step, which uses a GradientDescentOptimizer, doesnt change the weights in W_fc1, b_fc1, W_fc2, and b_fc2.
I'm a beginner to machine learning so I might be missing something obvious.
(An answer for a similar question was that weights should not be initialized at zero, but the weights here are initialized with truncated normal so that cant be the problem).
import tensorflow as tf
import numpy as np
import csv
import random
with open('wine_data.csv', 'rb') as csvfile:
input_arr = list(csv.reader(csvfile, delimiter=','))
for i in range(len(input_arr)):
input_arr[i][0] = int(input_arr[i][0]) - 1 # 0 index for one hot
for j in range(1, len(input_arr[i])):
input_arr[i][j] = float(input_arr[i][j])
random.shuffle(input_arr)
training_data = np.array(input_arr[:2*len(input_arr)/3]) # train on first two thirds of data
testing_data = np.array(input_arr[2*len(input_arr)/3:]) # test on last third of data
x_train = training_data[0:, 1:]
y_train = training_data[0:, 0]
x_test = testing_data[0:, 1:]
y_test = testing_data[0:, 0]
def weight_variable(shape):
initial = tf.truncated_normal(shape, stddev=0.1)
return tf.Variable(initial)
def bias_variable(shape):
initial = tf.constant(0.1, shape=shape)
return tf.Variable(initial)
x = tf.placeholder(tf.float32, shape=[None, 13], name='x')
y_ = tf.placeholder(tf.float32, shape=[None], name='y_')
y_one_hot = tf.one_hot(tf.cast(y_, tf.int32), 3) # actual y values
W_fc1 = weight_variable([13, 128])
b_fc1 = bias_variable([128])
fc1 = tf.matmul(x, W_fc1)+b_fc1
W_fc2 = weight_variable([128, 3])
b_fc2 = bias_variable([3])
y = tf.nn.softmax(tf.matmul(fc1, W_fc2)+b_fc2)
cross_entropy = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_one_hot, logits=y))
train_step = tf.train.GradientDescentOptimizer(1e-17).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_one_hot,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for _ in range(1000):
train_step.run(feed_dict={x: x_train, y_: y_train})
if _%10 == 0:
loss = cross_entropy.eval(feed_dict={x: x_train, y_: y_train})
print('step', _, 'loss', loss)
Thanks in advance.
From the official tensorflow documentation:
WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.
Remove the softmax on y before feeding it into tf.nn.softmax_cross_entropy_with_logits
Also set your learning rate to something higher (like 3e-4)

Restoring saved TensorFlow model to evaluate on test set

I have seen a few posts on restoring TF models and the Google doc page on exporting graphs but I think I am missing something.
I use the code in this Gist to save the model along with this utils file to which defines the model
Now I would like to restore it and run in a previously unseen test data as follows:
def evaluate(X_data, y_data):
num_examples = len(X_data)
total_accuracy = 0
total_loss = 0
sess = tf.get_default_session()
acc_steps = len(X_data) // BATCH_SIZE
for i in range(acc_steps):
batch_x, batch_y = next_batch(X_val, Y_val, BATCH_SIZE)
loss, accuracy = sess.run([loss_value, acc], feed_dict={
images_placeholder: batch_x,
labels_placeholder: batch_y,
keep_prob: 0.5
})
total_accuracy += (accuracy * len(batch_x))
total_loss += (loss * len(batch_x))
return (total_accuracy / num_examples, total_loss / num_examples)
## re-execute the code that defines the model
# Image Tensor
images_placeholder = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name='x')
gray = tf.image.rgb_to_grayscale(images_placeholder, name='gray')
gray /= 255.
# Label Tensor
labels_placeholder = tf.placeholder(tf.float32, shape=(None, 43), name='y')
# dropout Tensor
keep_prob = tf.placeholder(tf.float32, name='drop')
# construct model
logits = inference(gray, keep_prob)
# calculate loss
loss_value = loss(logits, labels_placeholder)
# training
train_op = training(loss_value, 0.001)
# accuracy
acc = accuracy(logits, labels_placeholder)
with tf.Session() as sess:
loader = tf.train.import_meta_graph('gtsd.meta')
loader.restore(sess, tf.train.latest_checkpoint('./'))
sess.run(tf.initialize_all_variables())
test_accuracy = evaluate(X_test, y_test)
print("Test Accuracy = {:.3f}".format(test_accuracy[0]))
I'm getting a test accuracy of only 3%. However If I don't close the Notebook and run the test code immediately after training the model, I get a 95% accuracy.
This leads me to believe I'm not loading the model correctly?
The problem arises from these two lines:
loader.restore(sess, tf.train.latest_checkpoint('./'))
sess.run(tf.initialize_all_variables())
The first line loads the saved model from a checkpoint. The second line re-initializes all of the variables in the model (such as the weight matrices, convolutional filters, and bias vectors), usually to random numbers, and overwrites the loaded values.
The solution is simple: delete the second line (sess.run(tf.initialize_all_variables())) and evaluation will proceed with the trained values loaded from the checkpoint.
PS. There is a small chance that this change will give you an error about "uninitialized variables". In that case, you should execute sess.run(tf.initialize_all_variables()) to initialize any variables not saved in the checkpoint before executing loader.restore(sess, tf.train.latest_checkpoint('./')).
I had a similar problem and for me this worked:
with tf.Session() as sess:
saver=tf.train.Saver(tf.all_variables())
saver=tf.train.import_meta_graph('model.meta')
saver.restore(sess,"model")
test_accuracy = evaluate(X_test, y_test)
The answer found here is what ended up working as follows:
save_path = saver.save(sess, '/home/ubuntu/gtsd-12-23-16.chkpt')
print("Model saved in file: %s" % save_path)
## later re-run code that creates the model
# Image Tensor
images_placeholder = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name='x')
gray = tf.image.rgb_to_grayscale(images_placeholder, name='gray')
gray /= 255.
# Label Tensor
labels_placeholder = tf.placeholder(tf.float32, shape=(None, 43), name='y')
# dropout Tensor
keep_prob = tf.placeholder(tf.float32, name='drop')
# construct model
logits = inference(gray, keep_prob)
# calculate loss
loss_value = loss(logits, labels_placeholder)
# training
train_op = training(loss_value, 0.001)
# accuracy
acc = accuracy(logits, labels_placeholder)
saver = tf.train.Saver()
with tf.Session() as sess:
saver.restore(sess, '/home/ubuntu/gtsd-12-23-16.chkpt')
print("Model restored.")
test_accuracy = evaluate(X_test, y_test)
print("Test Accuracy = {:.3f}".format(test_accuracy[0]*100))