I have seen a few posts on restoring TF models, as well as the Google doc page on exporting graphs, but I think I am missing something.
I use the code in this Gist to save the model, along with this utils file, which defines the model.
Now I would like to restore it and run it on previously unseen test data as follows:
def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    total_loss = 0
    sess = tf.get_default_session()
    acc_steps = len(X_data) // BATCH_SIZE
    for i in range(acc_steps):
        batch_x, batch_y = next_batch(X_val, Y_val, BATCH_SIZE)
        loss, accuracy = sess.run([loss_value, acc], feed_dict={
            images_placeholder: batch_x,
            labels_placeholder: batch_y,
            keep_prob: 0.5
        })
        total_accuracy += (accuracy * len(batch_x))
        total_loss += (loss * len(batch_x))
    return (total_accuracy / num_examples, total_loss / num_examples)
## re-execute the code that defines the model
# Image Tensor
images_placeholder = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name='x')
gray = tf.image.rgb_to_grayscale(images_placeholder, name='gray')
gray /= 255.
# Label Tensor
labels_placeholder = tf.placeholder(tf.float32, shape=(None, 43), name='y')
# dropout Tensor
keep_prob = tf.placeholder(tf.float32, name='drop')
# construct model
logits = inference(gray, keep_prob)
# calculate loss
loss_value = loss(logits, labels_placeholder)
# training
train_op = training(loss_value, 0.001)
# accuracy
acc = accuracy(logits, labels_placeholder)
with tf.Session() as sess:
    loader = tf.train.import_meta_graph('gtsd.meta')
    loader.restore(sess, tf.train.latest_checkpoint('./'))
    sess.run(tf.initialize_all_variables())
    test_accuracy = evaluate(X_test, y_test)
    print("Test Accuracy = {:.3f}".format(test_accuracy[0]))
I'm getting a test accuracy of only 3%. However, if I don't close the notebook and run the test code immediately after training the model, I get 95% accuracy.
This leads me to believe I'm not loading the model correctly.
The problem arises from these two lines:
loader.restore(sess, tf.train.latest_checkpoint('./'))
sess.run(tf.initialize_all_variables())
The first line loads the saved model from a checkpoint. The second line re-initializes all of the variables in the model (such as the weight matrices, convolutional filters, and bias vectors), usually to random numbers, and overwrites the loaded values.
The solution is simple: delete the second line (sess.run(tf.initialize_all_variables())) and evaluation will proceed with the trained values loaded from the checkpoint.
PS. There is a small chance that this change will give you an error about "uninitialized variables". In that case, you should execute sess.run(tf.initialize_all_variables()) to initialize any variables not saved in the checkpoint before executing loader.restore(sess, tf.train.latest_checkpoint('./')).
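The ordering issue can be illustrated without TensorFlow. The following is only a minimal, framework-free sketch: ToySession, initialize_all and restore are hypothetical stand-ins for the session, the initializer op and the checkpoint restore, not real TensorFlow APIs, and the checkpoint values are made up.

```python
import random


class ToySession:
    """Hypothetical stand-in for a session that holds variable values."""

    def __init__(self, names):
        self.names = list(names)
        self.values = {}  # name -> value; a missing key means "uninitialized"

    def initialize_all(self, seed=0):
        # Like an initializer op: assigns a fresh random value to EVERY
        # variable, including ones that already hold restored values.
        rng = random.Random(seed)
        for name in self.names:
            self.values[name] = rng.uniform(-1.0, 1.0)

    def restore(self, checkpoint):
        # Like restoring a checkpoint: overwrites only the variables it contains.
        for name, value in checkpoint.items():
            if name in self.names:
                self.values[name] = value


checkpoint = {"w": 0.75, "b": -0.25}  # made-up "trained" values

# Wrong order: restore, then initialize -> the trained values are lost.
wrong = ToySession(["w", "b"])
wrong.restore(checkpoint)
wrong.initialize_all()

# Right order: initialize first (only needed when some variables are not in
# the checkpoint, here "step"), then restore -> the trained values win.
right = ToySession(["w", "b", "step"])
right.initialize_all()
right.restore(checkpoint)

print(wrong.values == checkpoint)
print(right.values["w"], right.values["b"])
```

In the toy, whichever call runs last decides the final values, which is why the initializer has to come before the restore if it is needed at all.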
I had a similar problem and for me this worked:
with tf.Session() as sess:
    saver=tf.train.Saver(tf.all_variables())
    saver=tf.train.import_meta_graph('model.meta')
    saver.restore(sess,"model")
    test_accuracy = evaluate(X_test, y_test)
The answer found here is what ended up working, as follows:
save_path = saver.save(sess, '/home/ubuntu/gtsd-12-23-16.chkpt')
print("Model saved in file: %s" % save_path)
## later re-run code that creates the model
# Image Tensor
images_placeholder = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name='x')
gray = tf.image.rgb_to_grayscale(images_placeholder, name='gray')
gray /= 255.
# Label Tensor
labels_placeholder = tf.placeholder(tf.float32, shape=(None, 43), name='y')
# dropout Tensor
keep_prob = tf.placeholder(tf.float32, name='drop')
# construct model
logits = inference(gray, keep_prob)
# calculate loss
loss_value = loss(logits, labels_placeholder)
# training
train_op = training(loss_value, 0.001)
# accuracy
acc = accuracy(logits, labels_placeholder)
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, '/home/ubuntu/gtsd-12-23-16.chkpt')
    print("Model restored.")
    test_accuracy = evaluate(X_test, y_test)
    print("Test Accuracy = {:.3f}".format(test_accuracy[0]*100))
I don't know why this problem occurs. I have checked many times that I feed xs and ys through feed_dict. What is the reason for this problem, and how do I modify my code to fix the error? Below is the error log.
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_2' with dtype float and shape [?,10]
[[node Placeholder_2 (defined at /home/jiayu/dropout.py:41) = Placeholder[dtype=DT_FLOAT, shape=[?,10], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[{{node Mean_5/_55}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_271_Mean_5", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
This code runs on Ubuntu 16.04, TensorFlow 1.12.0, and Python 3.6.8.
from __future__ import print_function
import tensorflow as tf
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
# load data
digits = load_digits()
X = digits.data
y = digits.target
y = LabelBinarizer().fit_transform(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)
def add_layer(inputs, in_size, out_size, layer_name, activation_function=None, ):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, )
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    # here to dropout
    Wx_plus_b = tf.nn.dropout(Wx_plus_b, keep_prob)
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b, )
    tf.summary.histogram(layer_name + '/outputs', outputs)
    return outputs
# define placeholder for inputs to network
keep_prob = tf.placeholder(tf.float32)
xs = tf.placeholder(tf.float32, [None, 64]) # 8x8
ys = tf.placeholder(tf.float32, [None, 10])
# add output layer
l1 = add_layer(xs, 64, 50, 'l1', activation_function=tf.nn.tanh)
prediction = add_layer(l1, 50, 10, 'l2', activation_function=tf.nn.softmax)
# the loss between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),reduction_indices=[1])) # loss
tf.summary.scalar('loss', cross_entropy)
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
sess = tf.Session()
merged = tf.summary.merge_all()
# summary writer goes in here
train_writer = tf.summary.FileWriter("logs/train", sess.graph)
test_writer = tf.summary.FileWriter("logs/test", sess.graph)
# tf.initialize_all_variables() no longer valid from
# 2017-03-02 if using tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)
for i in range(500):
    # here to determine the keeping probability
    sess.run(train_step, feed_dict={xs: X_train, ys: y_train, keep_prob: 1})
    if i % 50 == 0:
        # record loss
        train_result = sess.run(merged, feed_dict={xs: X_train, ys: y_train, keep_prob: 1})
        test_result = sess.run(merged, feed_dict={xs: X_test, ys: y_test, keep_prob: 1})
        train_writer.add_summary(train_result, i)
        test_writer.add_summary(test_result, i)
The expected result is the loss scalar being displayed in TensorBoard.
You cannot run the script more than once in the same Python process (for example, in a notebook kernel), because each run adds the model to the same default graph again.
The first run works without any errors. But when you run it again, the computation graph is built a second time on top of the existing one. You can see this in TensorBoard: after several runs, the computation graph gets bigger and bigger, and when you try to evaluate the bigger graph, the extra placeholders from earlier runs don't get any data fed to them, so they raise the error.
Here is the simple solution: call tf.reset_default_graph() before the place where you create the graph.
tf.reset_default_graph()
# define placeholder for inputs to network
keep_prob = tf.placeholder(tf.float32, name='prob')
xs = tf.placeholder(tf.float32, [None, 64], name='x_input') # 8x8
ys = tf.placeholder(tf.float32, [None, 10], name='y_input')
...
Some further reading: Remove nodes from graph or reset entire default graph
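To make the mechanism concrete, here is a small framework-free sketch of a shared default graph that accumulates nodes across runs. ToyGraph, build_model and run_merged are hypothetical names invented for this illustration; they only mimic the idea that a merged summary depends on every summary (and therefore every placeholder) still present in the graph, and that tf.reset_default_graph() clears them.

```python
class ToyGraph:
    """Hypothetical stand-in for a global default graph (not TensorFlow)."""

    def __init__(self):
        self.placeholders = []
        self.summaries = []  # each summary records the placeholder it needs

    def reset(self):
        # Plays the role of tf.reset_default_graph(): drop all existing nodes.
        self.placeholders.clear()
        self.summaries.clear()


DEFAULT_GRAPH = ToyGraph()


def build_model(graph):
    # Each call adds a NEW placeholder with a uniquified name, like
    # Placeholder, Placeholder_1, Placeholder_2, ...
    index = len(graph.placeholders)
    name = "Placeholder" if index == 0 else "Placeholder_%d" % index
    graph.placeholders.append(name)
    graph.summaries.append({"needs": name})
    return name


def run_merged(graph, feed_dict):
    # A merged summary depends on EVERY summary in the graph, so every
    # placeholder those summaries need must be fed. Return the unfed ones.
    return [s["needs"] for s in graph.summaries if s["needs"] not in feed_dict]


# First run of the "script": everything is fed.
ys = build_model(DEFAULT_GRAPH)
missing_first = run_merged(DEFAULT_GRAPH, {ys: [0, 1]})

# Second run in the same process: the old placeholder is still in the graph.
ys = build_model(DEFAULT_GRAPH)
missing_second = run_merged(DEFAULT_GRAPH, {ys: [0, 1]})

# Resetting before rebuilding removes the stale nodes.
DEFAULT_GRAPH.reset()
ys = build_model(DEFAULT_GRAPH)
missing_after_reset = run_merged(DEFAULT_GRAPH, {ys: [0, 1]})

print(missing_first, missing_second, missing_after_reset)
```

Only the second run reports an unfed placeholder, and it is the one left over from the first run, which matches the shape of the error in the question.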
I have the tf.event files present in the folder and I ran the command to view them, yet I am not able to see the graph.
Please find the code attached; the part related to the graph is shown below.
I am using TensorFlow 1.8. Upgrading caused a lot of issues, so I am staying on the lower version.
#Initialize the FileWriter
with tf.Session() as sess:
    writer = tf.summary.FileWriter("./Training_FileWriter/", sess.graph)
    writer1 = tf.summary.FileWriter("./Validation_FileWriter/", sess.graph)
    #Add the cost and accuracy to summary
    tf.summary.scalar('loss', tf.squeeze(cross_entropy))
    tf.summary.scalar('accuracy', tf.squeeze(accuracy))
    #Merge all summaries together
    merged_summary = tf.summary.merge_all()
    #
    #
    #After executing loss, optimizer, accuracy
    summ = sess.run(merged_summary, feed_dict=feed_dict_train)
    writer.add_summary(summ, epoch*int(len(trainLabels)/batch_size) + batch)
Would it help to have a full-fledged example like this? I am able to view the graphs with it.
tensorboard --logdir=D:\Development_Avecto\TensorFlow\logs\1\train
TensorBoard 1.9.0 at http://LT032871:6006 (Press CTRL+C to quit)
import tensorflow as tf
# reset everything to rerun in jupyter
tf.reset_default_graph()
# config
batch_size = 100
learning_rate = 0.5
training_epochs = 5
logs_path = "D:/Development_Avecto/TensorFlow/logs/1/train"
# load mnist data set
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# input images
with tf.name_scope('input'):
    # None -> batch size can be any size, 784 -> flattened mnist image
    x = tf.placeholder(tf.float32, shape=[None, 784], name="x-input")
    # target 10 output classes
    y_ = tf.placeholder(tf.float32, shape=[None, 10], name="y-input")
# model parameters will change during training so we use tf.Variable
with tf.name_scope("weights"):
    W = tf.Variable(tf.zeros([784, 10]))
# bias
with tf.name_scope("biases"):
    b = tf.Variable(tf.zeros([10]))
# implement model
with tf.name_scope("softmax"):
    # y is our prediction
    y = tf.nn.softmax(tf.matmul(x, W) + b)
# specify cost function
with tf.name_scope('cross_entropy'):
    # this is our cost
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
# specify optimizer
with tf.name_scope('train'):
    # optimizer is an "operation" which we can execute in a session
    train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)
with tf.name_scope('Accuracy'):
    # Accuracy
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# create a summary for our cost and accuracy
tf.summary.scalar("cost", cross_entropy)
tf.summary.scalar("accuracy", accuracy)
# merge all summaries into a single "operation" which we can execute in a session
summary_op = tf.summary.merge_all()
with tf.Session() as sess:
    # variables need to be initialized before we can use them
    sess.run(tf.global_variables_initializer())
    # create log writer object
    writer = tf.summary.FileWriter(logs_path, graph=tf.get_default_graph())
    # perform training cycles
    for epoch in range(training_epochs):
        # number of batches in one epoch
        batch_count = int(mnist.train.num_examples / batch_size)
        for i in range(batch_count):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # perform the operations we defined earlier on batch
            _, summary = sess.run([train_op, summary_op], feed_dict={x: batch_x, y_: batch_y})
            # write log
            writer.add_summary(summary, epoch * batch_count + i)
        if epoch % 5 == 0:
            print("Epoch: ", epoch)
    print("Accuracy: ", accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
    print("done")
I have the following classification model.
I would like to get a NumPy array of predictions in the same format as y_t, which holds the one-hot encoded test labels. However, I keep getting a variable error.
# Construct placeholders
with graph.as_default():
    inputs_ = tf.placeholder(tf.float32, [None, seq_len, n_channels], name = 'inputs')
    labels_ = tf.placeholder(tf.float32, [None, n_classes], name = 'labels')
    keep_prob_ = tf.placeholder(tf.float32, name = 'keep')
    learning_rate_ = tf.placeholder(tf.float32, name = 'learning_rate')
with graph.as_default():
    # (batch, 100, 3) --> (batch, 50, 6)
    conv1 = tf.layers.conv1d(inputs=inputs_, filters=6, kernel_size=2, strides=1,
                             padding='same', activation = tf.nn.relu)
    max_pool_1 = tf.layers.max_pooling1d(inputs=conv1, pool_size=2, strides=2, padding='same')
with graph.as_default():
    # Flatten and add dropout
    flat = tf.reshape(max_pool_1, (-1, 6*6))
    flat = tf.nn.dropout(flat, keep_prob=keep_prob_)
    # Predictions
    logits = tf.layers.dense(flat, n_classes)
    # Cost function and optimizer
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels_))
    optimizer = tf.train.AdamOptimizer(learning_rate_).minimize(cost)
    # Accuracy
    correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(labels_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32), name='accuracy')
Then I use the test set
with tf.Session(graph=graph) as sess:
    # Restore
    saver.restore(sess, tf.train.latest_checkpoint('bschkpnt-cnn'))
    for x_t, y_t in get_batches(X_test, y_test, batch_size):
        feed = {inputs_: x_t,
                labels_: y_t,
                keep_prob_: 1}
        batch_acc = sess.run(accuracy, feed_dict=feed)
        test_acc.append(batch_acc)
    print("Test accuracy: {:.6f}".format(np.mean(test_acc)))
y_t is an n×3 NumPy array.
I want to get a y_pred in a similar format.
Thanks
soft = tf.nn.softmax(logits)
This will be your probability distribution: each row of soft sums to 1, and every value in it indicates how sure the model is about the corresponding class.
pred = sess.run(soft, feed_dict=feed)
print(pred)
So basically all I do is add an extra softmax op. Since softmax is built into the loss you calculate (softmax_cross_entropy_with_logits), logits itself is not normalized, so you have to apply softmax again to predict. Then I ask for the output prediction and just pass the same feed_dict again.
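To go from those probabilities to a one-hot y_pred shaped like y_t, take the argmax of each row and encode it. Below is a minimal plain-Python sketch of that post-processing step; the logits values are made up, and softmax and to_one_hot are hypothetical helpers written for this illustration. Inside the graph, the equivalent would presumably be something like tf.one_hot(tf.argmax(logits, 1), n_classes).

```python
import math


def softmax(row):
    # Subtract the max for numerical stability; the result sums to 1.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def to_one_hot(probs):
    # Put a 1 at the most probable class and 0 everywhere else.
    best = probs.index(max(probs))
    return [1 if i == best else 0 for i in range(len(probs))]


# Made-up logits for two examples and three classes.
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
pred = [softmax(row) for row in logits]      # probabilities, one row per example
y_pred = [to_one_hot(p) for p in pred]       # same layout as one-hot labels
print(y_pred)
```

Since softmax does not change which entry is largest, the argmax could also be taken directly on the logits; the softmax is only needed when the probabilities themselves are of interest.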
Hope this helped!
I am new to TensorFlow and I hope you can help me.
I have built a TensorFlow CNN and trained it successfully. The training datasets are MATLAB arrays. Now I would like to use the trained network to run inference, but I am not sure how to write the Python code for it.
During training, I saved the model. I am not sure how to load the model for inference.
My inference data is also a MATLAB array, the same as the training data. How can I use it? During training, I used minibatches from TensorLayer (tl.iterate.minibatches); should I use them in inference too?
Below is my inference code, which gave a lot of errors:
print("\n\nPreparing testing data........................")
test_data = sio.loadmat('MyTest.mat')
Z0 = test_data['Real_testing1']
img_num_test = Z0.shape[0]
X_test = np.empty([img_num_test, 128, 128, 1], dtype=float)
X_test[:,:,:,0] = Z0
Y_test = np.column_stack((np.ones([img_num_test, 1], dtype=int),np.zeros([img_num_test, 1], dtype=int)))
print("\tTesting X shape: {0}".format(X_test.shape))
print("\tTesting Y shape: {0}".format(Y_test.shape))
print("\n\nRestore the network ...")
save_dir = "checkpoints/"
epoch = 1000
model_name = save_dir + str(epoch) + '_model'
if not os.path.exists(save_dir):
    os.makedirs(save_dir)
saver = tf.train.Saver().restore(sess, save_path=model_name)
start_time_begin = time.time()
print("\n\nRunning network...")
start_time = time.time()
y = model.Scribenet(X_test[0, :, :, :], False, 1.0)
y = sess.run([y], feed_dict=feed_dict)
print(y[0:9])
sess.close()
Below is my training code:
x = tf.placeholder(tf.float32, shape=[None, 128, 128, 1], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, 2], name='y_')
keep_prob = tf.placeholder(tf.float32, name='keep_prob')
is_training = tf.placeholder(tf.bool, name='is_traininng')
net_in = x
net_out = model.MyCNN(net_in, is_training, keep_prob)
y = net_out
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_, name='cost'))
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
y_op = tf.argmax(tf.nn.softmax(y),1)
train_op = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999,
                                  epsilon=1e-08, use_locking=False).minimize(cost)
sess.run(tf.global_variables_initializer())
save_dir = "checkpoints/"
if not os.path.exists(save_dir):
    os.makedirs(save_dir)
saver = tf.train.Saver()
print("\n\nStart training the network ...")
start_time_begin = time.time()
for epoch in range(n_epoch):
    start_time = time.time()
    loss_ep = 0; n_step = 0
    for X_train_a, y_train_a in tl.iterate.minibatches(X_train, Y_train,
                                                       batch_size, shuffle=True):
        feed_dict = {x: X_train_a, y_: y_train_a, is_training: True, keep_prob: train_keep_prob}
        loss, _ = sess.run([cost, train_op], feed_dict=feed_dict)
        loss_ep += loss
        n_step += 1
    loss_ep = loss_ep/ n_step
    if (epoch+1) % save_freq == 0:
        model_name = save_dir + str(epoch+1) + '_model'
        saver.save(sess, save_path=model_name)
The main issue seems to be that there's no graph building in your inference code. You either need to save the whole graph (in SavedModel format), or build a graph in your inference code and load your variables via a training checkpoint (probably the easiest to start). As long as the variable names are the same, you can load variables saved from the training graph into the inference graph.
So inference will be your training code but without the y_ placeholder and without the loss/optimizer logic. You can feed a single image (batch size 1) to start, so no need for batching logic either.
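The name-matching idea can be sketched without TensorFlow. In the toy below, the checkpoint contents, the variable names and both helper functions are hypothetical and only illustrate the principle: the inference graph declares a subset of the training graph's variables under the same names, and the restore step fills them by name while ignoring training-only entries.

```python
# Made-up checkpoint written by the training graph: variable name -> value.
# It also contains optimizer state that inference never needs.
checkpoint = {
    "conv1/kernel": [0.1, 0.2],
    "conv1/bias": [0.0],
    "dense/kernel": [0.3],
    "beta1_power": 0.9,
}


def build_inference_variables():
    # Hypothetical inference graph: same model variables, same names,
    # but no labels placeholder, loss, or optimizer variables.
    return {"conv1/kernel": None, "conv1/bias": None, "dense/kernel": None}


def restore_by_name(variables, checkpoint):
    # Fill each graph variable from the checkpoint entry with the same name.
    missing = [name for name in variables if name not in checkpoint]
    if missing:
        raise KeyError("not found in checkpoint: %s" % missing)
    return {name: checkpoint[name] for name in variables}


restored = restore_by_name(build_inference_variables(), checkpoint)
print(sorted(restored))
```

If the inference code built its layers under different names, the lookup would fail, which is why reusing the same model-building function for training and inference is the simplest way to keep the names aligned.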
I ran into a problem with properly restoring a saved model in TensorFlow. I created a bidirectional RNN model with the following code:
batchX_placeholder = tf.placeholder(tf.float32, [None, timesteps, 1],
                                    name="batchX_placeholder")
batchY_placeholder = tf.placeholder(tf.float32, [None, num_classes],
                                    name="batchY_placeholder")
weights = tf.Variable(np.random.rand(2*STATE_SIZE, num_classes),
                      dtype=tf.float32, name="weights")
biases = tf.Variable(np.zeros((1, num_classes)), dtype=tf.float32,
                     name="biases")
logits = BiRNN(batchX_placeholder, weights, biases)
with tf.name_scope("prediction"):
    prediction = tf.nn.softmax(logits)
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=logits, labels=batchY_placeholder))
lr = tf.Variable(learning_rate, trainable=False, dtype=tf.float32,
                 name='lr')
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
train_op = optimizer.minimize(loss_op)
init_op = tf.initialize_all_variables()
saver = tf.train.Saver()
The architecture of BiRNN created with the following function:
def BiRNN(x, weights, biases):
    # Unstack to get a list of 'time_steps' tensors of shape (batch_size,
    # num_input)
    x = tf.unstack(x, time_steps, 1)
    # Forward and Backward direction cells
    lstm_fw_cell = rnn.BasicLSTMCell(STATE_SIZE, forget_bias=1.0)
    lstm_bw_cell = rnn.BasicLSTMCell(STATE_SIZE, forget_bias=1.0)
    outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell,
                                                 lstm_bw_cell, x, dtype=tf.float32)
    # Linear activation, using rnn inner loop last output
    return tf.matmul(outputs[-1], weights) + biases
Then I train the model and save it every 200 steps:
with tf.Session() as sess:
    sess.run(init_op)
    current_step = 0
    for batch_x, batch_y in get_minibatch():
        sess.run(train_op, feed_dict={batchX_placeholder: batch_x,
                                      batchY_placeholder: batch_y})
        current_step += 1
        if current_step % 200 == 0:
            saver.save(sess, os.path.join(model_dir, "model"))
To run the saved model in inference mode, I use the saved TensorFlow graph in the "model.meta" file:
graph = tf.get_default_graph()
saver = tf.train.import_meta_graph(os.path.join(model_dir, "model.meta"))
sess = tf.Session()
saver.restore(sess, tf.train.latest_checkpoint(model_dir))
weights = graph.get_tensor_by_name("weights:0")
biases = graph.get_tensor_by_name("biases:0")
batchX_placeholder = graph.get_tensor_by_name("batchX_placeholder:0")
batchY_placeholder = graph.get_tensor_by_name("batchY_placeholder:0")
logits = BiRNN(batchX_placeholder, weights, biases)
prediction = graph.get_operation_by_name("prediction/Softmax")
argmax_pred = tf.argmax(prediction, 1)
init = tf.global_variables_initializer()
sess.run(init)
for x_seq, y_gt in get_sequence():
    _, y_pred = sess.run([prediction, argmax_pred],
                         feed_dict={batchX_placeholder: [x_seq],
                                    batchY_placeholder: [[0.0, 0.0]]})
    print("Y ground true: " + str(y_gt) + ", Y pred: " + str(y_pred[0]))
When I run the code in inference mode, I get different results each time I launch it. It seems that the output neurons of the softmax layer are randomly bundled with different output classes.
So, my question is: how can I save and then correctly restore the model in TensorFlow, so that all neurons are properly bundled with their corresponding output classes?
There is no need to call tf.global_variables_initializer(); I think that is your problem.
I removed some operations (logits, weights and biases) since you don't need to recreate them. They are all already loaded; use graph.get_tensor_by_name to get them.
For the prediction, get the tensor instead of the operation. (see this answer):
This is the code:
graph = tf.get_default_graph()
saver = tf.train.import_meta_graph(os.path.join(model_dir, "model.meta"))
sess = tf.Session()
saver.restore(sess, tf.train.latest_checkpoint(model_dir))
batchX_placeholder = graph.get_tensor_by_name("batchX_placeholder:0")
batchY_placeholder = graph.get_tensor_by_name("batchY_placeholder:0")
prediction = graph.get_tensor_by_name("prediction/Softmax:0")
argmax_pred = tf.argmax(prediction, 1)
Edit 1: I notice that I wasn't clear on why you got different results:
"And when I run the code in inference mode, I get different results each time I launch it."
Notice that although you used the weights from the loaded model, you are creating the BiRNN again, and the BasicLSTMCell cells also have weights and other variables that are not set from your loaded model. Hence they need to be initialized (with new random values), which results in an untrained model again.