I'm new to neural networks.
I have created a simple network following this tutorial. It is trained to classify text into 3 categories:
sport, graphics and space
https://medium.freecodecamp.org/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(len(newsgroups_train.data)/batch_size)
        print("total_batch",total_batch)
        # Loop over all batches
        for i in range(total_batch):
            batch_x,batch_y = get_batch(newsgroups_train,i,batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            c,cc = sess.run([loss,optimizer], feed_dict={input_tensor: batch_x,output_tensor:batch_y})
            print("C = ", c)
            print("Cc = ", cc)
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("inpt ten =", batch_y)
            print("Epoch:", '%04d' % (epoch+1), "loss=", \
                "{:.9f}".format(avg_cost))
How can I feed my own text to this model after training and get the predicted category?
Thanks
Like janu777 said, we can save and load models for reuse. We first create a Saver object and then save the session (after the model is trained):
saver = tf.train.Saver()
# ... train the model ...
save_path = saver.save(sess, "/tmp/model.ckpt")
In the example model the last "step" in the model architecture (i.e. the last thing done inside the multilayer_perceptron method) is:
'out': tf.Variable(tf.random_normal([n_classes]))
So to get a prediction we get the index of the maximum value of this array (the predicted class):
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    classification = sess.run(tf.argmax(prediction, 1), feed_dict={input_tensor: input_array})
    print("Predicted category:", classification)
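The argmax step above returns a class index rather than a name. As a minimal sketch without TensorFlow, this is how that index could be mapped back to a category. The category order and the example scores are assumptions made for illustration; the real order depends on how the labels were encoded during training.

```python
# Hypothetical class order; the real order depends on how the labels were encoded.
CATEGORIES = ["graphics", "space", "sport"]


def predict_category(scores, categories=CATEGORIES):
    """Return (index, name) of the largest score, i.e. a plain-Python argmax."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index, categories[best_index]


# Made-up output-layer scores for one document, one score per class.
example_scores = [0.3, 2.1, -0.7]
index, name = predict_category(example_scores)
print("Predicted category:", index, name)
```

With the restored model, the scores would come from running the prediction tensor on your own vectorized text instead of the hard-coded list.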
You can check the whole code here: https://github.com/dmesquita/understanding_tensorflow_nn
TensorFlow has an option to save and load models for reuse.
You can save your trained model by adding this:
model_saver = tf.train.Saver()
#Training cycle
#your code to train
model_saver.save(sess,MODEL_SAVE_PATH)
Once your model is saved you can restore it again and test it like this:
model_saver.restore(sess, MODEL_SAVE_PATH)
c = sess.run(loss, feed_dict={input_tensor: batch_x,output_tensor:batch_y})
Here batch_x and batch_y represent your test data. Only loss is evaluated; running optimizer as well would keep training the model on the test data.
Check this for more details on saving and restoring models.
Hope you find this helpful.
TensorBoard shows a separate graph for training and validation accuracy at each logged step, and I want it to show the changes in both accuracies on a single graph.
def accuracy(predictions, labels):
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
            / predictions.shape[0])
num_steps = 20000
with tf.Session(graph = graph) as session:
    tf.global_variables_initializer().run()
    print(loss.eval())
    summary_op = tf.summary.merge_all()
    summaries_dir = '/loggg/'
    train_writer = tf.summary.FileWriter(summaries_dir, graph)
    for step in range(num_steps):
        _,l, predictions = session.run([optimizer, loss, predict_train])
        if (step % 2000 == 0):
            #print(predictions[3:6])
            print('Loss at step %d: %f' % (step, l))
            training = accuracy( predictions, y_train[:, :])
            validation = accuracy(predict_valid.eval(), y_test)
            print('Training accuracy: %.1f%%' % training)
            print('Validation accuracy: %.1f%%' % validation)
            accuracy_summary = tf.summary.scalar("Training_Accuracy", training)
            validation_summary = tf.summary.scalar("Validation_Accuracy", validation)
            Result = session.run(summary_op)
            train_writer.add_summary(Result, step)
    train_writer.close()
Result: (image of TensorBoard showing multiple training and validation accuracy plots on different graphs)
Each call to tf.summary.scalar() defines a new op in the graph. Because you call it inside your training loop, every call creates a new summary op with a different suffix (_1, _2, etc.), and this results in many different plots in TensorBoard.
If you're just getting started, I'd recommend trying out the Keras API or using eager execution, which both make it easier to avoid this problem.
If you need to use the graph + session model explicitly, then the entire graph should be constructed up-front, including the accuracy computation (converted to TensorFlow ops, not numpy), the tf.summary.scalar() calls to record the accuracies, and finally the tf.summary.merge_all() op. Then within the training loop you would only do sess.run(), writer.add_summary(), and writer.flush().
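To make the naming behaviour concrete, here is a toy model in plain Python. It is not TensorFlow itself: ToyGraph and add_scalar_summary are hypothetical names that only imitate how a graph gives each new op a unique name.

```python
class ToyGraph:
    """Toy stand-in for a graph that must give every op a unique name."""

    def __init__(self):
        self._name_counts = {}
        self.op_names = []

    def add_scalar_summary(self, name):
        # The first use keeps the plain name; later uses get _1, _2, ...
        count = self._name_counts.get(name, 0)
        self._name_counts[name] = count + 1
        unique = name if count == 0 else "%s_%d" % (name, count)
        self.op_names.append(unique)
        return unique


# Defining the summary inside the loop: one new op (and one new plot) per call.
in_loop = ToyGraph()
for step in range(3):
    in_loop.add_scalar_summary("Training_Accuracy")

# Defining it once up front: a single op that would be re-run with new values.
up_front = ToyGraph()
up_front.add_scalar_summary("Training_Accuracy")
for step in range(3):
    pass  # here one would only run the existing op and write its result

print(in_loop.op_names)
print(up_front.op_names)
```

The first graph ends up with three differently named summary ops, while the second keeps a single one, which is what puts all points on one plot.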
I don't understand your code completely, but I do it like this:
...
correct_predict=tf.equal(tf.argmax(logits,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_predict,tf.float32))
tf.summary.scalar("acc", accuracy)
...
write_op = tf.summary.merge_all()
...
with tf.Session() as sess:
    writer = tf.summary.FileWriter("graph/", sess.graph)
    ...
    if step%10==0:
        summ=sess.run(write_op,feed_dict={x:x_test,y:y_test})
        writer.add_summary(summ,step)
        writer.flush()
I am new to TensorFlow and I am trying to build an image classifier. I have successfully created the model and I am trying to predict a single image after restoring the model. I have gone through various tutorials (https://github.com/sankit1/cv-tricks.com/blob/master/Tensorflow-tutorials/tutorial-2-image-classifier/predict.py) but I can't figure out how to use feed_dict in my code. I am stuck at the predict function after loading the saved model. Can someone please tell me what to do after loading all the variables from the saved model?
This is the train function, which returns the parameters and saves them in a model.
def trainModel(train, test, learning_rate=0.0001, num_epochs=2, minibatch_size=32, graph_filename='costs'):
    """
    Implements a three-layer tensorflow neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SOFTMAX.
    Input:
    train : training set
    test : test set
    learning_rate : learning rate
    num_epochs : number of epochs
    minibatch_size : size of minibatch
    print_cost : True to print the cost every epoch
    Returns:
    parameters : parameters learnt by the model
    """
    ops.reset_default_graph() #for rerunning the model without resetting tf vars
    # input and output shapes
    (n_x, m) = train.images.T.shape
    n_y = train.labels.T.shape[0]
    costs = [] #var for storing the costs for later use
    # create placeholders
    X, Y = placeholderCreator(n_x, n_y)
    parameters = paramInitializer()
    # Forward propagation
    Z3 = forwardPropagation(X, parameters)
    # Cost function
    cost = costCalc(Z3, Y)
    #Backpropagation using adam optimizer
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
    # Initialize tf variables
    init = tf.global_variables_initializer()
    minibatch_size = 32
    # Start session to compute Tensorflow graph
    with tf.Session() as sess:
        # Run initialization
        sess.run(init)
        for epoch in range(num_epochs): # Training loop
            epoch_cost = 0.
            num_minibatches = int(m / minibatch_size)
            for i in range(num_minibatches):
                minibatch_X, minibatch_Y = train.next_batch(minibatch_size) # Get next batch of training data and labels
                _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X.T, Y: minibatch_Y.T}) # Execute optimizer and cost function
                epoch_cost += minibatch_cost / num_minibatches # Update epoch cost
        saver = tf.train.Saver()
        # Save parameters
        parameters = sess.run(parameters)
        saver.save(sess, "~/trained-model.ckpt")
    return parameters
And this is my predict function, where I am trying to predict an image. I have converted that image into MNIST format for ease of use (predicting_data). I load the saved model and apply a softmax to the output of the third layer (the final output).
def predict():
    train = predicting_data.train
    (n_x, m) = train.images.T.shape
    n_y = train.labels.T.shape[0]
    X, Y = placeholderCreator(n_x, n_y)
    with tf.Session() as sess:
        new_saver = tf.train.import_meta_graph('~/trained-model.ckpt.meta')
        new_saver.restore(sess, '~/trained-model.ckpt')
        W1 = tf.get_default_graph().get_tensor_by_name('W1:0')
        b1 = tf.get_default_graph().get_tensor_by_name('b1:0')
        W2 = tf.get_default_graph().get_tensor_by_name('W2:0')
        b2 = tf.get_default_graph().get_tensor_by_name('b2:0')
        W3 = tf.get_default_graph().get_tensor_by_name('W3:0')
        b3 = tf.get_default_graph().get_tensor_by_name('b3:0')
        # forward propagation
        Z1 = tf.add(tf.matmul(W1,X), b1)
        A1 = tf.nn.relu(Z1)
        Z2 = tf.add(tf.matmul(W2,A1), b2)
        A2 = tf.nn.relu(Z2)
        Z3 = tf.add(tf.matmul(W3,A2), b3)
        y_pred = tf.nn.softmax(Z3) ####what to do after this????
        cost = sess.run(y_pred, feed_dict={X: train.images.T})
Thank you in advance!
As vijay says in his comment:
Your predict part is not right, you need to get the input and predict tensors from the saved graph using the get_tensor_by_name() function and then use it in your sess.run
If you look at this post, it covers a similar problem and has some code examples.
In your code, you can pass 1 to the next_batch method and get just one image.
minibatch_X, minibatch_Y = train.next_batch(1)
I built a model in Tensorflow and I'm trying to convert it into a TensorFlow Estimator. Here is an example of what I have:
train_op = tf.train.AdamOptimizer(learning_rate=lr).minimize(cost)
saver = tf.train.Saver()
init = tf.global_variables_initializer()
assign_Wvh = pretrained_rsm.temporal_assignment(params['W'])
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(epochs):
        start = time.time()
        _ = sess.run(train_op, feed_dict={x: input})
        print("%i. elapsed time: %0.2f" % (epoch, time.time() - start))
    # before saving the weights do an operation to change the weights
    # only need to perform it once at the end to avoid unnecessary operations
    # that are time consuming at each iteration
    _ = sess.run(assign_Wvh)
    # save the weights
    save_path = saver.save(sess, os.path.join(weights_path, 'init.ckpt'))
I was thinking of adding this line to my model_fn (estimator):
if tf.train.get_global_step() == 1000: # 1000 is my specific epoch
    do operation
But obviously I can't do that with an estimator.
Does anyone know how to achieve this, given that I still need to save the weights after they have been transformed by this last operation?
I am trying to use TensorBoard to visualize my training procedure. After every epoch, I would like to test the network's accuracy on the whole validation dataset and store the result in a summary file so that I can visualize it in TensorBoard.
I know TensorFlow has summary_op for this, but it seems to work only for one batch when running sess.run(summary_op). I need to calculate the accuracy for the whole dataset. How?
Is there an example of how to do this?
Define a tf.scalar_summary that accepts a placeholder:
accuracy_value_ = tf.placeholder(tf.float32, shape=())
accuracy_summary = tf.scalar_summary('accuracy', accuracy_value_)
Then calculate the accuracy for the whole dataset (define a routine that calculates the accuracy for every batch in the dataset and extract the mean value) and save it into a python variable, let's call it va.
Once you have the value of va, just run the accuracy_summary op, feeding the accuracy_value_ placeholder:
sess.run(accuracy_summary, feed_dict={accuracy_value_: va})
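The routine that produces va could look roughly like the sketch below. It is plain Python with made-up prediction and label lists standing in for the per-batch outputs of sess.run, and the helper names batches and dataset_accuracy are hypothetical. It counts correct predictions across batches instead of averaging per-batch accuracies, so a smaller final batch does not skew the result.

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def dataset_accuracy(predicted, actual, batch_size):
    """Accuracy over the whole dataset, accumulated batch by batch."""
    correct = 0
    total = 0
    for pred_batch, true_batch in zip(batches(predicted, batch_size),
                                      batches(actual, batch_size)):
        correct += sum(1 for p, t in zip(pred_batch, true_batch) if p == t)
        total += len(true_batch)
    return correct / total


# Made-up class indices; in practice each batch of predictions comes from the model.
predicted = [0, 1, 2, 2, 1, 0, 1]
actual = [0, 1, 1, 2, 1, 0, 0]
va = dataset_accuracy(predicted, actual, batch_size=3)
print(va)
```

The resulting Python float is what gets fed to the accuracy_value_ placeholder.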
I implemented a naive one-layer model as an example that classifies the MNIST dataset and visualizes validation accuracy in TensorBoard. It works for me.
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets
import os
# number of epoch
num_epoch = 1000
model_dir = '/tmp/tf/onelayer_model/accu_info'
# mnist dataset location, change if you need
data_dir = '../data/mnist'
# load MNIST dataset without one hot
dataset = read_data_sets(data_dir, one_hot=False)
# Create placeholder for input images X and labels y
X = tf.placeholder(tf.float32, [None, 784])
# one_hot = False
y = tf.placeholder(tf.int32)
# One layer model graph
W = tf.Variable(tf.truncated_normal([784, 10], stddev=0.1))
b = tf.Variable(tf.constant(0.1, shape=[10]))
logits = tf.nn.relu(tf.matmul(X, W) + b)
init = tf.initialize_all_variables()
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, y)
# loss function
loss = tf.reduce_mean(cross_entropy)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
_, top_1_op = tf.nn.top_k(logits)
top_1 = tf.reshape(top_1_op, shape=[-1])
correct_classification = tf.cast(tf.equal(top_1, y), tf.float32)
# accuracy function
acc = tf.reduce_mean(correct_classification)
# define info that is used in SummaryWritter
acc_summary = tf.scalar_summary('valid_accuracy', acc)
valid_summary_op = tf.merge_summary([acc_summary])
with tf.Session() as sess:
    # initialize all the variable
    sess.run(init)
    print("Writing Summaries to %s" % model_dir)
    train_summary_writer = tf.train.SummaryWriter(model_dir, sess.graph)
    # load validation dataset
    valid_x = dataset.validation.images
    valid_y = dataset.validation.labels
    for epoch in xrange(num_epoch):
        batch_x, batch_y = dataset.train.next_batch(100)
        feed_dict = {X: batch_x, y: batch_y}
        _, acc_value, loss_value = sess.run(
            [train_op, acc, loss], feed_dict=feed_dict)
        vsummary = sess.run(valid_summary_op,
                            feed_dict={X: valid_x,
                                       y: valid_y})
        # Write validation accuracy summary
        train_summary_writer.add_summary(vsummary, epoch)
Batching the validation set is possible if you use tf.metrics ops, which keep internal counters. Here is a simplified example:
model = create_model()
tf.summary.scalar('cost', model.cost_op)
acc_value_op, acc_update_op = tf.metrics.accuracy(labels,predictions)
summary_common = tf.summary.merge_all()
summary_valid = tf.summary.merge([
    tf.summary.scalar('accuracy', acc_value_op),
    # other metrics here...
])
with tf.Session() as sess:
    train_writer = tf.summary.FileWriter(logs_path + '/train',
                                         sess.graph)
    valid_writer = tf.summary.FileWriter(logs_path + '/valid')
While training, only write the common summary using your train-writer:
summary = sess.run(summary_common)
train_writer.add_summary(summary, tf.train.global_step(sess, gstep_op))
train_writer.flush()
After every validation, write both summaries using the valid-writer:
gstep, summaryc, summaryv = sess.run([gstep_op, summary_common, summary_valid])
valid_writer.add_summary(summaryc, gstep)
valid_writer.add_summary(summaryv, gstep)
valid_writer.flush()
When using tf.metrics, don't forget to reset the internal counters (local variables) before every validation step.
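Why the reset matters can be seen in a small plain-Python imitation of a streaming metric. StreamingAccuracy is a hypothetical class written for this illustration, not a TensorFlow API; in TensorFlow 1.x the counters are typically reset by running tf.local_variables_initializer(), which reinitializes all local variables.

```python
class StreamingAccuracy:
    """Toy streaming metric that keeps running counters across batches."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Plays the role of re-initializing the metric's local variables.
        self.correct = 0
        self.total = 0

    def update(self, predictions, labels):
        self.correct += sum(1 for p, t in zip(predictions, labels) if p == t)
        self.total += len(labels)

    def value(self):
        return self.correct / self.total if self.total else 0.0


metric = StreamingAccuracy()

# First validation pass, two batches: 3 of 4 predictions are correct.
metric.update([1, 0], [1, 1])
metric.update([2, 2], [2, 2])
first_pass = metric.value()

# Second pass WITHOUT a reset: the old counts leak into the new result.
metric.update([0, 0], [1, 1])
stale = metric.value()

# Second pass WITH a reset: only the new batch is counted.
metric.reset()
metric.update([0, 0], [1, 1])
fresh = metric.value()

print(first_pass, stale, fresh)
```

Without the reset, the second pass reports a blend of both passes rather than the accuracy of the current model.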
I am trying to build a TensorFlow example of a simple multi-layer perceptron (MLP) with one hidden layer. However, when I tested it and compared it to other software, e.g. Kaldi nnet1, the convergence during training was much worse than with Kaldi nnet1. I tried my best to make all the parameters the same (input, int target, batch size, learning rate, etc.), but I am still confused about what the reason could be. Some pieces of the code are as follows:
Initialization:
self.weight = [tf.Variable(tf.truncated_normal([440, 8192],stddev=0.1))]
self.bias = [tf.Variable(tf.constant(0.01, shape=8192))]
self.weight.append( tf.Variable(tf.truncated_normal([8192, 8],stddev=0.1)) )
self.bias.append( tf.Variable(tf.constant(0.01, shape=8)) )
self.act = [tf.nn.sigmoid( tf.matmul(self.input, self.weight[0]) + self.bias[0] )]
self.nn_out = tf.matmul(self.act, self.weight[1]) + self.bias[1])
self.nn_softmax = tf.nn.softmax(self.nn_out)
self.nn_tgt = tf.placeholder("int64", shape=[None,])
self.cost_mean = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(self.nn_out, self.nn_tgt))
self.train_step = tf.train.GradientDescentOptimizer(self.learn_rate).minimize(self.cost_mean)
# saver
self.saver = tf.train.Saver()
self.sess = tf.Session()
self.sess.run(tf.initialize_all_variables())
Training:
for epoch in xrange(20):
    feats_tr, tgts_tr = shuffle(feats_tr, tgts_tr, random_state=777)
    # restore the existing model
    ckpt = tf.train.get_checkpoint_state(ckpt_dir)
    if ckpt and ckpt.model_checkpoint_path:
        self.load(ckpt.model_checkpoint_path)
    # mini-batch
    tr_loss = []
    for idx_begin in range(0,len(feats_tr), 512):
        idx_end = idx_begin + batch_size
        batch_feats, batch_tgts = feats_tr[idx_begin:idx_end],tgts_tr[idx_begin:idx_end]
        _, loss_val = self.sess.run([self.train_step, self.cost_mean], feed_dict = {self.nn_in: batch_feats,
            self.nn_tgt: batch_tgts,self.learn_rate: learn_rate})
        tr_loss.append(loss_val)
    # cross-validation
    cv_loss = []
    for idx_begin in range(0,len(feats_cv), 512):
        idx_end = idx_begin + batch_size
        batch_feats, batch_tgts = feats[idx_begin:idx_end],tgts[idx_begin:idx_end]
        loss_all.append(self.sess.run(self.cost_mean,
            feed_dict = { self.nn_in: batch_feats,
                          self.nn_tgt: batch_tgts}))
    print( "Avg Loss for Training: "+str(np.mean(tr_loss)) + \
        " Avg Loss for Validation: "+str(np.mean(cv_loss)) )
    # save model per epoch if np.mean(cv_loss) less than previous
    if (epoch+1)%1==0:
        if loss_new < loss:
            loss = loss_new
            print( "Model accepted in epoch %d" %(epoch+1) )
            # save model to ckpt_dir with mdl_nam
            self.saver.save(self.sess, mdl_nam, global_step=epoch+1)
        else:
            print( "Model rejected in epoch %d" %(epoch+1) )
I also implemented a simple annealing learning-rate control: if the average cross-validation loss does not improve by a certain threshold, 'learn_rate' is halved, starting from an initial value of 0.008.
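The halving rule described above could be sketched as follows. This is only an illustration under assumptions: the function name anneal_learning_rate, the absolute threshold min_improvement=0.01 and the loss values are all made up, since the question does not state them.

```python
def anneal_learning_rate(learn_rate, prev_cv_loss, new_cv_loss, min_improvement=0.01):
    """Halve learn_rate when the CV loss improved by less than min_improvement.

    min_improvement is an assumed absolute threshold, not a value from the question.
    """
    improvement = prev_cv_loss - new_cv_loss
    if improvement < min_improvement:
        return learn_rate / 2.0
    return learn_rate


learn_rate = 0.008  # initial value mentioned in the question
cv_losses = [2.50, 2.20, 2.195, 2.10, 2.11]  # made-up per-epoch CV losses
history = []
for prev, new in zip(cv_losses, cv_losses[1:]):
    learn_rate = anneal_learning_rate(learn_rate, prev, new)
    history.append(learn_rate)

print(history)
```

The rate stays unchanged after a clear improvement and is halved after a stalled or worse epoch.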
I checked all the parameters against Kaldi nnet1, and the only remaining difference is the initialization of the weights and biases. I am not sure whether initialization matters that much. However, the convergence in terms of 'cv_loss' during training in TensorFlow (avg. CV loss 1.99) is not as good as in Kaldi nnet1 (avg. CV loss 0.95). Can someone point out where I did something wrong or what I missed?
Many thanks in advance !!!
At each epoch, you call self.load(ckpt.model_checkpoint_path) which seems to load previously saved weights.
Your model cannot learn if it is reset to the initial weights at each epoch.
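The effect can be reproduced on a toy problem. The sketch below uses plain Python and gradient descent on f(w) = w**2, and it assumes the simplest case, where the checkpoint always holds the starting weights and is never updated. In the question's code the checkpoint is rewritten whenever an epoch is accepted, so the real behaviour depends on how often that happens.

```python
def train_one_epoch(w, learn_rate=0.1, steps=5):
    """Gradient descent on the toy loss f(w) = w**2, whose gradient is 2*w."""
    for _ in range(steps):
        w = w - learn_rate * 2 * w
    return w


def run(epochs, reload_each_epoch):
    checkpoint = 10.0  # stand-in for weights stored on disk, never updated here
    w = checkpoint
    losses = []
    for _ in range(epochs):
        if reload_each_epoch:
            w = checkpoint  # discards the progress made in the previous epoch
        w = train_one_epoch(w)
        losses.append(w ** 2)
    return losses


with_reload = run(3, reload_each_epoch=True)
without_reload = run(3, reload_each_epoch=False)
print(with_reload)
print(without_reload)
```

With the reload, every epoch ends at the same loss; without it, the loss keeps decreasing from epoch to epoch.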