Training Estimators for less than one epoch using the Dataset API? - tensorflow

I am trying to train a model on a large dataset. I would like to run the evaluation step multiple times before one epoch of training has completed. Looking at the implementation of the Dataset API with Estimators, it appears that every time I restart training after an evaluation step, the Estimator creates a fresh dataset from scratch, so training never covers the full data.
I have written an input function very similar to the one provided on the TensorFlow website.
def train_input_fn(features, labels, batch_size):
    """An input function for training."""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Repeat for one epoch and batch the examples (note: no shuffle here).
    dataset = dataset.repeat(1).batch(batch_size)
    # Return the read end of the pipeline.
    return dataset
I then call this input function from tf.estimator.Estimator.train as follows.
classifier.train(input_fn=lambda: train_input_fn(features, labels, batch_size),
                 steps=n_steps)
where n_steps is a number smaller than the total number of steps needed to complete one epoch.
I then call an evaluation function like this.
classifier.evaluate(input_fn=lambda: eval_input_fn())
I want to run both of these steps in a loop.
However, every time the loop reaches the training step, it re-initializes the dataset in train_input_fn, so training is only ever applied to the first n_steps of the training data.

If you want to evaluate multiple times during training, you can look at InMemoryEvaluatorHook.
You can also refer to this discussion about train_and_evaluate and InMemoryEvaluatorHook for more details on how best to use them.
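For illustration, here is a minimal sketch of how the hook can be wired up, assuming TF 1.x and reusing the input functions from the question (in older releases the hook lives under tf.contrib.estimator rather than tf.estimator.experimental):
import tensorflow as tf

evaluator = tf.estimator.experimental.InMemoryEvaluatorHook(
    estimator=classifier,
    input_fn=lambda: eval_input_fn(),
    every_n_iter=1000)  # run an evaluation every 1000 training steps

# A single train() call now evaluates periodically, so the training
# input pipeline is never restarted between evaluations.
classifier.train(
    input_fn=lambda: train_input_fn(features, labels, batch_size),
    hooks=[evaluator])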

Related

Using tf.data.Dataset in keras

I have a model written in Keras. Because I'm dealing with large files, I'm using the tf.data.Dataset API to load data and feed it into the Keras fit function. Before I call model.fit(), I reinitialize the dataset using it = ds.make_initializable_iterator() and then pass the X and y tensors that I get from it.get_next() to model.fit(). The problem is that when model.fit() reaches the end of the dataset, it does not continue training; in other words, I can only train for ONE epoch, no matter what I pass as the epochs argument to the fit function.
How can I tell Keras to reinitialize the iterator when it reaches the end of the dataset?
Use dataset.repeat(n_epochs) to repeat your dataset for the number of epochs.
The epochs argument of model.fit defines how many times to iterate over the dataset. If you do not repeat the dataset, however, you will run out of samples after the first epoch. You can use dataset.repeat(n_epochs) to repeat for n_epochs, or dataset.repeat() to repeat indefinitely.
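As a minimal sketch, assuming a tf.keras version that accepts a Dataset directly in fit() (n_epochs, batch_size, and steps_per_epoch are illustrative names):
dataset = dataset.repeat()           # repeat indefinitely
dataset = dataset.batch(batch_size)

model.fit(dataset,
          epochs=n_epochs,
          steps_per_epoch=steps_per_epoch)  # how many batches make one epoch
With an infinitely repeated dataset you must pass steps_per_epoch, since Keras otherwise has no way to know where one epoch ends.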

Implementing stochastic forward passes in part of a neural network in Keras?

My problem is the following:
I am working on an object detection problem and would like to use dropout during test time to obtain a distribution of outputs. The object detection network consists of a training model and a prediction model, which wraps around the training model. I would like to perform several stochastic forward passes using the training model and combine these, e.g. by averaging the predictions, in the prediction wrapper. Is there a way of doing this in a Keras model instead of requiring an intermediate processing step using numpy?
Note that this question is not about how to enable dropout during test time.
def prediction_wrapper(model):
    # Example code.
    # Arguments:
    #     model: the training model
    regression = model.outputs[0]
    classification = model.outputs[1]
    predictions = ...      # TODO: perform several stochastic forward passes (dropout during train and test time) here
    avg_predictions = ...  # TODO: combine the predictions here, e.g. by computing the mean
    outputs = ...          # TODO: do some processing on avg_predictions
    return keras.models.Model(inputs=model.inputs, outputs=outputs, name=name)
I use Keras with a TensorFlow backend.
I appreciate any help!
The way I understand it, you're trying to average the weight updates for a single sample while dropout is enabled. Since dropout is random, you would get different weight updates for the same sample.
If this understanding is correct, you could create a batch by duplicating the same sample; here I am assuming that the dropout mask is different for each sample in a batch. Since backpropagation averages the weight updates over the batch anyway, you would get your desired behavior (see the sketch below).
If that does not work, you could write a custom loss function and train with a batch size of one. You could update a global counter inside the custom loss function and return a non-zero loss only once you have averaged the updates the way you want. I don't know if this would work; it's just an idea.
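For the prediction-averaging side that the question actually asks about, the same duplicated-batch trick can be sketched as follows, assuming a tf.keras model whose Dropout layers are active when called with training=True (all names here are illustrative):
import numpy as np

n_passes = 10                                # number of stochastic passes
sample = x_train[0:1]                        # one sample, batch dim kept
batch = np.repeat(sample, n_passes, axis=0)  # duplicate it n_passes times

# Each row gets its own dropout mask, so averaging over the batch
# dimension approximates averaging over stochastic forward passes.
preds = model(batch, training=True).numpy()
avg_pred = preds.mean(axis=0)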

`estimator.train` with num_steps in Tensorflow

I have made a custom Estimator in TensorFlow 1.4. In the estimator.train function, I see a steps parameter, which I am using as a way to stop training and then evaluate on my validation dataset.
while True:
    model.train(input_fn=lambda: train_input_fn(train_data), steps=FLAGS.num_steps)
    model.evaluate(input_fn=lambda: train_input_fn(test_data))
After every num_steps, I run an evaluation on the validation dataset.
What I am observing is that after num_steps, once the evaluation is done, there is a jerk in the plots of the AUC/loss (in general, all metrics).
Plot attached:
I am unable to understand why this is happening.
Is this not the right way to evaluate metrics on a validation dataset at regular intervals?
Link to code
The issue
The issue comes from the fact that what you plot in TensorBoard is the accuracy or AUC computed since the beginning of estimator.train.
Here is what happens in detail:
you create a summary based on the second output of tf.metrics.accuracy
accuracy = tf.metrics.accuracy(labels, predictions)
tf.summary.scalar('accuracy', accuracy[1])
when you call estimator.train(), a new Session is created and all the local variables are initialized again. This includes the local variables of accuracy (total and count)
during this Session, the op from tf.summary.merge_all() is evaluated at regular intervals. What happens is that your summary reflects the accuracy of all the batches processed since you last called estimator.train(), so at the beginning of each training phase the output is pretty noisy and it gets more stable as training progresses (see the small illustration below).
Whenever you evaluate and call estimator.train() again, the local variables are re-initialized and you go through a short "noisy" phase, which results in the bumps on the training curve.
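A tiny TF 1.x illustration of that running behavior, with dummy placeholder data just to show the accumulation:
import tensorflow as tf

labels = tf.placeholder(tf.int64, [None])
predictions = tf.placeholder(tf.int64, [None])
acc, acc_update = tf.metrics.accuracy(labels, predictions)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())  # total/count reset to zero
    # Each run of acc_update folds one more batch into the running totals.
    print(sess.run(acc_update, {labels: [1, 1], predictions: [1, 0]}))  # 0.5
    print(sess.run(acc_update, {labels: [1, 1], predictions: [1, 1]}))  # 0.75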
A solution
If you want a scalar summary that gives you the actual accuracy for each batch, it seems like you need to implement it without using tf.metrics. For instance, if you want the accuracy you will need to do:
accuracy = tf.reduce_mean(tf.cast(tf.equal(labels, predictions), tf.float32))
tf.summary.scalar('accuracy', accuracy)
It is easy to implement this for accuracy, and I know it might be painful to do for AUC, but I don't see a better solution for now.
Maybe having these bumps is not so bad. For instance, if you train for one epoch, you will get the overall training accuracy over that epoch at the end.

Tensorflow: How to get the accuracy/prediction for the whole test dataset, not for each batch?

I am trying to use TensorBoard to visualize my testing procedure. My purpose is: when every epoch completes, I would like to test the network's accuracy using the whole test dataset, and store this accuracy result in a summary file, so that I can visualize it in TensorBoard.
TensorFlow has summary_op to do it; however, it seems to work only for one batch when running the code sess.run(summary_op) (in all the existing examples). I need to calculate the accuracy for the whole test dataset. How can I do that?
Is there any example of how to do it? Any help would be appreciated.
You could calculate it by:
Batching your test dataset in case it is too large, e.g. into n_test_batches, and starting with a buffer like buffer_accuracies = 0.0
Adding each batch's accuracy into the buffer variable buffer_accuracies
Finally, when you have processed the whole test dataset, dividing buffer_accuracies by the total number of test batches
Now you would have test_accuracy = buffer_accuracies / n_test_batches as a regular Python variable (see the sketch below).
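A minimal sketch of that accumulation, assuming TF 1.x session-style code and an accuracy tensor that yields the per-batch accuracy (this average is exact only if all test batches have the same size):
buffer_accuracies = 0.0
for _ in range(n_test_batches):
    buffer_accuracies += sess.run(accuracy)  # add this batch's accuracy
test_accuracy = buffer_accuracies / n_test_batches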
Now we can create a summary for that Python variable as follows:
test_accuracy_summary = tf.Summary()
test_accuracy_summary.value.add(tag="test_accuracy", simple_value=test_accuracy)
Finally, write that into your TensorFlow FileWriter, e.g.:
test_writer.add_summary(test_accuracy_summary, iteration_step)

What is the difference between model.fit() and model.evaluate() in Keras?

I am using Keras with TensorFlow backend to train CNN models.
What is the difference between model.fit() and model.evaluate()? Which one should I ideally use? (I am using model.fit() as of now.)
I know the utility of model.fit() and model.predict(), but I am unable to understand the utility of model.evaluate(). The Keras documentation just says:
It is used to evaluate the model.
I feel this is a very vague definition.
fit() is for training the model with the given inputs (and corresponding training labels).
evaluate() is for evaluating the already trained model using the validation (or test) data and the corresponding labels. Returns the loss value and metrics values for the model.
predict() is for the actual prediction. It generates output predictions for the input samples.
Let us consider a simple regression example:
import numpy as np

# input and output
x = np.random.uniform(0.0, 1.0, 200)
y = 0.3 + 0.6 * x + np.random.normal(0.0, 0.05, len(x))
Now let's apply a regression model in Keras:
# A simple regression model
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(1, input_shape=(1,)))
model.compile(loss='mse', optimizer='rmsprop')

# The fit() method - trains the model
model.fit(x, y, epochs=1000, batch_size=100)
Epoch 1000/1000
200/200 [==============================] - 0s - loss: 0.0023
# The evaluate() method - gets the loss statistics
model.evaluate(x, y, batch_size=200)
# returns: loss: 0.0022612824104726315
# The predict() method - predict the outputs for the given inputs
model.predict(np.expand_dims(x[:3],1))
# returns: [ 0.65680361],[ 0.70067143],[ 0.70482892]
In deep learning you first want to train your model. You take your data and split it into two sets: the training set and the test set. It is pretty common for 80% of your data to go into the training set and 20% into the test set.
Your training set gets passed into your call to fit(), and your test set gets passed into your call to evaluate(). During the fit operation a number of rows of your training data are fed into your neural net at a time (based on your batch size). After every batch is sent, the fit algorithm does backpropagation to adjust the weights in your neural net.
After this is done, your neural net is trained. The problem is that sometimes your neural net becomes overfit, a condition where it performs well on the training set but poorly on other data. To guard against this situation, you run the evaluate() function to send new data (your test set) through your neural net to see how it performs on data it has never seen. No training occurs; this is purely a test. If all goes well, the score from training is similar to the score from testing (a minimal sketch follows below).
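A minimal sketch of that workflow (the split function, the 80/20 ratio, and all variable names are illustrative, not from the answer):
from sklearn.model_selection import train_test_split

# Hold out 20% of the data as the unseen test set.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

model.fit(x_train, y_train, epochs=10, batch_size=32)  # training only
test_loss = model.evaluate(x_test, y_test)             # pure test, no training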
fit(): Trains the model for a given number of epochs (this is for training time, with the training dataset).
predict(): Generates output predictions for the input samples (this is for somewhere between training and testing time).
evaluate(): Returns the loss value & metrics values for the model in test mode (this is for testing time, with the testing dataset).
While all the above answers explain what fit(), evaluate(), and predict() do, a more important point to keep in mind, in my opinion, is what data you should use for fit() and evaluate().
The clearest guideline I came across is in Machine Learning Mastery, and in particular this quote:
Training set: A set of examples used for learning, that is to fit the parameters of the classifier.
Validation set: A set of examples used to tune the parameters of a classifier, for example to choose the number of hidden units in a neural network.
Test set: A set of examples used only to assess the performance of a fully-specified classifier.
By Brian Ripley, Pattern Recognition and Neural Networks, 1996, page 354
You should not use the same data that you used to train (tune) the model, i.e. the validation data, for evaluating the performance (generalization) of your fully trained model with evaluate().
The test data used for evaluate() should be unseen, i.e. not used for training with fit(), in order to be a reliable indicator of model evaluation (generalization).
For predict() you can use just one or a few examples that you choose (from anywhere) to get a quick check or answer from your model. I don't believe it can be used as the sole indicator of generalization.
One thing which was not mentioned here, I believe, needs to be specified: model.evaluate() returns a list which contains a loss figure and an accuracy figure. What has not been said in the answers above is that the "loss" figure is the sum of ALL the losses calculated for each item in the x_test array (x_test would contain your test data and y_test would contain your labels). It should be clear that the loss figure is aggregated over ALL the items, not just one loss from one item in the x_test array.
I would say it is the mean of the losses incurred over all iterations, not the sum. But sure, that is the most important information here; otherwise the modeler would be slightly confused.
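A quick way to check what evaluate() actually returns, assuming a model compiled with an accuracy metric (names are illustrative):
results = model.evaluate(x_test, y_test, batch_size=32)
print(model.metrics_names)  # e.g. ['loss', 'acc']
print(results)              # matching values; the loss is an average, not a sum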