How to save two different checkpoints during training in Tensorflow - tensorflow

Assume we have test/val/train splits. During training, we want to save some model checkpoints [save_1] that can be used to restart the training later.
In addition, we want to save the model that shows the best performance on the validation set [save_2]. Once training is done, we use save_2 to report the performance on the test data.
My question is: how can we have two different tf.train.Saver objects during training in TensorFlow? All the examples I have seen only save [save_1].
Pointers to any code would be appreciated.
Thanks.

You can get quite close by using an Estimator to wrap your model. Specifically, see the RunConfig options for keeping multiple checkpoints (you don't have to throw any away). This could be combined with a ValidationMonitor to find the checkpoint with the lowest validation error.
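Independently of the Estimator route, the two-checkpoint idea from the question can be sketched in plain Python. This is only an illustration of the bookkeeping, not TensorFlow code: the JSON "checkpoint", the train function and fake_validation_loss are all hypothetical stand-ins. In TensorFlow 1.x the same pattern would presumably use two tf.train.Saver objects writing to different paths.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write a checkpoint (here just a JSON-serialisable dict) to disk."""
    with open(path, "w") as f:
        json.dump(state, f)


def train(num_steps, validate, ckpt_dir, save_every=2):
    """Toy loop that maintains two independent checkpoints.

    `validate` is a hypothetical callable mapping a step to a validation loss.
    """
    latest_path = os.path.join(ckpt_dir, "latest.json")  # save_1: for resuming
    best_path = os.path.join(ckpt_dir, "best.json")      # save_2: best on validation
    best_loss = float("inf")
    for step in range(1, num_steps + 1):
        state = {"step": step, "weights": [0.1 * step]}  # stand-in for model weights
        if step % save_every == 0:
            save_checkpoint(latest_path, state)
        val_loss = validate(step)
        if val_loss < best_loss:
            best_loss = val_loss
            save_checkpoint(best_path, dict(state, val_loss=val_loss))
    return latest_path, best_path


def fake_validation_loss(step):
    # Hypothetical curve: improves until step 3, then worsens (overfitting).
    return abs(step - 3) + 1.0
```

Calling train(6, fake_validation_loss, tempfile.mkdtemp()) leaves a "latest" file for restarting and a separate "best" file that is only overwritten when the validation loss improves, so the test set is evaluated with the latter.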

Related

Using dynamically generated data with keras

I'm training a neural network using Keras, but I'm not sure how to feed the training data into the model in the way that I want.
My training data set is effectively infinite: I have some code to generate training examples as needed, so I just want to pipe a continuous stream of novel data into the network. Keras seems to want me to specify my entire dataset in advance by creating a numpy array with everything in it, but this obviously won't work with my approach.
I've experimented with creating a generator class based on keras.utils.Sequence, which seems like a better fit, but it still requires me to specify a length via the __len__ method, which makes me think it will only create that many examples before recycling them. Can someone suggest a better approach?
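One possible shape for such a stream is an ordinary Python generator that builds a fresh batch on every iteration. The sketch below uses only the standard library; make_example and its toy target are hypothetical, and real Keras code would need to convert each batch to NumPy arrays. As far as I know, Keras can train from a generator when steps_per_epoch is given, in which case an "epoch" is just a fixed number of batches rather than a pass over a finite dataset.

```python
import itertools
import random


def make_example(rng):
    """Hypothetical example generator: features x and target y = sum(x)."""
    x = [rng.random() for _ in range(4)]
    return x, sum(x)


def batch_stream(batch_size, seed=0):
    """Yield (inputs, targets) batches forever, generating new examples each time."""
    rng = random.Random(seed)
    while True:
        batch = [make_example(rng) for _ in range(batch_size)]
        inputs = [x for x, _ in batch]
        targets = [y for _, y in batch]
        yield inputs, targets


# Pull the first three batches from the endless stream.
first_batches = list(itertools.islice(batch_stream(batch_size=2), 3))
```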

Return both instance id and prediction from predict() method of a Keras model

Suppose I have a Keras model which is already trained. When using the predict() method, I want to get the instance key value and the corresponding prediction at the same time (I can pass the key value as a feature/column in the input).
Is it realistic to do that?
I struggled with this for a while. I'm using the tf.data.Dataset infrastructure, so my first approach was to see whether I could make the order of the examples produced by the dataset deterministic. That wasn't optimal, because it gave up a lot of the parallel-processing performance benefits, and in any event the order still turned out not to be deterministic. I ended up generating predictions with model.predict_on_batch, feeding in batches that I iterated out of the dataset manually, instead of passing the entire dataset to model.predict. That way I was able to grab the ids from each batch and associate them with the returned predictions.
I was surprised there wasn't a more ready-made solution to a problem that must come up a lot. I haven't gotten up to speed on the Estimator interface or custom training/prediction loops yet, but hopefully this problem becomes trivial there.
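The manual batching described above can be sketched roughly as follows. FakeModel is a hypothetical stand-in that only mimics predict_on_batch, and the batches are plain lists rather than a real tf.data.Dataset, so treat this as an outline of the pairing logic only.

```python
class FakeModel:
    """Stand-in for a trained Keras model; only predict_on_batch is mimicked."""

    def predict_on_batch(self, features):
        # Hypothetical "prediction": the sum of each feature row.
        return [sum(row) for row in features]


def predict_with_ids(model, batches):
    """Pair each instance id with its prediction.

    `batches` yields (ids, features) tuples, as a dataset that carries the
    key alongside the features would.
    """
    results = {}
    for ids, features in batches:
        preds = model.predict_on_batch(features)
        for instance_id, pred in zip(ids, preds):
            results[instance_id] = pred
    return results


batches = [
    (["a", "b"], [[1, 2], [3, 4]]),
    (["c"], [[5, 6]]),
]
predictions = predict_with_ids(FakeModel(), batches)
```

Because ids and features travel together in each batch, the pairing does not depend on the order in which batches are produced.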

Updating Tensorflow Object detection model with new images

I have trained a Faster R-CNN model with a custom dataset using TensorFlow's Object Detection API. Over time I would like to continue to update the model with additional images (collected weekly). The goal is to optimize for accuracy and to weight newer images over time.
Here are a few alternatives:
1) Add images to previous dataset and train a completely new model
2) Add images to previous dataset and continue training previous model
3) New dataset with just new images and continue training previous model
Here are my thoughts:
Option 1: would be more time-consuming, but all images would be treated "equally".
Option 2: would likely take less additional training time, but one concern is that the algorithm might be weighting the earlier images more.
Option 3: This seems like the best option. Take the original model and simply focus on training on the new images.
Is one of these clearly better? What would be the pros/cons of each?
In addition, I'd like to know if it's better to keep one test set as a control for accuracy or to create a new one each time that includes newer images. Perhaps adding some portion of the new images to the training set and another to the test set, and then feeding older test set images back into the training set (or throwing them out)?
Consider the case where your dataset is nearly perfect. If you ran the model on new images (collected weekly), then the results (i.e. boxes with scores) would be exactly what you want from the model, and it would be pointless to add these to the dataset because the model would not be learning anything new.
For the imperfect dataset, results from new images will show (some) errors, and these images are appropriate for further training. But there may be "bad" images already in the dataset, and it is desirable to remove these. This indicates that Option 1 must occur, on some schedule, to entirely remove the effect of "bad" images.
On a shorter schedule, Option 3 is appropriate if the new images are reasonably balanced across the domain categories (in some sense a representative subset of the previous dataset).
Option 2 seems pretty safe and is easier to understand. When you say "the algorithm might be weighting the earlier images more", I don't see why this is a problem if the earlier images are "good". However, I can see that the domain may change over time (evolution) in which case you may well wish to counter-weight older images. I understand that you can modify the training data to do just that as discussed in this question:
Class weights for balancing data in TensorFlow Object Detection API
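To make the idea of counter-weighting older images concrete, here is a minimal sketch that assigns each image a weight based on its age. The exponential decay and the half_life_weeks parameter are my own assumptions for illustration; they are not part of the Object Detection API, and how the weights are fed into training (sampling, loss weighting) is left open.

```python
def recency_weights(ages_in_weeks, half_life_weeks=8.0):
    """Weight of 1.0 for a brand-new image, halving every `half_life_weeks`."""
    return [0.5 ** (age / half_life_weeks) for age in ages_in_weeks]


def normalise(weights):
    """Scale weights so they sum to 1, e.g. for use as sampling probabilities."""
    total = sum(weights)
    return [w / total for w in weights]


# Images collected this week, 8 weeks ago and 16 weeks ago.
weights = recency_weights([0, 8, 16])
probabilities = normalise(weights)
```

A longer half-life moves the scheme towards Option 2 (all images treated almost equally), while a very short one approaches Option 3 (only the newest images matter).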

Saving Estimator models with least validation error and restoring for training

I would like to be able to save the Estimator model during training whenever the error decreases on the validation set and keep a certain number of these best performing models, in case something goes wrong during training (like overfitting).
The available options seem to be implementing a Hook, an Exporter or using the EstimatorSpecs returned by Estimator.evaluate. However:
I cannot use a Hook, since I need the mean loss on the whole validation set and Hooks are called after each step.
With an Exporter, I can export the model to the SavedModel format, but I can't seem to find anything about restoring the model from this format and continuing training with an Estimator. It does seem to be possible with a session, though.
The EstimatorSpecs returned also can't be used since I can't save the model with tf.train.Saver without the estimator's session.
Also, I believe the best validation error found so far should somehow be serialized on disk. This way, whenever I restore the model and resume training, the component responsible for saving the best models knows the best validation error found so far, so that instead of starting with a flag value and saving the first model as being 'the best', it continues from where it left off.
Is there a way I can achieve this?
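The serialization part of this, at least, can be sketched without any Estimator machinery. The class below is hypothetical (BestCheckpointTracker is not a TensorFlow API) and only handles the bookkeeping: it remembers the best few validation losses in a JSON file, so that a restarted run continues from the recorded values rather than from a flag value. Copying or deleting the actual checkpoint files is left to the caller.

```python
import json
import os
import tempfile


class BestCheckpointTracker:
    """Keeps the `keep` best validation losses and persists them across restarts."""

    def __init__(self, state_path, keep=3):
        self.state_path = state_path
        self.keep = keep
        self.best = []  # list of [loss, checkpoint_name], sorted ascending
        if os.path.exists(state_path):
            with open(state_path) as f:
                self.best = json.load(f)

    def update(self, loss, checkpoint_name):
        """Record a new evaluation; return True if this checkpoint should be kept."""
        candidates = sorted(self.best + [[loss, checkpoint_name]])
        self.best = candidates[: self.keep]
        with open(self.state_path, "w") as f:
            json.dump(self.best, f)
        return [loss, checkpoint_name] in self.best


state_file = os.path.join(tempfile.mkdtemp(), "best_models.json")
tracker = BestCheckpointTracker(state_file, keep=2)
kept_first = tracker.update(0.9, "ckpt-1")
kept_second = tracker.update(0.7, "ckpt-2")
kept_third = tracker.update(0.95, "ckpt-3")  # worse than both stored losses

# Simulate a restart: a new tracker picks up the stored history.
resumed = BestCheckpointTracker(state_file, keep=2)
```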

Tensorflow Object Detection API - What's actually test.record being used for?

I have a few doubts about the TensorFlow Object Detection API. Hopefully someone can help me out... Before that, I need to mention that I am following what sentdex is doing, so the steps basically come from him.
First doubt: Why do we need test.record for training? What does it do during training?
Second doubt: sentdex is getting images from test.record to test the newly trained model. Doesn't the model already know those images, since they are from test.record?
Third doubt: In what situations do we need to activate drop_out (in the .config file)?
1) It does nothing during training and is not needed for the training itself. At some point, however, the model begins to overfit: the loss on the training images continues to go down, but the accuracy on the test images stops improving and begins to decline. That is the time to stop training, and to recognise this moment you need test.record.
2) The images were used only to evaluate the model during training, not to train the net.
3) You do not need to activate it, but with dropout you usually achieve higher accuracy on unseen images, because it helps prevent the net from overfitting.
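The stopping rule from point 1 can be written down as a small function. This is a generic early-stopping sketch, not something the Object Detection API provides in this form; the patience parameter and the accuracy values are made up for illustration.

```python
def early_stopping_step(eval_accuracies, patience=3):
    """Return (index, accuracy) of the best evaluation, stopping once accuracy
    has failed to improve for `patience` consecutive evaluations."""
    best_acc = float("-inf")
    best_idx = -1
    for idx, acc in enumerate(eval_accuracies):
        if acc > best_acc:
            best_acc, best_idx = acc, idx
        elif idx - best_idx >= patience:
            break  # held-out accuracy has stalled: stop training here
    return best_idx, best_acc


# Hypothetical accuracies on test.record after successive evaluation runs.
history = [0.60, 0.70, 0.75, 0.74, 0.73, 0.72, 0.90]
best_index, best_accuracy = early_stopping_step(history, patience=3)
```

With these numbers the loop stops after three evaluations without improvement and never reaches the last value, which is the trade-off of any patience-based rule.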