Training trained seq2seq model on additional training data - tensorflow

I have trained a seq2seq model with 1M samples and saved the latest checkpoint. Now, I have some additional training data of 50K sentence pairs which has not been seen in previous training data. How can I adapt the current model to this new data without starting the training from scratch?

You do not have to re-run the whole network initialization. You may run an incremental training.
Training from pre-trained parameters
Another use case it to use a base model and train it further with new training options (in particular the optimization method and the learning rate). Using -train_from without -continue will start a new training with parameters initialized from a pre-trained model.
Remember to tokenize your 50K corpus the same way you tokenized the previous one.
Also, you do not have to use the same vocabulary beginning with OpenNMT 0.9. See the Updating the vocabularies section and use the appropriate value with -update_vocab option.

Related

Training with spacy on full dataset

When I train my spacy model as follows
spacy train config.cfg --paths.train ./train.spacy --paths.dev ./dev.spacy
the model gets trained on train.spacy data file, and scored on dev.spacy. Then output_updated/model-best is the model with the highest score.
Is this best model finally trained on a combination of both train and dev data? I understand, it makes sense to split those datasets to avoid overfitting, but given little training data, I would like the final model to be trained on all data I have at hand.
No, spaCy does not automatically merge your datasets before training model-best. If you want to do that you would need to manually create a new training data set.
If you have so little data that seems like a good idea, you should probably prioritize getting more data.

Image classification model re-calibration

I built an image classification model (CNN) using tensorflow-keras. I have some new images which I need to feed into the same model in order to increase the accuracy of the existing model.
I tried using the following code. But it decreases the accuracy.
re_calibrated_model = loaded_model.fit_generator(new_training_set,
steps_per_epoch=int(stp),
epochs=int(epc),
validation_data=new_test_set,
verbose=1,
validation_steps = 50)
Is there any method that I can use to re-calibrate my CNN model?
Your new training session does not start from previous training accuracy if you use completely different dataset to do second training.
You need to feed (old_images+new_images) for your intention.
What I normally do is to train the CNN model on the first batch of images and save that model. If I need to "retrain" the model with additional images, I load the previous saved model from disk and apply the inputs (test and train) and call the fit method. As mentioned before this only works if your outputs are exactly the same i.e. if you are using the same input and output classes.
In my experience training models using different image batches does not necessarily make your model more or less accurate, but rather increase the training time with each batch. Since I am using a CPU to train, my training time is about 30% faster if I train two batches of 1000 images each as oppose to training one batch of 2000 images for example.

Tensorflow : Is it possible to identify the data is used for training?

I have created text classification model(.pb) using tensorflow. Prediction is good.
Is it possible to check the sentence using for prediction is already used to train the model or not. I need to retrain the model when new sentence is given to model to predict.
I did some research and couldn't find a way to get the train data only with the pb file because that file only stores the features and not the actual train data(obviously),but if you have the dataset,then you can easily verify duh....
I don't think you can ever find the exact train data with only the trained model,cause the model only contains the features and not the actual train data

when to stop training object detection tensorflow

I am training faster rcnn model on fruit dataset using a pretrained model provided in google api(faster_rcnn_inception_resnet_v2_atrous_coco).
I made few changes to the default configuration. (number of classes : 12 fine_tune_checkpoint: path to the pretrained checkpoint model and from_detection_checkpoint: true). Total number of annotated images I have is around 12000.
After training for 9000 steps, the results I got have an accuracy percent below 1, though I was expecting it to be atleast 50% (In evaluation nothing is getting detected as accuracy is almost 0). The loss fluctuates in between 0 and 4.
What should be the number of steps I should train it for. I read an article which says to run around 800k steps but its the number of step when you train from scratch?
FC layers of the model are changed because of the different number of the classes but it should not effect those classes which are already present in the pre-trained model like 'apple'?
Any help would be much appreciated!
You shouldn't look at your training loss to determine when to stop. Instead, you should run your model through the evaluator periodically, and stop training when the evaluation mAP stops improving.

Using tensorflow input pipline with GAN

I want to build a conditional GAN with tensorflow and use input pipline for loading my dataset. The problem is that in each iteration I want to the use same data batch for training both generative and discriminative models, but because their training operators are fetched in different runs they will receive different batches of data. Is there any solution for that or should I use a feed_dict?
One way to use the same data is to use a tf.group on the generator and discriminator train ops so they are trained jointly, and set use_locking=True on your optimizers to prevent pathological race conditions. Note that there still will be some stochasticity due to the fact that TensorFlow runtime won't guarantee that either the generator or the discriminator will consistently be trained first.
This idea is already implemented in TensorFlow's TFGAN library in get_joint_train_hooks, although it uses hooks instead of grouping the training ops (the "joint" refers to the fact that the discriminator and generator are trained jointly, rather than sequentially).