retrain a previously trained model in Tensorflow - tensorflow

I am using the TensorFlow Object Detection API. I trained a model for a couple of hours, and now I want to add more images to the dataset and start training again from epoch 1. What would be the best way to do this? I know that if I run python train.py now, it will continue from the last checkpoint, which I don't want, because I want to retrain from the beginning with more data. I am thinking of deleting the files in the training folder, which is where the checkpoints and other related files are stored. I'm just not sure which files to delete from that folder, or whether that is the proper way.
UPDATE: Indeed, I just needed to delete all the files (checkpoints) from the training directory and run python train.py again. That starts the training from step 1.
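The cleanup described in the update could be scripted roughly as follows. This is only a sketch: the helper name clear_training_dir and the file-name patterns are assumptions about what a typical training directory contains, so check them against your own folder (the dry run lists the matches without deleting anything).

```python
from pathlib import Path

# Assumed file-name patterns for checkpoint-related files; adjust as needed.
CHECKPOINT_PATTERNS = (
    "checkpoint",
    "model.ckpt-*",
    "ckpt-*",
    "events.out.tfevents.*",
    "graph.pbtxt",
)


def clear_training_dir(train_dir, patterns=CHECKPOINT_PATTERNS, dry_run=True):
    """Return the files in train_dir matching patterns; delete them unless dry_run."""
    root = Path(train_dir)
    matches = sorted(
        {p for pattern in patterns for p in root.glob(pattern) if p.is_file()}
    )
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Calling clear_training_dir("training") first and inspecting the returned list, then repeating the call with dry_run=False, keeps files such as the pipeline config untouched ("training" is a placeholder directory name).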

Related

Which checkpoint should I select to continue object detection training?

I trained until ckpt-7 and then stopped. Before starting again, I changed the fine-tune checkpoint in my model's pipeline config: I set it to the latest checkpoint and changed its directory. My loss was approximately 0.899 before I stopped the training.
When I continued training, it started at step 100 and my loss was 15.009.
How can I continue training the model from where it stopped? What should I do?
I am using a CenterNet model on Colab.
Please explain; I am new to this topic.
As I understand your question, you could not resume training from where it stopped.
With the updates in TF2, you do not need to change the fine_tune_checkpoint parameter in pipeline.config. Re-run the same training script, pointing it to the same model_dir where your checkpoints are stored.
TF2 will automatically resume from where the training stopped, using the checkpoints saved in model_dir.
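The resume behaviour relies on the checkpoint index file that TensorFlow writes into model_dir. As a minimal sketch (pure Python, and only an approximation of what tf.train.latest_checkpoint reports), the hypothetical helper below reads that file to show which checkpoint a re-run would most likely pick up.

```python
import re
from pathlib import Path


def latest_checkpoint_name(model_dir):
    """Return the model_checkpoint_path recorded in model_dir/checkpoint, or None."""
    index_file = Path(model_dir) / "checkpoint"
    if not index_file.is_file():
        return None  # no index file: training would start from scratch
    match = re.search(
        r'^model_checkpoint_path:\s*"([^"]+)"',
        index_file.read_text(),
        flags=re.MULTILINE,
    )
    return match.group(1) if match else None
```

If this returns None for the directory you pass as model_dir, the training script has nothing to resume from, which would be consistent with the step counter and loss starting over.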

Can you export earlier checkpoints from the tensorflow object detection api v2?

I am training an object detection model up to 10 checkpoints using the tensorflow object detection api version 2. Exporting the final checkpoint using the exporter_main_v2.py works with no problems, however I would also like to export eg checkpoints 3, 6 and 8 to compare how they do in the actual setup. Is this possible?
I've tried deleting the later checkpoints and then running exporter_main_v2.py, but this results in an error stating that there are later events in the events.out.tfevents file than the one I'm trying to export, so it can't continue.
In the folder where your trained checkpoints are stored, there is an index file named checkpoint. Open it and change the checkpoint number in "model_checkpoint_path" on the first line to the one you want to export. By default it points to the last checkpoint saved.
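The manual edit suggested in this answer could be scripted along these lines. It is a sketch under assumptions: the helper name point_checkpoint_index_at is made up, the checkpoint file is assumed to use the usual model_checkpoint_path: "..." text format, and a backup copy is written first so the original can be restored.

```python
import re
from pathlib import Path


def point_checkpoint_index_at(model_dir, checkpoint_name):
    """Rewrite model_checkpoint_path in model_dir/checkpoint; return the new text."""
    index_file = Path(model_dir) / "checkpoint"
    text = index_file.read_text()
    new_text, count = re.subn(
        r'^model_checkpoint_path:\s*"[^"]*"',
        # A function replacement avoids backslash escapes being interpreted.
        lambda _match: f'model_checkpoint_path: "{checkpoint_name}"',
        text,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError(f"no model_checkpoint_path entry in {index_file}")
    # Keep a backup so the original index can be restored after exporting.
    index_file.with_name("checkpoint.bak").write_text(text)
    index_file.write_text(new_text)
    return new_text
```

For example, point_checkpoint_index_at("trained_model", "ckpt-3") followed by a run of exporter_main_v2.py would be the scripted version of the steps above ("trained_model" is a placeholder path). The data and index files of ckpt-3 still need to be present in that folder.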

How to continue training an object detection model using Tensorflow Object Detection API?

I'm using Tensorflow Object Detection API to train an object detection model using transfer learning. Specifically, I'm using ssd_mobilenet_v1_fpn_coco from the model zoo, and using the sample pipeline provided, having of course replaced the placeholders with actual links to my training and eval tfrecords and labels.
I was able to successfully train a model on my ~5000 images (and corresponding bounding boxes) using the above pipeline (I'm mainly using Google's ML Engine on TPU, if relevant).
Now I have prepared an additional ~2000 images and would like to continue training my model with those new images, without restarting from scratch (training the initial model took ~6h of TPU time). How can I do that?
You have two options. In both, you need to change the input_path of the train_input_reader to point to your new dataset:
1.) When specifying a checkpoint to fine-tune from in the training configuration, specify the checkpoint of your trained model:
train_config {
  fine_tune_checkpoint: <path_to_your_checkpoint>
  fine_tune_checkpoint_type: "detection"
  load_all_detection_checkpoint_vars: true
}
2.) Simply keep using the same configuration (except for the train_input_reader) with the same model_dir as your previous model. The API will then build the graph and check whether a checkpoint that fits the graph already exists in model_dir. If so, it will restore it and continue training from it.
Edit: fine_tune_checkpoint_type was previously set as true by mistake, while it should be "detection" or "classification" in general, and "detection" in this specific case. Thanks Krish for noticing.
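For the fine_tune_checkpoint option, the config edit can also be applied to the pipeline config text programmatically. The sketch below is a plain string substitution, not the Object Detection API's own config utilities; the helper name set_fine_tune_checkpoint and the example paths are assumptions, and it expects the existing value to be a quoted string.

```python
import re


def set_fine_tune_checkpoint(config_text, checkpoint_path):
    """Return config_text with the quoted fine_tune_checkpoint value replaced."""
    # The colon directly after the field name keeps fine_tune_checkpoint_type untouched.
    pattern = r'(fine_tune_checkpoint:\s*)"[^"]*"'
    new_text, count = re.subn(
        pattern,
        lambda match: f'{match.group(1)}"{checkpoint_path}"',
        config_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no quoted fine_tune_checkpoint field found")
    return new_text
```

A typical use would be to read the pipeline config file, pass its contents through set_fine_tune_checkpoint with the path of your trained checkpoint, and write the result to a new config file used for the second training run.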
I haven't retrained the object detection model on a new dataset, but it looks like increasing the number of training steps (train_config.num_steps) in the config file and adding the new images to the tfrecord files should be enough.

How can I use a Torch model?

I have a Torch model that was trained on a large-scale dataset (the Places dataset), and its authors uploaded it to GitHub. I am working on a similar project and want to reuse its trained weights instead of training on the large dataset myself, to save time and effort. Is this possible? How can I get only the trained filter weights? I don't want to copy the code; I only want to make use of the weights.
NOTE: I use TensorFlow in my implementation.

Tensorflow Retrain the retrained model

I am very new to neural networks and TensorFlow, and am just starting on the image retraining tutorial. I have successfully completed the flower_photos training and I have two questions.
1.) Is it a good or bad idea to keep building upon a retrained model over and over? Or would it be much better to train a fresh model every time? That leads to my second question.
2.) If it is OK to retrain a model over and over, then for the retraining tutorial in TensorFlow (Image_retraining), would I simply replace classify_image_graph_def.pb and imagenet_synset_to_human_label_map.txt in retrain.py with the ones output by my retraining? I also see that there is an imagenet_2012_challenge_label_map_proto.pbtxt; would I have to replace that one with something else?
Thanks for your time.