I'm trying to train my own custom object detector. I have the TFRecords for both the train and test sets, as well as label_map.pbtxt, but I'm running into issues at each step.
How can I understand which layers are frozen when fine-tuning a detection model from the TensorFlow 2 Model Zoo?
I have already successfully set the path for fine_tune_checkpoint and fine_tune_checkpoint_type: detection, and in the proto file I read that "detection" means:
// 2. "detection": Restores the entire feature extractor.
The only parts of the full detection model that are not restored are the box and class prediction heads.
This option is typically used when you want to use a pre-trained detection model
and train on a new dataset or task which requires different box and class prediction heads.
I don't really understand what that means. Does "restored" mean "frozen" in this context?
As I understand it, the TensorFlow 2 Object Detection API currently does not freeze any layers when training from a fine-tune checkpoint. There is an issue reported here to support specifying which layers to freeze in the pipeline config. If you look at the training step function, you can see that all trainable variables are used when applying gradients during training.
Restored here means that the model weights are copied from the checkpoint to be used as a starting point for training. Frozen would mean that the weights are not changed (i.e. no gradient is applied) during training.
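To make the distinction concrete, here is a minimal tf.keras sketch; the model choice, checkpoint path, and the number of frozen layers are all illustrative, not the Object Detection API's actual code:

import tensorflow as tf

# Any tf.keras model behaves the same way; MobileNetV2 is just an example.
model = tf.keras.applications.MobileNetV2(weights=None)

# "Restored": weights are copied in as a starting point,
# but every layer can still be updated by gradients.
model.load_weights("/path/to/checkpoint")  # illustrative path

# "Frozen": non-trainable layers keep their weights fixed during training,
# because gradients are only applied to trainable variables.
for layer in model.layers[:-5]:  # e.g. freeze everything except the head
    layer.trainable = False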
I am new to deep learning. I have a YOLOv3 model that I have been training on my custom data. Every time I train, training seems to start from scratch. How do I make the model continue training from where it stopped last time?
The setup I have is the same as this repo.
You can call model.load_weights(path_to_checkpoint) just after the model is defined at line 41 in train.py and continue training where you left off.
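Roughly, the change looks like this (a sketch: the model-building call and the checkpoint path are placeholders for whatever the repo's train.py actually uses):

# In train.py, right after the model is defined (line 41 in the repo):
model = build_yolov3_model()  # placeholder for the repo's model definition

# Point this at the weights saved by your previous run (illustrative path):
model.load_weights("checkpoints/yolov3_train_last.tf")

# The training loop below then starts from the restored weights
# instead of from a fresh initialization.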
I am importing a pretrained MobileNet model, mobilenet_v1_0.25_128_frozen.pb, into my TensorFlow environment. Once imported, I want to save a snapshot of the model architecture as a .png. I know there is a way to do this in Keras with tf.keras.utils.plot_model(model, to_file="model.png"). Is there a way to do this in a TensorFlow session without using TensorBoard? If you do recommend TensorBoard, note that I don't want to run it separately; I want to save the model architecture from inside the TensorFlow session without starting TensorBoard.
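For reference, the Keras route mentioned above looks like the sketch below. Note that it needs an actual tf.keras model object (plus pydot and graphviz installed), not a raw frozen GraphDef, so it illustrates the goal rather than solving the frozen-graph case:

import tensorflow as tf

# Keras counterpart of mobilenet_v1_0.25_128 (alpha=0.25, 128x128 input).
model = tf.keras.applications.MobileNet(alpha=0.25, input_shape=(128, 128, 3))

# Writes the architecture diagram to model.png without starting TensorBoard.
tf.keras.utils.plot_model(model, to_file="model.png", show_shapes=True)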
I usually just use train.py to train with the TensorFlow Object Detection API. However, I read at https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/discussion/68581 that you can also use model_main.py to train your model and see real-time plots and images on TensorBoard.
How exactly do you use model_main.py with TensorBoard?
What is the difference between train.py and model_main.py?
On TensorBoard, model_main.py outputs graphs similar to those of train.py, but model_main.py also measures the model's performance on the evaluation dataset.
model_main.py is the newer entry point in the TensorFlow Object Detection API. It handles both training and evaluation. With train.py we have to run a separate program for evaluation (eval.py), whereas model_main.py does both: training runs for a certain interval (for example 5 minutes, or every 2000 steps), then pauses while evaluation runs; once evaluation has finished, training continues, and the cycle repeats.
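For example, a typical invocation looks something like this (paths and the step count are illustrative):

python model_main.py \
    --pipeline_config_path=training/pipeline.config \
    --model_dir=training/ \
    --num_train_steps=50000 \
    --alsologtostderr

tensorboard --logdir=training/

TensorBoard then picks up the event files that model_main.py writes to model_dir, including the evaluation metrics.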
The newer versions of the TensorFlow Object Detection API offer model_main.py, which trains as well as evaluates the model using the various pre-conditions and preprocessing, whereas the older versions use train.py for training and eval.py for evaluation.
Reference: https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10
I'm using Tensorflow Object Detection API to train an object detection model using transfer learning. Specifically, I'm using ssd_mobilenet_v1_fpn_coco from the model zoo, and using the sample pipeline provided, having of course replaced the placeholders with actual links to my training and eval tfrecords and labels.
I was able to successfully train a model on my ~5000 images (and corresponding bounding boxes) using the above pipeline (I'm mainly using Google's ML Engine on TPU, if relevant).
Now I have prepared an additional ~2000 images and would like to continue training my model with those new images without restarting from scratch (training the initial model took ~6h of TPU time). How can I do that?
You have two options; in both you need to change the input_path of the train_input_reader to point to your new dataset (a sketch of that change follows the two options):
1. Specify the checkpoint of your already-trained model as the fine-tune checkpoint in the training configuration:
train_config{
fine_tune_checkpoint: <path_to_your_checkpoint>
fine_tune_checkpoint_type: "detection"
load_all_detection_checkpoint_vars: true
}
2. Simply keep using the same configuration (except for the train_input_reader) with the same model_dir as your previous run. That way, the API will create the graph, check whether a checkpoint already exists in model_dir and fits the graph, and, if so, restore it and continue training from it.
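In both options, the input reader change looks something like this (paths are illustrative):

train_input_reader {
  label_map_path: "data/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "data/train_new.record"  # TFRecord containing the additional images
  }
}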
Edit: fine_tune_checkpoint_type was previously set to true by mistake; it should be "detection" or "classification" in general, and "detection" in this specific case. Thanks Krish for noticing.
I haven't retrained an object detection model on a new dataset, but it looks like increasing the number of training steps (train_config.num_steps in the config file) and adding the new images to the TFRecord files should be enough for that.
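For example, in the pipeline config (the step count is illustrative; it must exceed the number of steps already completed, or training will stop immediately):

train_config {
  ...
  num_steps: 200000
}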