How to Re-train custom yolo weights?


I trained a custom YOLOv3 detector on 3 classes, but the detections were not accurate, so I want to retrain my custom YOLO weights with more images. However,
when I run training with the new images it finishes immediately. What have I done wrong?
Here is how I train it:
!./darknet detector train data/obj.data cfg/yolov3_custom.cfg yolov3_custom_last.weights
The content of obj.data:
classes = 3
train = data/train.txt
valid = data/test.txt
names = data/obj.names
backup = /mydrive/yolov3/backup/
The content of yolov3_custom.cfg:
# Training
batch=64
subdivisions=16
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
burn_in=1000
max_batches = 6000
policy=steps
steps=4800,5400
scales=.1,.1

Simply change the path to the weights file in the training command and run it again. The framework stores how many iterations a given set of weights has been trained for, so retraining continues from that count. For instance, if you stopped after 2000 iterations in the initial training run, retraining from the resulting weights begins at iteration 2001. As long as that saved count is still below max_batches, there is no need to update max_batches or change the configuration file in any way. If the weights have already reached max_batches, there is nothing left to run and training ends immediately.

@Hamid Shatu, you are correct. You need to update max_batches and also check your weights file name.
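Raising max_batches can be scripted. The following is a minimal sketch, not part of darknet: the helper name extend_schedule is hypothetical, and it assumes the common convention (already visible in the cfg above, where steps=4800,5400 for max_batches = 6000) that the two steps sit at 80% and 90% of max_batches.

```python
import re

def extend_schedule(cfg_text, new_max_batches):
    """Return cfg_text with max_batches raised and steps moved to 80%/90% of it.

    Assumption: the cfg uses 'max_batches' and 'steps' lines as in the question.
    """
    # Integer arithmetic avoids floating-point rounding surprises.
    steps = "{},{}".format(new_max_batches * 8 // 10, new_max_batches * 9 // 10)
    cfg_text = re.sub(r"(?m)^max_batches\s*=.*$",
                      "max_batches = {}".format(new_max_batches), cfg_text)
    cfg_text = re.sub(r"(?m)^steps\s*=.*$", "steps={}".format(steps), cfg_text)
    return cfg_text

# Excerpt of the cfg from the question.
cfg = ("learning_rate=0.001\nburn_in=1000\nmax_batches = 6000\n"
       "policy=steps\nsteps=4800,5400\nscales=.1,.1\n")
new_cfg = extend_schedule(cfg, 10000)
print(new_cfg)
```

With the iteration budget extended past the count stored in yolov3_custom_last.weights, the same training command has iterations left to run.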

Related

Mask RCNN with custom classes

After dealing with the compatibility issues with Mask RCNN, I am now working on training the model on my custom classes, with the annotations JSON in COCO style.
class modelConfig(Config):
    NAME = "my_coco"
    NUM_CLASSES = 1 + 6
    STEPS_PER_EPOCH = 100
config = modelConfig()
config.display()
model = MaskRCNN(mode='training', model_dir=DEFAULT_LOGS_DIR, config=config)
# load weights (mscoco) and exclude the output layers
model.load_weights(COCO_WEIGHTS_PATH, by_name=True, exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])
# train weights (output layers or 'heads')
model.train(dataset_train, dataset_train, learning_rate=config.LEARNING_RATE, epochs=25, layers='heads')
The problem is that execution is stuck on epoch 1/N without really starting (even when I reduce the training dataset to 1 image).
This thread suggests that the problem can be resolved with a custom Layer, but so far I have not managed to reproduce that fix.
My question is: do I really need the COCO weights if I have custom classes? Should I initialize weights for these classes, and if so, how? (Of course, a viable solution to the problem would also be appreciated.)

Image classification model re-calibration

I built an image classification model (CNN) using tensorflow-keras. I have some new images which I need to feed into the same model in order to increase the accuracy of the existing model.
I tried using the following code. But it decreases the accuracy.
re_calibrated_model = loaded_model.fit_generator(new_training_set,
                                                 steps_per_epoch=int(stp),
                                                 epochs=int(epc),
                                                 validation_data=new_test_set,
                                                 verbose=1,
                                                 validation_steps=50)
Is there any method that I can use to re-calibrate my CNN model?

Your new training session does not start from the previous training accuracy if you use a completely different dataset for the second round of training.
To get what you intend, you need to feed in old_images + new_images.
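As a rough illustration of feeding old and new images together, here is a minimal sketch in plain Python. The function merge_datasets and the sample (path, label) pairs are hypothetical; in a Keras workflow the merged list would then be handed to whatever generator or loader you already use.

```python
import random

def merge_datasets(old_samples, new_samples, seed=0):
    """Combine old and new (path, label) pairs, drop duplicates, and shuffle."""
    # dict.fromkeys keeps the first occurrence of each pair and preserves order.
    merged = list(dict.fromkeys(old_samples + new_samples))
    random.Random(seed).shuffle(merged)
    return merged

# Hypothetical sample lists; one image appears in both sets.
old = [("old/cat_1.jpg", "cat"), ("old/dog_1.jpg", "dog")]
new = [("new/cat_2.jpg", "cat"), ("old/dog_1.jpg", "dog")]
combined = merge_datasets(old, new)
print(len(combined))
```

Shuffling the merged list keeps each batch a mix of old and new images, so the model is not pushed toward the new data alone.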

What I normally do is train the CNN model on the first batch of images and save that model. If I need to "retrain" the model with additional images, I load the previously saved model from disk, supply the inputs (test and train), and call the fit method. As mentioned before, this only works if your outputs are exactly the same, i.e. if you are using the same input and output classes.
In my experience, training models on different image batches does not necessarily make your model more or less accurate, but it does affect the training time. Since I am using a CPU to train, my training time is about 30% faster if I train two batches of 1000 images each as opposed to training one batch of 2000 images, for example.

Can I continue training from the final .weights file with more training and test images?

I trained my custom object detector with darknet YOLOv3 until the average loss decreased to 0.06, but now I want to train it with more training and test images (maybe also deleting some of the image files). Can I do this and continue training from the final .weights file, or should I start from the beginning?

Yes, you can use the currently trained model (.weights file) as the pre-trained model for the new training session. For example, if you use the AlexeyAB repository, you can train your model with a command like this:
darknet.exe detector train data/obj.data yolo-obj.cfg darknet53.conv.74
where darknet53.conv.74 is the pre-trained model; to continue from your own model, pass your .weights file in its place.
In the new training session you can add or remove images. However, the basic configuration must stay correct (the number of classes, etc.).
According to the page I mentioned, in the original repository the weights file is saved only once every 10 000 iterations.
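After adding or removing images, the image list that obj.data points to (train = ...) has to be regenerated. The following is a minimal sketch, assuming the usual darknet layout where each image has a label file with the same name and a .txt extension next to it. The helper names and the temporary demo directory are hypothetical.

```python
import os
import tempfile

IMAGE_EXTS = (".jpg", ".jpeg", ".png")

def list_labelled_images(image_dir):
    """Return paths of images in image_dir that have a matching .txt label file."""
    paths = []
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        label = os.path.join(image_dir, stem + ".txt")
        if ext.lower() in IMAGE_EXTS and os.path.exists(label):
            paths.append(os.path.join(image_dir, name))
    return paths

def write_list(paths, list_file):
    """Write one image path per line, the format used for train.txt and test.txt."""
    with open(list_file, "w") as f:
        f.write("\n".join(paths) + "\n")

# Demo on a throwaway directory: b.jpg has no label file and is skipped.
demo_dir = tempfile.mkdtemp()
for name in ("a.jpg", "a.txt", "b.jpg"):
    open(os.path.join(demo_dir, name), "w").close()
found = list_labelled_images(demo_dir)
write_list(found, os.path.join(demo_dir, "train.txt"))
print(found)
```

Skipping unlabelled images is a design choice for this sketch; if you deliberately include background images, they would need empty label files to pass this check.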

If you have just modified the data set, but are not interested in changing the model architecture, you can resume directly from the previously saved model using darknet from AlexeyAB/darknet. For example:
darknet.exe detector train cfg/obj.data cfg/yolov3.cfg yolov3_weights_last.weights -clear -map
The -clear flag resets the iteration count saved in the weights, which is appropriate when the data set changes: the learning rate often depends on the iteration count, and you probably don't want to change the configuration.
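To see why the saved iteration count matters for the learning rate, here is a simplified model of a burn-in plus "steps" schedule, using the values from the yolov3_custom.cfg shown earlier on this page. It is a sketch of my understanding of the policy, not darknet's actual code; in particular, the burn-in exponent power=4 is an assumption.

```python
def current_learning_rate(iteration, base_lr=0.001, burn_in=1000,
                          steps=(4800, 5400), scales=(0.1, 0.1), power=4):
    """Approximate learning rate at a given iteration for policy=steps."""
    # Warm-up: ramp from 0 up to base_lr during the first burn_in iterations.
    if iteration < burn_in:
        return base_lr * (iteration / burn_in) ** power
    # After warm-up: multiply by each scale whose step has been reached.
    rate = base_lr
    for step, scale in zip(steps, scales):
        if iteration < step:
            break
        rate *= scale
    return rate

for it in (500, 2000, 5000, 6000):
    print(it, current_learning_rate(it))
```

Under these assumptions, weights resumed at iteration 6000 train at one hundredth of the base rate, whereas resetting the counter with -clear starts the schedule over from the warm-up.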

You need to specify more epochs if you resume. For example, if you trained to 300/300, resuming will also train only to 300 (starting at 300) unless you specify more epochs.
python train.py --resume

You can resume training from the previously saved weights of your custom model.
Use "yolov3_custom_last.weights" instead of the default pre-trained weights.
In case you run into issues when resuming, try changing the batch size.
This should work and resume your model training with the new set of images :)

Open the .cfg file, find the max_batches setting (it may be around line 22), and set a bigger value:
max_batches = 500200
max_batches is the total number of training iterations.
How to continute training after 50000 iteration? #2633

How to train generator from GAN?

After reading GAN tutorials and code samples I still don't understand how the generator is trained. Let's say we have a simple case:
- generator input is noise and output is grayscale image 10x10
- discriminator input is image 10x10 and output is single value from 0 to 1 (fake or true)
Training the discriminator is easy: take its output for a real image and expect 1; take its output for a fake image and expect 0. We're working with the real output size here, a single value.
But training the generator is different: we take the discriminator's output for a fake image (1 value) and set the expected output to one. That sounds more like training the discriminator again. The output of the generator is a 10x10 image, so how can we train it with only a single value? How would backpropagation work in this case?

To train the generator, you have to backpropagate through the entire combined model while freezing the weights of the discriminator, so that only the generator is updated.
For this, we have to compute d(g(z; θg); θd), where θg and θd are the weights of the generator and the discriminator. To update the generator, we compute the gradient with respect to θg only, ∂loss(d(g(z; θg); θd)) / ∂θg, and then update θg using normal gradient descent.
In Keras, this might look something like this (using the functional API):
genInput = Input(input_shape)
discriminator = ...
generator = ...
discriminator.trainable = True
discriminator.compile(...)
discriminator.trainable = False
combined = Model(genInput, discriminator(generator(genInput)))
combined.compile(...)
By setting trainable to False, already compiled models are not affected, only models compiled in the future are frozen. Thereby, the discriminator is trainable as a standalone model but frozen in the combined model.
Then, to train your GAN:
X_real = ...
noise = ...
X_gen = generator.predict(noise)
# This will only train the discriminator
loss_real = discriminator.train_on_batch(X_real, one_out)
loss_fake = discriminator.train_on_batch(X_gen, zero_out)
d_loss = 0.5 * np.add(loss_real, loss_fake)
noise = ...
# This will only train the generator.
g_loss = combined.train_on_batch(noise, one_out)
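To make the gradient ∂loss(d(g(z; θg); θd)) / ∂θg concrete, here is a tiny worked example in plain Python with one scalar weight per network. The toy models g(z) = θg·z and d(x) = sigmoid(θd·x) and the loss -log d(g(z)) are assumptions chosen for illustration; they are not the networks from the question. It shows that the single discriminator output is enough: the chain rule carries it back through the discriminator to the generator weight.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def generator_loss(theta_g, theta_d, z):
    """Loss -log d(g(z)): small when the discriminator believes the fake is real."""
    fake = theta_g * z                 # g(z; theta_g)
    score = sigmoid(theta_d * fake)    # d(g(z); theta_d)
    return -math.log(score)

def generator_grad(theta_g, theta_d, z):
    """Analytic d loss / d theta_g via the chain rule; theta_d is held fixed."""
    score = sigmoid(theta_d * theta_g * z)
    # d(-log s)/d(pre-activation) = -(1 - s); d(pre-activation)/d theta_g = theta_d * z
    return -(1.0 - score) * theta_d * z

theta_g, theta_d, z = 0.5, 2.0, 1.5
grad = generator_grad(theta_g, theta_d, z)

# Finite-difference check of the analytic gradient.
h = 1e-6
numeric = (generator_loss(theta_g + h, theta_d, z)
           - generator_loss(theta_g - h, theta_d, z)) / (2 * h)

# One gradient-descent step on the generator only.
new_theta_g = theta_g - 0.1 * grad
print(grad, numeric)
print(generator_loss(theta_g, theta_d, z), generator_loss(new_theta_g, theta_d, z))
```

The discriminator weight appears in the gradient but is never updated here, which is the role the frozen discriminator plays in the combined Keras model.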

I guess the best way to understand the Generator training procedure is to review the whole training loop.
For each epoch:
Update the Discriminator:
- forward pass a mini-batch of real images through the Discriminator;
- compute the Discriminator loss and calculate gradients for the backward pass;
- generate a mini-batch of fake images via the Generator;
- forward pass the generated fake mini-batch through the Discriminator;
- compute the Discriminator loss and derive gradients for the backward pass;
- add the real mini-batch gradients and the fake mini-batch gradients;
- update the Discriminator (use Adam or SGD).
Update the Generator:
- flip the targets: fake images get labeled as real for the Generator. Note: this step lets the Generator be trained by cross-entropy minimization. It helps overcome the Generator's vanishing gradients, which arise if we keep implementing the GAN minimax game as is;
- forward pass a mini-batch of fake images through the updated Discriminator;
- compute the Generator loss based on the updated Discriminator output, e.g. loss function(the probability, estimated by the Discriminator, that the fake image is real; 1). Note: here 1 represents the Generator's label for fake images as real;
- update the Generator (use Adam or SGD).
I hope this helps. As you can see from the training procedure, GAN players are somewhat "cooperative, in the sense that the discriminator estimates the ratio of data to model distribution densities and then freely shares this information with the generator. From this point of view, the discriminator is more like a teacher instructing the generator in how to improve than an adversary" (cited from I. Goodfellow's tutorial).
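The loop above can be sketched end to end with a toy one-dimensional GAN in plain Python. Everything here is an assumption made for illustration: a linear generator g(z) = a·z + b, a logistic discriminator d(x) = sigmoid(w·x + c), hand-derived gradients, plain SGD instead of Adam, and real data drawn from a normal distribution with mean 4.

```python
import math
import random

def sigmoid(x):
    # Numerically stable form to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def discriminator_step(p, reals, noise, lr):
    """One SGD step on w and c; the generator weights a and b are left untouched."""
    grad_w = grad_c = 0.0
    for x_real, z in zip(reals, noise):
        x_fake = p["a"] * z + p["b"]
        s_real = sigmoid(p["w"] * x_real + p["c"])
        s_fake = sigmoid(p["w"] * x_fake + p["c"])
        # loss = -log(s_real) - log(1 - s_fake); real and fake gradients are added
        grad_w += (s_real - 1.0) * x_real + s_fake * x_fake
        grad_c += (s_real - 1.0) + s_fake
    n = len(reals)
    return dict(p, w=p["w"] - lr * grad_w / n, c=p["c"] - lr * grad_c / n)

def generator_step(p, noise, lr):
    """One SGD step on a and b; the discriminator weights w and c are frozen."""
    grad_a = grad_b = 0.0
    for z in noise:
        s_fake = sigmoid(p["w"] * (p["a"] * z + p["b"]) + p["c"])
        # flipped target: fakes are labeled 1, so loss = -log(s_fake)
        upstream = (s_fake - 1.0) * p["w"]
        grad_a += upstream * z
        grad_b += upstream
    n = len(noise)
    return dict(p, a=p["a"] - lr * grad_a / n, b=p["b"] - lr * grad_b / n)

rng = random.Random(0)
params = {"a": 1.0, "b": 0.0, "w": 0.1, "c": 0.0}
for _ in range(200):
    reals = [rng.gauss(4.0, 0.5) for _ in range(16)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16)]
    params = discriminator_step(params, reals, noise, lr=0.05)
    noise = [rng.gauss(0.0, 1.0) for _ in range(16)]
    params = generator_step(params, noise, lr=0.05)
print(params)
```

Each step function returns a new parameter dictionary and changes only its own player's weights, which mirrors the "update Discriminator" and "update Generator" phases listed above.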

Can I retrain an old model with new data using TensorFlow?

I am new to TensorFlow and I am just trying to see if my idea is even possible.
I have trained a multi-class classifier model. I can now classify an input sentence, but I would like to change the result of the CNN, for example to improve the classification score or to change the predicted class.
I want to try training the already trained model on just a single sentence with its class. Is this possible?

If I understand your question correctly, you are trying to reload a previously trained model either to run it through further iterations, test it on a new sentence, or fine tune the model a bit. If this is the case, yes you can do this. Look into saving and restoring models (https://www.tensorflow.org/api_guides/python/state_ops#Saving_and_Restoring_Variables).
To give you a rough outline, when you initially train your model, after setting up the network architecture, set up a saver:
import tensorflow as tf  # TensorFlow 1.x API
trainable_var = tf.trainable_variables()
sess = tf.Session()
saver = tf.train.Saver()
sess.run(tf.global_variables_initializer())
# Run/train your model until some completion criteria is reached
#....
#....
saver.save(sess, 'model.ckpt')
Now, to reload your model:
sess = tf.Session()
saver = tf.train.import_meta_graph('model.ckpt.meta')
saver.restore(sess, 'model.ckpt')  # restore needs the session as its first argument
# Note: if you have already defined all variables before restoring the model, import_meta_graph is not necessary
This will give you access to all the trained variables and you can now feed in whatever new sentence you have. Hope this helps.
