I want to change the loss function of an object detection network (such as SSD).
Q1: Where do I modify the loss function for SSD?
Q2: Is it possible to fine-tune ssd_mobilenet on my dataset with my own loss function? Is that a good approach, or must ssd_mobilenet be trained from scratch with my loss function?
Q1:
If you are using the Object Detection API, then a config file is used to define the network and the loss, such as these:
https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs
Looking at a basic SSD MobileNet config, you should see the losses it uses, including a classification loss and a localization loss. You can look at other configs to see other loss options, check the source code for the full list of options, or even modify the source code to add your own loss.
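For reference, the loss section of that SSD MobileNet config looks roughly like the excerpt below (quoted from memory, so check the linked file for the exact contents). Swapping the classification_loss or localization_loss entries for other options defined in losses.proto is how you change the loss from the config alone:

    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }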
Q2:
It is certainly possible, but you will need to dig into the internals of how the Object Detection API works, modify it to add your loss function, and train on your dataset. It will be more work than you might expect. Knowing nothing about your dataset or metric, I expect your fine-tuned result will converge more quickly than a from-scratch result and give comparable results.
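To give a sense of where such a change would go: the loss implementations live in object_detection/core/losses.py, where each loss subclasses a common Loss base class. A very rough sketch of a custom localization loss is shown below; treat it as illustrative only, since the exact base-class signature and expected reduction differ between API versions.

    import tensorflow as tf
    from object_detection.core import losses

    class MyLocalizationLoss(losses.Loss):
        """Illustrative custom localization loss: a simple weighted L1."""

        def _compute_loss(self, prediction_tensor, target_tensor, weights):
            # prediction_tensor, target_tensor: [batch, num_anchors, box_code_size]
            # weights: [batch, num_anchors]
            per_anchor = tf.reduce_sum(tf.abs(prediction_tensor - target_tensor), axis=2)
            return per_anchor * weights

You would also have to expose the new loss in object_detection/protos/losses.proto and object_detection/builders/losses_builder.py before a config file can select it.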
You can change the loss function in the configuration file, for example around line 198 of this config: https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config. When you do this, performance will initially drop drastically; if you retrain the network, performance may improve.
If you can describe your goal more clearly, it will be easier to suggest a solution.
Related
I am currently trying to build a neural network to evaluate option prices. It is well known that putting no-arbitrage constraints inside the loss function of a neural network enhances its out-of-sample performance. To do so, I need to implement the custom loss function described on page 7 of this paper:
https://arxiv.org/pdf/1906.03507.pdf
However, I have not been able to do so consistently. From what I have understood so far, this requires customizing the train step.
Does anyone have a code example of something similar, or could you explain how I should proceed?
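To make the question concrete, the skeleton I have been experimenting with looks roughly like the sketch below, written against tf.keras (TF 2.x). The penalty term is only a placeholder positivity constraint; the actual no-arbitrage terms from page 7 of the paper would replace it.

    import tensorflow as tf

    class ConstrainedPricer(tf.keras.Model):
        """Option-pricing network whose training loss adds a constraint penalty."""

        def train_step(self, data):
            x, y = data
            with tf.GradientTape() as tape:
                y_pred = self(x, training=True)
                # Standard pricing error (whatever loss was passed to compile()).
                data_loss = self.compiled_loss(y, y_pred)
                # Placeholder penalty: penalize negative predicted prices.
                # The paper's no-arbitrage constraints would go here instead.
                penalty = tf.reduce_mean(tf.nn.relu(-y_pred))
                loss = data_loss + penalty
            grads = tape.gradient(loss, self.trainable_variables)
            self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
            self.compiled_metrics.update_state(y, y_pred)
            return {m.name: m.result() for m in self.metrics}

The model would then be compiled and fitted as usual (e.g. model.compile(optimizer='adam', loss='mse') followed by model.fit(...)).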
I am currently implementing an object detection model for face detection, and tried training it on just a single instance of data (input: image, target: labels).
But even after training the model for a long time, it is not able to converge optimally; in other words, it seems to reach a saddle point from which it cannot escape. So I was wondering whether this is because the data is just one instance and the model is somehow unable to learn, or whether something is wrong with my loss function.
I am using the YOLO architecture and a YOLO-like loss function to train my model (a simplified sketch of the loss structure is included below).
I am using the Adam optimizer to minimize the loss over time.
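For clarity, the loss follows the usual YOLO-v1-style structure; a simplified sketch of that structure (not my exact code) is shown below, assuming each grid cell's target/prediction is laid out as [x, y, w, h, objectness, class scores].

    import tensorflow as tf

    def yolo_like_loss(y_true, y_pred, lambda_coord=5.0, lambda_noobj=0.5):
        # y_true, y_pred: [batch, S, S, 5 + num_classes]
        obj_mask = y_true[..., 4:5]      # 1 where a ground-truth box is present
        noobj_mask = 1.0 - obj_mask

        # Localization term: box coordinate errors, only in cells with an object.
        coord_loss = tf.reduce_sum(obj_mask * tf.square(y_true[..., 0:4] - y_pred[..., 0:4]))

        # Objectness term: confidence errors, down-weighted in empty cells.
        obj_loss = tf.reduce_sum(obj_mask * tf.square(y_true[..., 4:5] - y_pred[..., 4:5]))
        noobj_loss = tf.reduce_sum(noobj_mask * tf.square(y_true[..., 4:5] - y_pred[..., 4:5]))

        # Classification term: class-score errors in cells with an object.
        class_loss = tf.reduce_sum(obj_mask * tf.square(y_true[..., 5:] - y_pred[..., 5:]))

        return lambda_coord * coord_loss + obj_loss + lambda_noobj * noobj_loss + class_loss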
thanks,
First of all, I want to state that I am familiar with the benefits of transfer learning. Moreover, I am able to train a pretrained model from the model zoo on my dataset. But for research purposes I want to train my model from scratch, without transfer learning.
I want to adapt the Faster R-CNN ResNet-101 implementation from TensorFlow's Object Detection API to my dataset. If I use one of the pretrained models, the training goes as expected and the loss stays in a 'normal' range (never above about 6). But if I do not use transfer learning, the loss very frequently jumps to extremely high values (about 80,000,000), although between those spikes the loss is in the normal range. In addition, I do not see any predictions of the network on images in TensorBoard; it seems like the network does not make any predictions at all. The only thing I change is to comment out these two lines in the model.config file:
# fine_tune_checkpoint: 'path'
# from_detection_checkpoint: true
I tried a lot of things to find the cause: changed the optimizer, changed the learning rate, used gradient clipping, changed the initializer, and used different machines to train on, but nothing helps. Moreover, I inspected my label_map as well as my record file. To ensure that those files are correct, I redid the steps mentioned above using the Pascal VOC dataset, the record-creation script, and the label map from the API, but even with this unmodified code from the Object Detection API, the loss explodes (TensorFlow Object Detection API's own inputs).
I am re-training SSD MobileNet with 900 images from the Berkeley DeepDrive dataset, and evaluating on 100 images from that dataset.
The problem is that after about 24 hours of training, the TotalLoss seems unable to go below 2.0:
And the corresponding mAP score is quite unstable:
In fact, I have tried training for about 48 hours, and the TotalLoss just cannot go below 2.0, ranging from roughly 2.5 to 3.0. And during that time, the mAP is even lower.
So here is my question: given my situation (I really don't need a "high-precision" model; as you can see, I picked 900 images for training and would simply like to do a PoC model training/prediction, and that's it), when should I stop training and obtain a reasonably performing model?
Indeed, for detection you need to fine-tune the network. Since you are using SSD, there are already some resources out there:
https://gluon-cv.mxnet.io/build/examples_detection/finetune_detection.html (this one is specifically for an SSD model; it uses MXNet, but the same approach applies in TF)
You can watch a very nice fine-tuning intro here
This repo has a nice fine-tuning option enabled, as long as you write your own data loader; check it out here
In general, your error can be attributed to many factors: the learning rate you are using, the characteristics of the images themselves (are they normalized?), and so on. If the SSD network you are using was trained with normalized data and you don't normalize when retraining, then you will get stuck while learning. Also, what learning rate are you using?
From the model zoo I can see that for SSD there are models trained on COCO and models trained on Open Images.
If, for example, you are using ssd_inception_v2_coco, there is a truncated_normal_initializer in the input layers, so take that into consideration; also make sure the input sizes are the same as the ones you provide to the model.
You can get very good detections even with little data if you also include many augmentations (see the config sketch below) and take into account the rest of the things I mentioned; more details on your code would help to pinpoint where the problem lies.
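If you are training through the TF Object Detection API, the augmentations are declared in the train_config section of the .config file; the stock SSD configs already ship with entries roughly like the ones below (option names may vary slightly between API versions):

    train_config: {
      # batch size, optimizer, fine_tune_checkpoint, etc. go here
      data_augmentation_options {
        random_horizontal_flip {
        }
      }
      data_augmentation_options {
        ssd_random_crop {
        }
      }
    }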
I am currently learning to build neural networks with TensorFlow. The library provides a very convenient way to create one with the DNNClassifier estimator, as in this tutorial: https://www.tensorflow.org/get_started/premade_estimators.
However, I can't see how to choose the final threshold applied to the output layer before making the prediction.
For instance, let's say we have a binary classifier between 'KO' and 'OK'. The end of the neural network computes the probabilities of each class for a specific sample, for instance [0.4, 0.6] (so 40% that the answer is 'KO' and 60% that the answer is 'OK'). I assume that the DNN uses a default threshold of 0.5, so it will answer 'OK' here. But I want to change this threshold to 0.8, so that if the DNN is not at least 80% sure of 'OK', it will answer 'KO' (in order to tune the FP rate and the FN rate).
How can we do that?
Thanks in advance for your help.
The premade estimators are somewhat rigid. The DNNClassifier, for example, does not provide a mechanism to change the loss function or to obtain the logits/probabilities output by the classifier, as you've discovered.
To modify the logic of how predictions are generated, or to modify your loss function, you'll have to create a custom Estimator. This tutorial walks you through that process.
If you haven't invested too much time learning how to use the Estimator API yet, I recommend you also acquaint yourself with Keras, another high-level API for building and training deep learning models in TensorFlow; you might find it easier to build custom models with Keras rather than Estimators.
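As a rough illustration of the thresholding idea in Keras (a toy binary classifier with a made-up input shape, not the DNNClassifier itself):

    import numpy as np
    import tensorflow as tf

    # Toy binary classifier: a single sigmoid output read as P(label == 'OK').
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    # ... model.fit(x_train, y_train, ...) goes here ...

    def predict_with_threshold(model, x, threshold=0.8):
        # Answer 'OK' only when the model is at least `threshold` confident.
        probs = model.predict(x)[:, 0]
        return np.where(probs >= threshold, "OK", "KO")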