How to choose the threshold of the output of a DNN in TensorFlow?

I am currently learning to make neural networks with TensorFlow, and the library provides a very convenient way to create one with the premade estimator DNNClassifier, as in this tutorial: https://www.tensorflow.org/get_started/premade_estimators.
However, I don't see how to choose the final threshold of the output layer before making the prediction:
For instance, let's say we have a binary classifier between 'KO' and 'OK'. The end of the neural network computes the probabilities of each possibility for a specific sample, for instance [0.4, 0.6] (so 40% that the answer is 'KO' and 60% that the answer is 'OK'). I assume that the DNN takes a threshold of 0.5 by default, so it will answer 'OK' here. But I want to change this threshold to 0.8, so that unless the DNN is at least 80% sure of 'OK', it answers 'KO' (in order to tune the FP rate and the FN rate).
How can we do that?
Thanks in advance for your help.

The premade estimators are somewhat rigid. The DNNClassifier, for example, does not provide a mechanism to change the loss function or to obtain the logits/probabilities output by the classifier, as you've discovered.
To modify the logic of how predictions are generated, or to modify your loss function, you'll have to create a custom Estimator. This tutorial walks you through that process.
If you haven't invested too much time learning how to use the Estimator API yet, I recommend you also acquaint yourself with Keras, another high-level API for building and training deep learning models in TensorFlow; you might find it easier to build custom models with Keras rather than Estimators.
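If you do go the Keras route, a minimal sketch of the thresholding itself could look like this (the two-class softmax model and its layer sizes are illustrative assumptions, matching the [0.4, 0.6] setup in the question):

    import numpy as np
    import tensorflow as tf

    # Hypothetical two-class softmax model: output is [P('KO'), P('OK')].
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation='relu', input_shape=(4,)),
        tf.keras.layers.Dense(2, activation='softmax'),
    ])

    def predict_with_threshold(model, x, threshold=0.8):
        # Answer 'OK' only when P('OK') >= threshold; otherwise answer 'KO'.
        probs = model.predict(x)            # shape (n_samples, 2)
        return np.where(probs[:, 1] >= threshold, 'OK', 'KO')

The key point is that the threshold lives in your prediction code, not in the trained network, so you can sweep it afterwards to trade off the FP rate against the FN rate.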

Related

Building custom keras loss function with use of gradients of output with respect to output

I am currently trying to build a neural network to evaluate option prices. It is well known that putting no-arbitrage constraints inside the loss function of a neural network will enhance its out-of-sample performance. To do so, I need to implement the custom loss function described on page 7 of this paper:
https://arxiv.org/pdf/1906.03507.pdf
However, I am unable to do so consistently. From what I have understood so far, this would require customizing the train step.
Does anyone have a code example of something similar, or could you explain how I should proceed?
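Not an exact implementation of the paper's loss, but a minimal sketch of what a customized train step can look like in tf.keras, assuming a soft penalty on the gradients of the output with respect to the inputs (the penalty form and its weight lam are illustrative assumptions, not the paper's exact constraints):

    import tensorflow as tf

    class ConstrainedModel(tf.keras.Model):
        def train_step(self, data):
            x, y = data
            lam = 0.1  # illustrative penalty weight
            with tf.GradientTape() as tape:
                with tf.GradientTape() as inner:
                    inner.watch(x)
                    y_pred = self(x, training=True)
                dy_dx = inner.gradient(y_pred, x)  # d(output)/d(input)
                # Illustrative soft constraint: penalize negative gradients.
                penalty = tf.reduce_mean(tf.nn.relu(-dy_dx))
                loss = self.compiled_loss(y, y_pred) + lam * penalty
            grads = tape.gradient(loss, self.trainable_variables)
            self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
            self.compiled_metrics.update_state(y, y_pred)
            return {m.name: m.result() for m in self.metrics}

You can build the network with the functional API and wrap it as ConstrainedModel(inputs, outputs); compile() and fit() will then use this train step.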

Change the spatial input dimension during training

I am training a YOLOv4 (fully convolutional) in TensorFlow 2.3.0.
I would like to change the spatial input shape of the network during training, to further adjust the weights to different scales.
Is this possible?
EDIT:
I know of the existence of darknet, but it lacks some very specific augmentations that I use and have implemented in my repo; that is why I ask explicitly about TensorFlow.
To be more precise about what I want to do:
I want to train for several batches at Y1xX1xC, then change the input size to Y2xX2xC and train again for several batches, and so on.
It is not possible. In the past, people trained several networks for different scales, but the current state-of-the-art approach is feature pyramids.
https://arxiv.org/pdf/1612.03144.pdf
Another great candidate is dilated convolutions, which can learn long-distance dependencies among pixels at varying distances. You can concatenate their outputs, and the model will then learn which distance is important for which case (see the sketch after the link below).
https://towardsdatascience.com/review-dilated-convolution-semantic-segmentation-9d5a5bd768f5
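A tiny tf.keras illustration of the concatenation idea (widths and dilation rates are arbitrary):

    import tensorflow as tf

    inputs = tf.keras.Input(shape=(None, None, 3))
    branches = [
        tf.keras.layers.Conv2D(16, 3, padding='same', dilation_rate=d,
                               activation='relu')(inputs)
        for d in (1, 2, 4)  # increasing receptive fields
    ]
    # The model can learn which scale matters from the concatenated features.
    outputs = tf.keras.layers.Concatenate()(branches)
    model = tf.keras.Model(inputs, outputs)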
It's important to mention which TensorFlow repository you're using. You can definitely achieve this; the idea is to keep the spatial input dimension fixed within a single batch.
But an even better approach is to use the darknet repository from AlexeyAB: https://github.com/AlexeyAB/darknet
Just set random=1 in https://github.com/AlexeyAB/darknet/blob/master/cfg/yolov4.cfg [line 1149]. It will train your network with randomly varying spatial dimensions.
One thing you can do is start your training with the AlexeyAB repo with random=1 set, then take the trained weights file to TensorFlow for fine-tuning.
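If you stay in pure TensorFlow, a minimal sketch of per-batch multi-scale training could look like the following (the model, scales, data, and loss are placeholder assumptions; for YOLO you would also rescale the box targets to each chosen size):

    import random
    import tensorflow as tf

    # Fully convolutional model: None spatial dims accept any input size.
    inputs = tf.keras.Input(shape=(None, None, 3))
    x = tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu')(inputs)
    outputs = tf.keras.layers.Conv2D(1, 3, padding='same')(x)
    model = tf.keras.Model(inputs, outputs)
    optimizer = tf.keras.optimizers.Adam()

    sizes = [(320, 320), (416, 416), (512, 512)]  # illustrative scales

    # Dummy batches stand in for a real detection pipeline.
    dataset = [tf.random.uniform((4, 608, 608, 3)) for _ in range(10)]

    for images in dataset:
        h, w = random.choice(sizes)               # new size for this batch
        images = tf.image.resize(images, (h, w))  # fixed size within the batch
        with tf.GradientTape() as tape:
            preds = model(images, training=True)
            loss = tf.reduce_mean(tf.square(preds))  # placeholder loss
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))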

It is possible to change loss in Tensorflow object detection api?

I want to change the loss function for an object detection model (such as SSD).
Q1: Where do I modify the loss function for SSD?
Q2: Is it possible to fine-tune ssd_mobilenet on my dataset with my own loss? Is that a good approach, or must ssd_mobilenet be trained from scratch with my loss function?
Q1:
If you are using the Object Detection API, then a config is used to define the network and the loss, such as these:
https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs
Looking at a basic ssd mobilenet config, you should see the losses it is using, including a classification loss and a localization loss. You can look at other configs to see other loss options, look at the source code for the full list of options, or even modify the source code to add your own loss; a trimmed excerpt of the loss block is shown below.
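For reference, the loss block in the ssd_mobilenet_v1 config looks roughly like this (trimmed excerpt; swapping in another supported option, such as a focal classification loss, is how you change the loss from the config):

    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }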
Q2:
It is certainly possible, but you will need to dig into the internals of how the Object Detection API works, modify it to add your loss function, and train on your dataset. It will be more work than you might expect. Knowing nothing about your dataset or metric, I expect your fine-tuned result will converge more quickly than a from-scratch result and give comparable results.
You can change the loss function in the configuration file, as at line 198 of this config: https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config. When you do this, the performance will be drastically reduced at first; if you retrain the network, performance may improve.
If you can elaborate on your goal more clearly, it would be easier to suggest a solution.

Quantization aware training examples?

I want to do quantization-aware training with a basic convolutional neural network that I define directly in TensorFlow (I don't want to use other APIs such as Keras). The only resource that I am aware of is the readme here:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize
However, it's not clear exactly where the different quantization commands should go in the overall process of training and then freezing the graph for actual inference.
Therefore I am wondering if there is any code example out there that shows how to define, train, and freeze a simple convolutional neural network with quantization-aware training in TensorFlow?
It seems that others have had the same question as well, see for instance here.
Thanks!
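In case it helps others: based on that readme, a minimal sketch of where the rewrite calls go is below (TF 1.x graph mode; the tiny model, shapes, and training/freezing steps are illustrative assumptions):

    import tensorflow as tf  # TF 1.x

    def build_model(images):
        # Tiny illustrative conv net; your network goes here.
        net = tf.layers.conv2d(images, 16, 3, activation=tf.nn.relu)
        net = tf.layers.flatten(net)
        return tf.layers.dense(net, 10)

    # Training graph: add fake-quant ops after the loss, before the optimizer.
    train_graph = tf.Graph()
    with train_graph.as_default():
        images = tf.placeholder(tf.float32, [None, 28, 28, 1])
        labels = tf.placeholder(tf.int64, [None])
        logits = build_model(images)
        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
        tf.contrib.quantize.create_training_graph(input_graph=train_graph,
                                                  quant_delay=2000)
        train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
        # ... run training and save a checkpoint as usual ...

    # Separate eval graph: rewrite for inference, then restore the checkpoint
    # and freeze the resulting GraphDef with the freeze_graph tool.
    eval_graph = tf.Graph()
    with eval_graph.as_default():
        images = tf.placeholder(tf.float32, [None, 28, 28, 1])
        logits = build_model(images)
        tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)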

Neural Network - how to test that it is implemented properly?

I've implemented a neural network using TensorFlow. During the implementation and training, I found several not-so-trivial bugs.
Example: during training I had the same mini-batch loss for different steps/epochs, but different accuracy.
Now the neural network seems to be ready and working properly. I haven't managed to train it well yet, but I am working on it.
Anyway, I would like to check somehow that I haven't made any computational errors there. I am thinking about generating some artificial data for a "fake" classification problem with, let's say, 4 features. The classification should have a very clear, human-understandable dependency between the classification output and the 4 features. The idea is to try to train the NN on it and see how it performs.
What do you think?
Stanford's CS231n has a couple of general tips for this, like gradient checking.
If you're just learning neural networks, why don't you try to run your implementation on some known data? Many courses provide error and loss curves from models with specified hyperparameters, so you can check whether your implementation's behavior differs significantly from a correct implementation.
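To make the artificial-data check from the question concrete, here is a minimal sketch with an obvious, human-understandable rule (all sizes and hyperparameters are illustrative); a correct implementation should get close to perfect accuracy within a few epochs:

    import numpy as np
    import tensorflow as tf

    rng = np.random.default_rng(0)
    X = rng.normal(size=(10000, 4)).astype('float32')
    y = (X.sum(axis=1) > 0).astype('int32')  # label = 1 iff the features sum > 0

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation='relu', input_shape=(4,)),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2)
    # Validation accuracy far below 1.0 points to a bug in the pipeline
    # rather than in the data.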