How to penalize the loss of one class more than the other in TensorFlow for a multi-class problem?

Let's say my model has two classes, Class 1 and Class 2. Both classes have an equal amount of training and testing data. But I want to penalize the loss of Class 1 more than Class 2, so that one class produces fewer false positives than the other (I want the model to perform better for one class than the other).
How do I achieve this in TensorFlow?

The thing you are looking for is probably
weighted_cross_entropy.
It gives closely related context, similar to #Sazzad 's answer, but specific to TensorFlow. To quote the documentation:
This is like sigmoid_cross_entropy_with_logits() except that pos_weight, allows one to trade off recall and precision by up- or down-weighting the cost of a positive error relative to a negative error.
It accepts an additional argument pos_weight. Also note that this only works for binary classification, which is the case in the example you described. If there might be other classes besides the two, this would not work.
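As a rough illustration (not part of the original answer), a call could look like the following, assuming TensorFlow 2. The label and logit values are made up; pos_weight is the knob that trades off the two error types:

import tensorflow as tf

# Toy 0/1 labels and raw logits, purely for illustration.
labels = tf.constant([[1.0], [0.0], [1.0], [0.0]])
logits = tf.constant([[2.0], [-1.0], [0.5], [1.5]])

# pos_weight > 1 makes errors on the positive class cost more;
# pos_weight < 1 shifts the penalty toward the negative class.
loss = tf.nn.weighted_cross_entropy_with_logits(
    labels=labels, logits=logits, pos_weight=5.0)
mean_loss = tf.reduce_mean(loss)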

If I understand your question correctly, this is not a TensorFlow-specific concept; you can write your own loss. For binary classification, the cross-entropy loss is something like this:
loss = -(y * log(p) + (1 - y) * log(1 - p))
where y is the true label and p is the predicted probability. Here class 0 and class 1 have the same weight in the loss, so you can give more weight to one term. For example:
loss = -(5 * y * log(p) + (1 - y) * log(1 - p))
Hope it answers your question.
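A minimal hand-written sketch of that weighted loss in TensorFlow could look like this; the weights w1 and w0 are placeholders for whatever ratio you choose:

import tensorflow as tf

def weighted_bce(y_true, y_pred, w1=5.0, w0=1.0, eps=1e-7):
    # y_true: 0/1 labels, y_pred: predicted probabilities.
    # w1 scales errors on class 1, w0 scales errors on class 0.
    y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
    loss = -(w1 * y_true * tf.math.log(y_pred)
             + w0 * (1.0 - y_true) * tf.math.log(1.0 - y_pred))
    return tf.reduce_mean(loss)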

Related

Loss function for ordinal multi-class classification in PyTorch

I am a beginner with DNNs and PyTorch.
I am dealing with a multi-class classification problem where my labels are encoded as one-hot vectors of dimension D.
To this end, I am using CrossEntropyLoss. However, I now want to modify or change this criterion so that it penalizes predictions that are far from the actual class: classifying 4 instead of 5 should be penalized less than classifying 2 instead of 5.
Is there a function already built into PyTorch that implements this behavior? Otherwise, how can I modify CrossEntropyLoss to achieve it?
This could help you. It is a PyTorch implementation of ordinal regression:
https://www.ethanrosenthal.com/2018/12/06/spacecutter-ordinal-regression/
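If you would rather tweak the loss yourself instead of adopting that library, one possible sketch (my own illustration, not the spacecutter approach from the link) is to add an expected-distance penalty on top of CrossEntropyLoss; alpha is an assumed hyperparameter:

import torch
import torch.nn.functional as F

def distance_penalized_loss(logits, target, alpha=1.0):
    # logits: (batch, D) raw scores, target: (batch,) integer class indices.
    ce = F.cross_entropy(logits, target)
    probs = F.softmax(logits, dim=1)
    classes = torch.arange(logits.size(1), dtype=probs.dtype, device=logits.device)
    # Expected |predicted class - true class| under the predicted distribution,
    # so mistaking 5 for 4 costs less than mistaking 5 for 2.
    distance = (probs * (classes.unsqueeze(0) - target.unsqueeze(1).to(probs.dtype)).abs()).sum(dim=1)
    return ce + alpha * distance.mean()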

Balance Dataset for Tensorflow Object Detection

I currently want to use TensorFlow's Object Detection API for my custom problem.
I have already created the dataset, but it's pretty unbalanced.
The dataset has 3 classes, and my main problem is that one class has about 16k samples while another class has only about 2.5k samples.
So I think I have to balance the dataset. Someone told me that there is something called sample/class weights (not sure if this is 100% correct), which balances the samples during training so that the biggest class has a smaller impact on training than the smallest class.
I'm not able to find this method for balancing. Can someone please give me a hint where to start?
You can do normal cross entropy, giving you a ? x 1 tensor X of losses.
If you want class number N to count T times more, you can do
X = X * tf.reduce_sum(tf.multiply(one_hot_label, class_weight), axis=1)
tf.multiply scales each label by whatever weight you want, and tf.reduce_sum collapses the label vector to a scalar, so you end up with a ? x 1 tensor filled with the per-example class weightings. Then you simply multiply the tensor of losses by the tensor of weightings to achieve the desired result.
Since one class is 6.4 times more common than the other, I would apply the weightings 1 and 6.4 to the more common and less common class respectively. This means that every time the less common class occurs, it has 6.4 times the effect of the more common class, so it's as if the model saw the same number of samples from each.
You might want to normalize the weightings so that they add up to the number of classes; this matches the default case where all of the weightings are 1. For two classes that gives 2 * 1/7.4 and 2 * 6.4/7.4.
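As a concrete sketch of that recipe (with invented logits and labels, and the 1 / 6.4 weights used as an assumption), it could look like this:

import tensorflow as tf

# Toy batch with 3 classes; in practice the logits come from your detector.
one_hot_label = tf.constant([[1.0, 0.0, 0.0],
                             [0.0, 1.0, 0.0]])
logits = tf.constant([[2.0, 0.5, 0.1],
                      [0.3, 1.5, 0.2]])
class_weight = tf.constant([1.0, 6.4, 1.0])  # heavier weight on the rare class

# Normal cross entropy: one loss value per example.
X = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_label, logits=logits)

# tf.multiply scales each one-hot label by the class weights, and tf.reduce_sum
# collapses it to that example's single weight.
X = X * tf.reduce_sum(tf.multiply(one_hot_label, class_weight), axis=1)

loss = tf.reduce_mean(X)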

RNN LSTM Keras custom loss function

I'm beginning with Keras and TensorFlow.
I have an LSTM model learning on a dataset of stock prices.
I don't want my model to just learn to predict the next step, as it does today. I want it to learn, at each step, whether it should buy, sell or do nothing, and how much.
I think I need to write a custom loss function, but I really don't know how to code my concept: buy, sell or do nothing, and how much, based on a starting capital of, say, 100 units. The objective would be to have the highest possible capital at the end.
Should I take an existing loss function such as MSE and customise it? If so, how?
Should I let my model learn the time series first and then add a buy/sell layer (or layers)? If so, how?
Something else?
I am pretty lost.
Thanks a lot for your help.
Sam
I would try categorical cross-entropy.
I mean, you have three options: buy (0), sell (1), and do nothing (2). You can encode them like this:
[1,0,0] <- means 'buy'
[0,1,0] <- means 'sell'
[0,0,1] <- means 'do nothing'
And don't forget to add a softmax function at the end of your NN.
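A minimal Keras sketch of that setup (the window length of 30 and the layer size are assumptions, not something from the question):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.LSTM(64, input_shape=(30, 1)),     # 30 timesteps, 1 feature (price)
    layers.Dense(3, activation="softmax"),    # buy / sell / do nothing
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])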
From what I understand, we have a stock prices dataset and at each point we are required to predict the decision buy/sell/do nothing.
For each point, we should decide on a window size which we believe impacts the current point.
Use this window as the time-series input to an LSTM layer. Using a moving window, we can create multiple inputs. The corresponding output will be the decision, which can be encoded as a 3-bit one-hot vector.
For time point t, use the time series (0..t-1) as input and a decision of [0,0,1], [0,1,0] or [1,0,0] as output. The model will learn to predict probabilities for each decision.
To compute the loss, categorical cross-entropy will be useful, as mentioned by Paddy.
Also, if you haven't looked into pre-processing the data, detrending is useful in such cases. This link might be useful.
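For the moving-window idea, a rough helper (assuming prices is a 1-D NumPy array and decisions already holds a 0/1/2 label per step) could look like this:

import numpy as np

def make_windows(prices, decisions, window=30):
    # Build (window, 1) LSTM inputs and 3-bit one-hot targets from a price series.
    X, y = [], []
    for t in range(window, len(prices)):
        X.append(prices[t - window:t].reshape(window, 1))
        one_hot = np.zeros(3)
        one_hot[decisions[t]] = 1.0
        y.append(one_hot)
    return np.array(X), np.array(y)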

Imbalanced Dataset for Multi Label Classification

So I trained a deep neural network on a multi-label dataset I created (about 20000 samples). I switched softmax for sigmoid and try to minimize (using the Adam optimizer):
tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=y_pred))
And I end up with this kind of prediction (pretty "constant"):
Prediction for Im1 : [ 0.59275776 0.08751075 0.37567005 0.1636796 0.42361438 0.08701646 0.38991812 0.54468459 0.34593087 0.82790571]
Prediction for Im2 : [ 0.52609032 0.07885984 0.45780018 0.04995904 0.32828355 0.07349177 0.35400775 0.36479294 0.30002621 0.84438241]
Prediction for Im3 : [ 0.58714485 0.03258472 0.3349618 0.03199361 0.54665488 0.02271551 0.43719986 0.54638696 0.20344526 0.88144571]
At first, I thought I just needed to find a threshold value for each class.
But I noticed that, for instance, among my 20000 samples the 1st class appears about 10800 times, i.e. a ratio of 0.54, and that is the value my predictions hover around every time. So I think I need to find a way to tackle this "imbalanced dataset" issue.
I thought about reducing my dataset (undersampling) to have about the same number of occurrences for each class, but only 26 samples correspond to one of my classes... That would make me lose a lot of samples...
I read about oversampling, and about penalizing the rarer classes even more, but did not really understand how that works.
Can someone share some explanations about these methods please?
In practice, in TensorFlow, are there functions that help with this?
Any other suggestions ?
Thank you :)
PS: Neural Network for Imbalanced Multi-Class Multi-Label Classification. This post raises the same problem but got no answer!
Well, having 10000 samples in one class and just 26 in a rare class will indeed be a problem.
However, what you experience, to me, seems more like "the outputs don't even see the inputs", and thus the net just learns your output distribution.
To debug this I would create a reduced set (just for debugging purposes) with, say, 26 samples per class and then try to heavily overfit it. If you get correct predictions, my thought is wrong. But if the net cannot even fit those few undersampled samples, then it is indeed an architecture/implementation problem and not due to the skewed distribution (which you will then still need to fix, but it will not be as bad as your current results).
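A quick way to build such a debugging subset (assuming X is a NumPy array of inputs and Y a multi-hot label matrix; the 26-per-class figure comes from the question) might be:

import numpy as np

def balanced_debug_subset(X, Y, per_class=26):
    # Take up to `per_class` examples for each label column, then deduplicate.
    idx = []
    for c in range(Y.shape[1]):
        pos = np.where(Y[:, c] == 1)[0]
        idx.extend(pos[:per_class])
    idx = np.unique(idx)
    return X[idx], Y[idx]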
Your problem is not the class imbalance, but rather just the lack of data. 26 samples is a very small dataset for practically any real machine learning task. Class imbalance itself can be handled easily by ensuring that each minibatch has at least one sample from every class (this leads to situations where some samples are used much more frequently than others, but who cares).
However, with only 26 samples present, this approach (and any other) will quickly lead to overfitting. The problem could be partly solved with some form of data augmentation, but there are still too few samples to construct something reasonable.
So, my suggestion is to collect more data.
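A sketch of the minibatch trick mentioned above (my own illustration, assuming Y is a multi-hot (n_samples, n_classes) matrix) could be:

import numpy as np

def balanced_batch(X, Y, batch_size):
    # Guarantee at least one sample of every class, then fill up at random;
    # rare samples get reused far more often than common ones.
    n_classes = Y.shape[1]
    idx = [int(np.random.choice(np.where(Y[:, c] == 1)[0])) for c in range(n_classes)]
    if batch_size > n_classes:
        idx.extend(np.random.randint(0, len(X), size=batch_size - n_classes).tolist())
    idx = np.array(idx)
    return X[idx], Y[idx]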

Dealing with imbalanced data by using weight

I have very imbalanced data and the goal is classification. At first I want to try undersampling the majority class. Class 1 has 600 samples, class 2 has 90, class 3 has 60 and class 4 has 96!
Using weights, with 2-fold cross validation and a RandomForest model:
Why isn't the result better when I use weights?
This is my code: cfr = RandomForestClassifier(n_estimators=100, n_jobs=5, class_weight={1:1, 2:30, 3:30, 4:30})
Is there anything wrong with my code? Could you please guide me?
The actual question is: what is your task? Is your task to maximize the accuracy of the model, even though you have a huge disproportion between classes? If so, you should not undersample the test set. In fact you never under- or oversample the test set; you might, however, in some cases add weights to particular classes to correct for true priors (which might be different from the empirical ones) or because of cost-sensitive learning.
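For reference, a hedged sketch of class weighting plus a class-sensitive scoring metric; the synthetic data below just mimics the 600/90/60/96 split from the question, and class_weight="balanced" is one alternative to a hand-tuned dict:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in data with the class sizes described in the question.
rng = np.random.RandomState(0)
X = rng.randn(846, 10)
y = np.repeat([1, 2, 3, 4], [600, 90, 60, 96])

cfr = RandomForestClassifier(n_estimators=100, n_jobs=5,
                             class_weight="balanced")   # or an explicit dict
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)

# Plain accuracy hides minority-class mistakes; score something like macro F1.
scores = cross_val_score(cfr, X, y, cv=cv, scoring="f1_macro")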