Variational autoencoder with limited data - TensorFlow

I'm working on a binary classification project, and I'm using a VAE (variational autoencoder) to handle the imbalance between the two classes by generating new samples for the minority class.
The first class (majority class) contains 20000 samples, and the second one (minority class) contains 500 samples.
After training the VAE on the minority class, I generated new samples for this class and added them to the training set. I then trained two classification models: one on the imbalanced data (training set only) and one on the training set plus the data generated by the VAE. The problem is that the first model gives better results than the second (F1 score, ROC AUC, ...), and I suspect this is because of the limited amount of data the VAE was trained on.
Any help, please.

Although 500 training images are not enough for a VAE to generate diverse images, you can still try producing some. It's better to take the mean of the latents of 10 different images (or even more) and pass it through the decoder (if you're already doing this, ignore this suggestion; if you're using some other method, try this one).
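The latent-averaging idea can be sketched in plain Python as follows. This is only an illustration under assumptions: the `encoded` list stands in for the output of a trained encoder, the 4-dimensional latent size and the helper names `mean_latent` and `sample_mean_latents` are hypothetical, and the decoder call is left out.

```python
import random

def mean_latent(latents):
    """Element-wise mean of a list of equal-length latent vectors."""
    n = len(latents)
    return [sum(values) / n for values in zip(*latents)]

def sample_mean_latents(latents, group_size=10, n_new=5, seed=0):
    """Build n_new latents, each the mean of group_size randomly chosen encodings."""
    rng = random.Random(seed)
    return [mean_latent(rng.sample(latents, group_size)) for _ in range(n_new)]

# Stand-in for encoder output: 500 minority samples in a 4-dimensional latent space.
rng = random.Random(42)
encoded = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(500)]

new_latents = sample_mean_latents(encoded, group_size=10, n_new=5)
# Each vector in new_latents would then be passed through the trained decoder.
```

Note that averaging more latents pulls each new vector closer to the overall mean of the encodings, so a larger group size is likely to trade diversity for smoother samples.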
If that still doesn't work, I suggest building a conditional VAE on your entire dataset. In a conditional VAE, you train the VAE using the labels, so that your model learns not only to reconstruct an image but also which class of image it is reconstructing. This lets you generate an image of any particular class.
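One common way to condition a VAE on labels is to append a one-hot label to both the encoder input and the latent vector fed to the decoder. The sketch below shows only that input plumbing, not a trained model; the helper names `one_hot` and `condition` and the toy vectors are assumptions made for illustration.

```python
def one_hot(label, num_classes):
    """One-hot encoding of an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def condition(vector, label, num_classes=2):
    """Append the one-hot label to a feature vector or a latent vector."""
    return list(vector) + one_hot(label, num_classes)

x = [0.2, 0.7, 0.1]                      # flattened input sample (toy values)
encoder_input = condition(x, label=1)    # the encoder sees the sample plus its class

z = [0.0, 0.0]                           # latent drawn from the prior at generation time
decoder_input = condition(z, label=1)    # the decoder is asked for a minority-class sample
```

At generation time, only the second step is needed: draw a latent vector, attach the label of the class you want, and decode.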

Related

Feed image data without class label

I am trying to implement image super-resolution using SRGAN. For this, I used the DIV2K dataset (http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip) as my source.
I have worked on image classification with CNNs before (I used keras.layers.convolutional.Conv2D), but in this case there are no class labels in my data source.
I unzipped the file and kept it in D:\Unzipped\DIV2K_train_HR. Then I used the following command to read the files.
img_dataset = tensorflow.keras.utils.image_dataset_from_directory("D:\\unzipped")
Then I created the model as follows:
model = Sequential()
model.add(Conv2D(filters=64,kernel_size=(3,3),activation="relu",input_shape=(256,256,3)))
model.add(AveragePooling2D(pool_size=(2,2)))
model.add(Conv2D(filters=64,kernel_size=(3,3),activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.compile(optimizer='sgd', loss='mse')
model.fit(img_dataset,batch_size=32, epochs=10)
But I am getting the error "Graph execution error", and I am unable to find its root cause. Does this error appear because the class labels are missing (I think that, with this code, DIV2K_train_HR is treated as a single class label)? Or does it happen because the images don't all have the same size?
Note: This code does not match the SRGAN architecture. I am new to GANs and am trying to move ahead step by step. I got stuck at the very first step.
Yes, the error message is because you don't have labels in your dataset.
As the first step in building a GAN, you need to create a discriminator model: given an image, it should recognize whether the image is real or fake. You can take images from your dataset and label them as 1 ("real images"). Then generate "fake images" by down-sampling and then up-sampling images from your dataset, and label them as 0. Train your discriminator so that it can distinguish between original and processed images.
After that, create the generator model. The generator takes a down-sampled version of an image as input and produces an up-sampled version at the original resolution. The GAN model combines the generator and discriminator by passing the generator's output to the discriminator. The target label is 1, i.e. we want the generator to create up-sampled images that the discriminator can't distinguish from the real ones. Now train the GAN (set 'trainable' to false for the discriminator's weights).
Once your generator produces images that the discriminator can't distinguish from the real ones, take those images, label them as 0, and train the discriminator again. Then train the generator again, and so on.
The process continues until the discriminator can no longer distinguish fake images from real ones (i.e. its accuracy doesn't exceed 0.5).
Please see a simple example (under "Generative Adversarial Networks"):
https://github.com/ageron/handson-ml3/blob/main/17_autoencoders_gans_and_diffusion_models.ipynb
This code is explained in chapter 17 of the book "Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow (3rd edition)" by Aurélien Géron.
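The first step described above, building labeled real/fake pairs for the discriminator, could look roughly like this. It is a plain-Python sketch on a tiny grayscale image stored as a list of rows; the average-pooling and nearest-neighbour choices and the function names are assumptions, and a real pipeline would use tensor operations on batches of RGB images instead.

```python
def downsample(img, factor=2):
    """Average-pool a 2-D grayscale image (list of rows) by `factor`."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[r + dr][c + dc] for dr in range(factor) for dc in range(factor))
            / factor ** 2
            for c in range(0, w - factor + 1, factor)
        ]
        for r in range(0, h - factor + 1, factor)
    ]

def upsample(img, factor=2):
    """Nearest-neighbour up-sampling: repeat every pixel and every row `factor` times."""
    return [
        [value for value in row for _ in range(factor)]
        for row in img
        for _ in range(factor)
    ]

def discriminator_batch(real_images, factor=2):
    """Return images and labels: originals get 1, down/up-sampled copies get 0."""
    fakes = [upsample(downsample(img, factor), factor) for img in real_images]
    images = real_images + fakes
    labels = [1] * len(real_images) + [0] * len(fakes)
    return images, labels

real = [[[0, 2, 4, 6], [2, 4, 6, 8], [1, 1, 3, 3], [1, 1, 3, 3]]]
images, labels = discriminator_batch(real)
```

The processed copies keep the original resolution but lose detail, which is the difference the discriminator is initially trained to detect.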

Multiple BERT binary classifications on a single graph to save on inference time

I have five classes and I want to compare four of them against one and the same class. This isn't a One vs Rest classifier, as for each output I want to score them against one base class.
The four outputs should be: base class vs classA, base class vs classB, etc.
I could do this by having multiple binary classification tasks, but that's wasting computation time if the first layers are BERT preprocessing + pretrained BERT layers, and the only differences between the four classifiers are the last few layers of BERT (finetuned ones) and the Dense layer.
So why not merge the graphs for more performance?
My inputs are four different datasets, each annotated with true/false for each class.
As I understand it, I can re-use most of the pipeline (BERT preprocessing and the first layers of BERT), as those have shared weights. I should then be able to train the last few layers of BERT and the Dense layer on top differently depending on the branch of the classifier (maybe using something like keras.switch?).
I have tried many alternative options including multi-class and multi-label classifiers, with actual and generated (eg, machine-annotated) labels in the case of multiple input labels, different activation and loss functions, but none of the results were acceptable to me (none were as good as the four separate models).
Is there a solution for merging the four different models for more performance, or am I stuck with using 4x binary classifiers?
When you train a DNN for a specific task, it will (in the vast majority of cases) perform better than a more general model that handles several tasks simultaneously. That said, in my experience a properly trained general model produces results very similar to those of the original binary ones. Anyway, here are a couple of suggestions for training strategies (assuming your training datasets for each task are completely different):
Weak supervision approach
Train your binary classifiers, and label your datasets using them (i.e. use the binary classifier trained on dataset 2 to label datasets [1, 3, 4]). Then train your joint model as a multilabel task using all the newly labeled datasets (don't forget to shuffle the samples before feeding them to the trainer ;) ). You will need to experiment with whether to threshold the scores into 0/1 labels or to use the binary classifiers' scores directly.
Custom loss function: create a loss that does not penalize the model when no label is provided for a certain class. So when you introduce a sample from (say) dataset 2, the loss will be calculated only for the 2nd class.
Of course, you can apply both simultaneously. For example, if you know that a binary classifier produces polarized scores (most results are near 0 or 1), you can use weak labels and automatically label your data with the scores. Then, during the second stage, weight the loss by x' = 4(x-0.5)^2, where x is the binary classifier's score (note that you get logits from the model, so you will need to apply the sigmoid function first). This increases the contribution of the samples the binary classifier is confident about and reduces that of the less certain ones.
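Both ideas, skipping missing labels and down-weighting uncertain weak labels, can be sketched in plain Python as below. The representation of a missing label as `None`, the function names, and averaging over the labels that are present are all assumptions; in a real training loop the same logic would be written with tensor operations inside a custom loss.

```python
import math

def sigmoid(logit):
    return 1.0 / (1.0 + math.exp(-logit))

def confidence_weight(score):
    """4 * (score - 0.5)^2: 0 for a score of 0.5, 1 for a score of 0 or 1."""
    return 4.0 * (score - 0.5) ** 2

def masked_bce(logits, targets, weights=None, eps=1e-7):
    """Binary cross-entropy averaged over the labels that are present (not None)."""
    if weights is None:
        weights = [1.0] * len(logits)
    total, present = 0.0, 0
    for logit, target, weight in zip(logits, targets, weights):
        if target is None:          # no information for this class: no penalty
            continue
        p = min(max(sigmoid(logit), eps), 1.0 - eps)
        total -= weight * (target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
        present += 1
    return total / present if present else 0.0

# A sample from dataset 2: only the second of the four labels is known.
joint_logits = [3.0, 0.0, -2.0, 5.0]
loss_masked = masked_bce(joint_logits, [None, 1.0, None, None])

# Weak labels: binary-classifier logits -> scores -> confidence weights.
teacher_scores = [sigmoid(v) for v in [4.0, 0.0, -4.0, 0.2]]
weights = [confidence_weight(s) for s in teacher_scores]
loss_weighted = masked_bce(joint_logits, teacher_scores, weights)
```

With this weighting, a weak label with a score of exactly 0.5 contributes nothing to the loss, while labels with scores close to 0 or 1 contribute almost fully.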
As for unfreezing the last layers of BERT, unfreezing the upper 3-6 layers is usually enough. Unfreezing more layers improves results very little while increasing time and memory requirements.

Training Tensorflow only one object

Following the TensorFlow documentation, I trained a model on 3 objects and got a result (it can recognize these objects). When I show it other objects (not those 3), it doesn't work correctly.
I want to train on only one object (for example, a cup) and recognize only this object. Is that possible with TensorFlow?
Your question doesn't provide enough detail, but my guess is that you trained the network with a softmax activation and categorical or sparse categorical cross-entropy loss. If my guess is right, such a network always predicts one of the three classes regardless of the actual data, i.e. there is no "none of these" option.
To train a network to recognize only one class of object, give it a single output with one channel and a sigmoid activation. Use the BinaryCrossentropy loss to train your model for the specific object. Provide a dataset that includes examples both with and without the object.
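To illustrate why a single sigmoid output can say "not this object", here is a small plain-Python sketch of the output activation, the loss, and the decision rule. The 0.5 threshold, the example logits, and the function names are assumptions; the network that would produce the logit is not shown.

```python
import math

def sigmoid(logit):
    return 1.0 / (1.0 + math.exp(-logit))

def binary_cross_entropy(prob, target, eps=1e-7):
    """Loss for one example: target is 1 if the object is present, else 0."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))

def contains_object(logit, threshold=0.5):
    """Decision rule on the single output: True means the object is detected."""
    return sigmoid(logit) >= threshold

cup_logit, other_logit = 2.0, -2.0            # hypothetical raw network outputs
cup_detected = contains_object(cup_logit)
other_detected = contains_object(other_logit)
```

Unlike a softmax over three classes, the single probability is free to be low for every unfamiliar object, provided the training set contains enough images without the object.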

Limiting probability percentage of irrelevant image in CNN

I am training a CNN model with five classes using the Keras library. Using the model.predict function, I get the prediction percentages for the classes. My problem is that for an image which doesn't belong to any of these classes and is completely irrelevant, the model still assigns percentages across the five classes.
How do I prevent this? How do I identify the image as irrelevant?
I assume you are using a softmax activation on your last layer to generate the probabilities for each class. By definition, the outputs of the softmax activation sum to 1. Therefore, with your current setup, it is impossible for the neural net to say that the image does not belong to any of your classes.
There are two potential ways you could address this:
Add another class that represents "other" or "unknown" objects (so you have 6 classes).
Add another output to your neural net (or train a completely independent neural net) that does binary classification on whether or not the image is in one of the 5 classes. That way, if your secondary output says that the image is not in the 5 classes, you can ignore the softmax output.
In both cases, you will need to augment your dataset with images that do not fall in your 5 classes.
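The second option can be sketched as follows, assuming the network exposes five class logits plus one extra logit for the "is it one of the five classes" output. The function names, the 0.5 threshold, and the example logits are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax; the outputs always sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_gate(class_logits, in_domain_logit, threshold=0.5):
    """Return the predicted class index, or None if the gate rejects the image."""
    in_domain_prob = 1.0 / (1.0 + math.exp(-in_domain_logit))
    if in_domain_prob < threshold:
        return None                     # ignore the softmax output entirely
    probs = softmax(class_logits)
    return max(range(len(probs)), key=probs.__getitem__)

class_logits = [0.1, -3.0, 2.0, 0.0, 1.0]
probs = softmax(class_logits)           # sums to 1 even for an irrelevant image
rejected = classify_with_gate(class_logits, in_domain_logit=-4.0)
accepted = classify_with_gate(class_logits, in_domain_logit=4.0)
```

The softmax probabilities are computed the same way in both calls; only the secondary output decides whether they are used.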

GAN with not a random input

I've been very interested in GANs lately.
I coded one for MNIST with the following structure:
Generator model
Discriminator model
Gen + Dis model
The generator model generates batches of images from a random distribution.
The discriminator is trained on these and on real images.
Then the discriminator is frozen in the Gen + Dis model and the generator is trained (with the frozen discriminator judging whether the generator's output is good or not).
Now, imagine I don't want to feed my generator a random distribution but images (for upscaling, for example, or for generating a real image from a drawing).
Do I need to change anything?
(Apart from the conv model, which will be more complex.)
Should I continue to use binary_crossentropy as the loss function?
Thank you very much!
You can indeed put a variational autoencoder (VAE) in front in order to generate the initial distribution z (see paper).
If you are interested in the topic, I can recommend this course at Kadenze.