Number of images in each class - TensorFlow

I am currently training a model for image detection, and I want to know how many images I need per class. Do I need the same number of images for each object?
Any advice would be appreciated.
I use TensorFlow and the YOLOv2 model.
Thanks,

You need as many as you can get, but definitely on the order of tens of thousands at least if you're training the network from scratch (there are pre-trained weights for YOLOv2 trained on the Pascal VOC dataset, http://host.robots.ox.ac.uk/pascal/VOC/).
It's best to have balanced classes, meaning the number of images for each class should be roughly equal; training is easier this way.
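To check how balanced a dataset is before training, a quick count per class helps. The following is a minimal sketch, assuming you have one label per annotated image; the label names and counts are made up for illustration.

```python
from collections import Counter

def class_balance(labels):
    """Return per-class counts and the largest/smallest class ratio."""
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return counts, ratio

# Hypothetical labels, one entry per annotated image.
labels = ["car"] * 1200 + ["person"] * 1100 + ["dog"] * 300

counts, ratio = class_balance(labels)
print(counts)
print(f"imbalance ratio: {ratio:.1f}")  # 1200 / 300 = 4.0
```

A ratio close to 1 means the classes are roughly balanced; a large ratio suggests collecting more images for the rare classes or compensating during training.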
Why are you training the network yourself? Can't you use some pre-trained models, drop the FC layers and insert your own classes? This way it's much faster, and you don't need that many images.

Related

Image Detector with tensorflow

I want to build a simple image detector for custom binary shapes in images.
I could train and use the models from the object detection zoo, such as ssd_inception_v2, but that would be extremely inefficient, as they are hundreds of megabytes in size,
and I can't imagine using that in my simple app. Can anybody suggest how to solve this?
I have already built excellent small classifiers for my images, but I can't build a small, efficient detector (one that gives their positions as detection boxes).
I think what you need is transfer learning. I would take one of the lightweight models such as MobileNetV2 and retrain it on my dataset. It should be pretty quick. If you want to decrease your model size even further, feel free to take only the first few layers of the CNN and retrain them. It would be a bit more work, since you need to rewrite the part of the network you want to use and load it with the pre-trained weights.

What is the purpose of a pre-trained network in Faster R-CNN?

I am not able to understand the purpose of a pre-trained network. From what I read, it is used for the RPN and the classification network, but I don't understand how.
CNNs take a notoriously long time to train, especially for more complex models with higher resolutions. In order to avoid the days of training on a high-end GPU, pre-trained models have been made available. You then just have to train on your specific data (assuming your data is similar to the pre-trained data). For instance, if you want to train a CNN to recognize cats in high resolution images, you might want to start with a pre-trained model that recognizes dogs. The training should take a lot, lot less time due to the fact that a lot of the same underlying patterns have already been learned and all your training needs to do is differentiate cats from dogs.

Is it possible to train a CNN on a dataset and test it on another dataset with different classes?

I am new to deep learning, and I am doing research using CNNs. I need to train a CNN model on a dataset of images (landmark images) and test the same model using a different dataset (landmark images too). One of the motivations is to see the ability of the model to generalize. But the problem is: since the datasets used for training and testing are not the same, the classes are not the same! Possibly the number of classes differs too, which means that the predictions made on the test dataset are not trustworthy (since the weights of the output layer have been calculated based on the different classes belonging to the training dataset). Is there any way to evaluate a model on a different dataset without affecting test accuracy?
The performance of a neural network on one dataset will not generally be the same as its performance on another. Images in one dataset can be more difficult to distinguish than those in another. As a rule of thumb: if your landmark datasets are similar, it's likely that performance will be similar. However, this is not always the case: subtle differences between the datasets can result in significantly different performance.
You can account for the potentially different performance on the two datasets by training another network on the other dataset. This will give you a baseline of what to expect when you try to generalize your network to it.
You can apply your neural network trained for one set of classes to another set of classes. There are two main approaches to this:
Transfer learning. This is where the last layer of your trained network is replaced with a new layer(s) that is trained, by itself, to classify the new images. (Use for many classes. Can use for few classes.)
All-Transfer learning. Rather than replacing the last layer, add a new layer after it and only train the final layers. (Use for few classes.)
Both approaches are much quicker than training a neural network from scratch.
I assume that you are facing a classification problem.
What exactly do you mean? Do you have classes A, B, and C in your training dataset and the same classes in your test dataset with different labeling, or do you have completely different classes in your test dataset compared to your training dataset?
You can solve the first problem by creating a mapping from training labels to test labels or vice versa.
The second one depends on what you are trying to achieve... If you want the model to predict classes it was never trained on, you won't get any meaningful outcome.
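For the first case (same classes, different label names), the mapping can be sketched as below. The landmark names, predictions, and ground truth are hypothetical and only illustrate the idea.

```python
# Hypothetical mapping: the same classes under different names in each dataset.
train_to_test = {
    "eiffel_tower": "EiffelTower",
    "big_ben": "BigBen",
    "colosseum": "Colosseum",
}

def remap_predictions(predicted_train_labels, mapping):
    """Translate predictions into the test dataset's label names.

    Labels without a counterpart become None, so they never count as correct.
    """
    return [mapping.get(label) for label in predicted_train_labels]

def accuracy(predictions, targets):
    """Fraction of predictions that match the targets."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)

# Model output (training label names) and ground truth (test label names).
preds = ["eiffel_tower", "big_ben", "big_ben", "colosseum"]
truth = ["EiffelTower", "BigBen", "Colosseum", "Colosseum"]

mapped = remap_predictions(preds, train_to_test)
print(accuracy(mapped, truth))  # 3 of 4 correct -> 0.75
```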

Is there a way to recognise an object in an image?

I am looking for a pre-trained deep learning model which can recognise an object in an image. Usually the images are of the type used on shopping websites for products. I want to recognise what the product in the image is. I have come across some pre-trained models like VGG and Inception, but they seem to be trained on only a limited set of general objects, around 1000 classes. I am looking for something trained on 10000 or more.
I think the best way to do this is to build your own training set with the labels that you need to predict, then take an existing pre-trained model like VGG, remove the last fully connected layers and train the model with your data, a process called transfer learning. Some more info here.

Image Classification: Heavily unbalanced data over thousands of classes

I have a dataset consisting of around 5000 categories of images, but the number of images per category varies from 20 to 2000, which is quite unbalanced. Also, the number of images is far from enough to train a model from scratch. I decided to fine-tune pre-trained models, like the Inception models.
But I am not sure about how to deal with unbalanced data. There are several possible approaches:
Oversampling: Oversample the minority categories. But even with aggressive image augmentation, we may not be able to avoid overfitting.
Also, how do you generate balanced batches from an unbalanced dataset over so many categories? Do you have any ideas about such a pipeline mechanism with TensorFlow?
SMOTE: I think it is not so effective for high-dimensional signals like images.
Put weights on the cross-entropy loss in every batch. This might be useful for a single batch, but cannot deal with the overall imbalance.
Any ideas about this? Any feedback will be appreciated.
Use tf.losses.softmax_cross_entropy and set weights for each class inversely proportional to their training frequency to "balance" the optimization.
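Here is a minimal sketch of how such weights could be computed, in plain Python. The normalisation N / (K * count) is one common choice and an assumption here, not something the loss function requires; the label counts are made up. Note that the `weights` argument of tf.losses.softmax_cross_entropy is applied per example, so each example's weight is looked up from its class.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by N / (K * count), so rare classes weigh more.

    N is the number of samples and K the number of classes; with this
    normalisation the average weight over all samples is 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

# Hypothetical training labels for three classes of very different size.
labels = [0] * 20 + [1] * 200 + [2] * 2000
class_weights = inverse_frequency_weights(labels)

# Per-example weights for one batch, looked up from each example's class.
batch_labels = [2, 0, 1, 2]
sample_weights = [class_weights[y] for y in batch_labels]
print(class_weights)
print(sample_weights)
```

The resulting `sample_weights` list has one entry per example in the batch, which is the shape the `weights` argument expects.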
Start with the pre-trained ImageNet layers and add your own final layers (with appropriate convolution, dropout and flatten layers as required). Freeze all but the last few of the ImageNet layers, then train on your dataset.
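The freeze-and-retrain recipe could look roughly like this in Keras. This is only a sketch under assumptions: MobileNetV2 is used as the base because it is small (an Inception model would be wired the same way), and the function name, the number of unfrozen layers, the dropout rate and the pooling head are illustrative choices, not values from this answer.

```python
import tensorflow as tf

def build_finetune_model(num_classes, input_shape=(224, 224, 3),
                         weights="imagenet", unfrozen_layers=20):
    """Pre-trained base without its classifier, plus a new softmax head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights)

    # Freeze every base layer except the last `unfrozen_layers` ones.
    n_frozen = max(len(base.layers) - unfrozen_layers, 0)
    for layer in base.layers[:n_frozen]:
        layer.trainable = False

    # New final layers for the target classes (assumed head design).
    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

# Example use (downloads the ImageNet weights on first call):
#   model = build_finetune_model(num_classes=5000)
#   model.compile(optimizer="adam", loss="categorical_crossentropy")
```

Passing `weights=None` builds the same architecture with random initialisation, which is handy for checking shapes without downloading anything.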
For unbalanced data (and in general small datasets), use data augmentation to create more training images. Keras has this functionality built-in: Building powerful image classification models using very little data
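On the question of generating balanced batches from an unbalanced dataset, one possible approach is to pick a class uniformly at random for each slot in the batch and then pick an image from that class. Below is a plain-Python sketch of that idea; the function name and the toy file names are hypothetical. Recent TensorFlow 2 releases offer tf.data.Dataset.sample_from_datasets, which can express the same sampling over per-class datasets.

```python
import random
from collections import defaultdict

def balanced_batches(samples, batch_size, num_batches, seed=0):
    """Yield batches in which every class is equally likely for each slot.

    `samples` is a list of (item, label) pairs. Rare classes are effectively
    oversampled, so their items repeat more often across batches.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append(item)
    classes = sorted(by_class)

    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            label = rng.choice(classes)         # uniform over classes
            item = rng.choice(by_class[label])  # uniform within the class
            batch.append((item, label))
        yield batch

# Hypothetical unbalanced dataset: 2 images of class "a", 200 of class "b".
samples = [(f"a_{i}.jpg", "a") for i in range(2)]
samples += [(f"b_{i}.jpg", "b") for i in range(200)]

batches = list(balanced_batches(samples, batch_size=8, num_batches=5))
print(batches[0])
```

Because rare-class images repeat often under this scheme, it is best combined with the data augmentation mentioned above.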