Weakly labelled dataset for object detection - tensorflow

I have a few datasets that are well labeled for each class they represent. I'm trying to build an object detection model using the tensorflow/research/object_detection pipeline to detect each object.
However, each dataset is not labelled for the other classes. I'm concerned that mined examples will be labelled as the background class when they really represent a class in the other datasets.
For example, if I'm trying to make a fruit detector and I have a dataset labelled with apples, another labelled with oranges and another labelled with bananas - how would I go about weighting the classification loss so it ignores the apple predictions on the orange examples and vice versa?

If I understand your post correctly, each image has objects of multiple classes in it. If that's the case, then just create bounding boxes for each of the classes in that image. An image can have multiple bounding boxes, each labelled with a class name, that can be used for training.
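If relabelling every dataset for every class is not practical, the loss weighting asked about in the question could be sketched roughly as below. This is a minimal NumPy illustration, not the Object Detection API's own mechanism: it assumes a per-class sigmoid classification loss (rather than a softmax with a background class), and the names `masked_sigmoid_loss` and `class_mask` are hypothetical. The mask zeroes the loss for classes that the source dataset does not annotate, so an apple prediction on an orange-only image is neither rewarded nor punished.

```python
import numpy as np

CLASSES = ["apple", "orange", "banana"]  # hypothetical class order


def masked_sigmoid_loss(logits, targets, class_mask):
    """Sigmoid cross-entropy that skips classes a dataset does not annotate.

    logits, targets: arrays of shape (num_anchors, num_classes).
    class_mask: shape (num_classes,), 1.0 where the source dataset labels
                that class and 0.0 where it does not.
    """
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    class_mask = np.asarray(class_mask, dtype=np.float64)

    # Numerically stable form: max(x, 0) - x * z + log(1 + exp(-|x|))
    per_class = (
        np.maximum(logits, 0.0)
        - logits * targets
        + np.log1p(np.exp(-np.abs(logits)))
    )

    # Broadcasting applies the same class mask to every anchor.
    weighted = per_class * class_mask

    # Average over the entries that actually contribute to the loss.
    contributing = max(class_mask.sum() * logits.shape[0], 1.0)
    return weighted.sum() / contributing


# An image from the orange-only dataset: only the orange column counts.
orange_mask = np.array([0.0, 1.0, 0.0])
logits = np.array([[5.0, 2.0, -3.0]])   # confident "apple" prediction
targets = np.array([[0.0, 1.0, 0.0]])   # only the orange label is known
loss = masked_sigmoid_loss(logits, targets, orange_mask)
```

With this mask the loss depends only on the orange logit, so whatever the model predicts for apples or bananas on that image leaves the gradient untouched.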

Related

Is it possible to combine two different custom YOLOv4 models

I'm working on an object detection project where I have to identify the type of animals and their posture given an image/video. For this purpose, I have two custom YOLOv4 models which are trained separately. Model 1 identifies the type of animal and Model 2 identifies the posture of the animal. I have converted these models to TensorFlow models.
Now, since both the models use the same image/video as input, I want to combine the outputs of both the models and the final output should display the bounding box of both the models.
I'm stuck at this point. I have been researching solutions and I'm confused by the various methods. Could anyone help me with this?
I don't think you need an object detection model as the pose identifier, because you've already localized the animal with the first network.
The easiest (and clearly not the most accurate) solution I see is to run a classifier on top of the detections, using each cropped bounding box as input. In that case the animal's anatomy is not taken into account explicitly, but I guess the approach is still a good baseline.
For further experiments you can take a look at these and these solutions for animal pose estimation, but they are more complex to use.
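The crop-and-classify baseline could look roughly like the sketch below. It assumes the detector returns boxes as `(ymin, xmin, ymax, xmax)` in pixel coordinates and that `posture_classifier` is some callable taking an image crop and returning a label; both the box layout and the function names are assumptions made for illustration.

```python
import numpy as np


def crop_detections(image, boxes):
    """Return one crop per valid box; boxes are (ymin, xmin, ymax, xmax) in pixels."""
    height, width = image.shape[:2]
    crops = []
    for ymin, xmin, ymax, xmax in boxes:
        # Clamp the box to the image so slicing never goes out of range.
        y0, x0 = max(int(ymin), 0), max(int(xmin), 0)
        y1, x1 = min(int(ymax), height), min(int(xmax), width)
        if y1 > y0 and x1 > x0:  # skip empty boxes
            crops.append(image[y0:y1, x0:x1])
    return crops


def classify_postures(image, boxes, posture_classifier):
    """Run a (hypothetical) posture classifier on every detected animal crop."""
    return [posture_classifier(crop) for crop in crop_detections(image, boxes)]


# Usage with a stand-in classifier; a real one would be a trained model.
frame = np.zeros((100, 200, 3), dtype=np.uint8)
animal_boxes = [(10, 20, 60, 50)]
postures = classify_postures(
    frame, animal_boxes, lambda crop: "tall" if crop.shape[0] > crop.shape[1] else "wide"
)
```

In practice each crop would also need resizing to the classifier's input size before inference.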

Object Detection without labels / annotation

Let's say I have 3 images (an apple, an orange, a banana) and another 1000 arbitrary images. What I want to do is see whether those 1000 arbitrary images contain object(s) similar to the former 3 images and, if so, draw a bounding box around those objects. However, none of these 1003 images or objects are labelled or annotated.
I have done some research on the internet and tried to find a deep learning object detection approach (e.g. Faster R-CNN, YOLOv3), but I couldn't work out how they relate to my task.
I have also noticed that there is a technique called template matching, but it does not seem to be closely related to deep learning.
So my question is:
Is there any good approach or deep learning model that could meet my needs?
Will I benefit from any pre-trained Faster R-CNN or YOLOv3 models? (e.g. if they were trained on image sets of cars, people, dogs and cats, will those learned features also apply to a new domain?)
"I want to do is to see if those 1000 arbitrary images contain object(s) similar to the former 3 images"
What did you mean by "similar?"
If you meant "I want to see if the 1000 images contain objects from the target classes: orange, apple, and banana", then here's the answer:
If your models were pre-trained with your target classes (orange, apple, and banana), then you can use those pre-trained models to detect the objects in your 1003 images. You can just select orange, apple, and banana as the class names in the configuration.
If your pre-trained models weren't trained on your target classes and you only have your 1003 images, you will need to do what is called fine-tuning, which typically means retraining the last layer(s) of the model on your own annotated data. 1003 images might not be enough for training the model, and you might need to perform data augmentation to expand your data. Also, consider making your classes balanced (meaning having roughly the same number of objects per class).
For something close to a "similarity score," you can consider the confidence score for class x, which is the model's estimate of how likely the bounding box is to contain an object of class x. However, this confidence score mainly depends on how well trained the model is on class x. For example, different models may give different confidence scores for the same images. Also, the same model may give different confidence scores for the same object under different angles, lighting, and orientations. Thus, it might be a better idea to fine-tune the models anyway so that they are more robust to varied appearances of your target classes.
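Using the confidence score as a rough stand-in for similarity might look like the sketch below. The detection format (a list of dicts with `class_name`, `score` and `box` keys) is an assumption chosen for illustration; real detectors return their results in their own structures, so the output would first have to be converted.

```python
def filter_detections(detections, target_classes, min_score=0.5):
    """Keep detections of the target classes above a confidence threshold,
    sorted from most to least confident."""
    kept = [
        d for d in detections
        if d["class_name"] in target_classes and d["score"] >= min_score
    ]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


# Hypothetical detector output for one image.
detections = [
    {"class_name": "apple", "score": 0.91, "box": (10, 10, 50, 50)},
    {"class_name": "dog", "score": 0.97, "box": (0, 0, 80, 80)},
    {"class_name": "banana", "score": 0.32, "box": (5, 60, 40, 95)},
    {"class_name": "orange", "score": 0.66, "box": (30, 30, 70, 70)},
]
fruit = filter_detections(detections, {"apple", "orange", "banana"}, min_score=0.5)
```

The threshold is a tuning knob: because scores are not comparable across models, a value that works for one detector may be too strict or too lenient for another.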

Retrain TF object detection API to detect a specific car model -- How to prepare the training data?

I am new to object detection and am trying to retrain the object detection API in TensorFlow to detect a specific car model in photos. When preparing my own training data, besides things like drawing bounding boxes, should I also prepare negative examples (cars that are not the model I am interested in) to reach good performance?
I have read through some tutorials, and they usually give an example of detecting one type of object, with training data labelled only for that type. Since the model first proposes regions of interest and then tries to classify those regions, I was wondering whether I should also prepare negative examples if I want to detect something very specific in photos.
I am retraining a faster_rcnn-based model. Thanks for the help.
Yes, you will also need negative examples for better performance. It seems you are thinking about using transfer learning to train a pre-trained faster_rcnn model to add a new class for your custom car. You should start with an equal number of positive and negative examples (images with labelled bounding boxes). You will need examples of several negative classes (e.g. negative car type 1, negative car type 2, negative car type 3) in addition to your target car type.
You can look at example training data with one positive class and several negative classes for transfer learning in the data folder of my GitHub repo: PSV Detector Github

Using Tensorflow object detection API to detect objects and classify objects by color

I am able to use Tensorflow to train the model on my own dataset. For example, I have trained a model to only detect the safety helmet and the result is good.
My plan for the next step is to classify the detected safety helmets by colour, but I am still looking for a method.
Should I retrain the model with a different label map, like [item1 red_helmet] [item2 blue_helmet], and label my training dataset accordingly? Or is there another way to achieve the same outcome?
You already have the region of interest in the picture.
All you need to do is extract the helmets from the picture and pass the cropped images to an OpenCV routine that can detect colours.
That's it, you are done :)
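A crude version of the colour step might look like the sketch below. It uses NumPy and the standard-library `colorsys` module rather than an OpenCV routine so that it stays self-contained, and it simply maps the mean colour of the crop to a named hue range. The hue ranges, the thresholds and the function name `dominant_colour` are illustrative assumptions that would need tuning on real helmets. The crop is assumed to be in RGB order; images loaded with OpenCV are BGR, so the channels would have to be reversed first (for example with `crop[..., ::-1]`).

```python
import colorsys

import numpy as np

# Hypothetical hue ranges in degrees; red wraps around 0/360.
HUE_RANGES = [
    ("red", 0, 20),
    ("yellow", 40, 70),
    ("green", 80, 160),
    ("blue", 200, 260),
    ("red", 340, 360),
]


def dominant_colour(crop_rgb, min_saturation=0.25, min_value=0.2):
    """Name the mean colour of an RGB crop (uint8, shape H x W x 3)."""
    mean_r, mean_g, mean_b = crop_rgb.reshape(-1, 3).mean(axis=0) / 255.0
    hue, saturation, value = colorsys.rgb_to_hsv(mean_r, mean_g, mean_b)

    # White, grey and black regions have no reliable hue.
    if saturation < min_saturation or value < min_value:
        return "unknown"

    hue_degrees = hue * 360.0
    for name, low, high in HUE_RANGES:
        if low <= hue_degrees <= high:
            return name
    return "unknown"


# Usage on a synthetic "helmet" crop filled with a single colour.
red_crop = np.full((20, 20, 3), (200, 20, 20), dtype=np.uint8)
helmet_colour = dominant_colour(red_crop)
```

Because a bounding box also contains some background, taking the mean over only the central part of the crop may give a cleaner estimate.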

How to create a class for non classified object in tensorflow?

Hi, I have built a CNN with two classes, dogs and cats. I have trained it and I am now able to classify dog and cat images. But what if I want to introduce a class for new, unclassified objects? For example, if I feed my network a flower image, the network gives me a wrong classification. I want to build my network with a third class for new unclassified objects, but how can I build this third class? Which images should I use for a class of objects that are neither dogs nor cats?
At the end of my network I use softmax, and my code is written in TensorFlow. Could someone give me some suggestions? Thanks
You need to add a third "something else" class to your network. There are several ways you can go about it. In general, if you have a class that you want to detect you should have examples for that class, so you could add images without cats or dogs to your training data labelled with the new class. However, this is a bit tricky, because the new class is, by definition, everything in the universe but dogs and cats, so you cannot possibly expect to have enough data to train for it. In practice, though, if you have enough examples the network will probably learn that the third class is triggered whenever the first two are not.
Another option that I have used in the past is to model the "default" class slightly differently from the regular ones. So, instead of trying to actually learn what a "not cat or dog" image is, you can just explicitly say that it is whatever does not activate the cat or dog neurons. I did this by replacing the softmax in the last layer with sigmoids (so the loss would be sigmoid cross-entropy instead of softmax cross-entropy, and the output would no longer be a categorical probability distribution, but honestly it didn't make much difference performance-wise in my case), and then expressing the "default" class as 1 minus the maximum activation value over all the other classes. So, if no class had an activation of 0.5 or greater (i.e. 50% estimated probability of being that class), the "default" class would be the highest scoring one. You can explore this and other similar schemes.
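The "1 minus the maximum activation" scheme described above can be sketched in a few lines of NumPy. The function name `predict_with_default` and the label `"other"` are made up for illustration, and the logits are assumed to come from a network whose last layer has one sigmoid output per real class.

```python
import numpy as np


def predict_with_default(logits, class_names, default_name="other"):
    """Score each class with a sigmoid and add a default class scored as
    1 minus the highest class activation."""
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=np.float64)))
    scores = dict(zip(class_names, probs.tolist()))
    scores[default_name] = float(1.0 - probs.max())
    # The default class wins only when every real class is below 0.5.
    best = max(scores, key=scores.get)
    return best, scores


# A confident cat, then an image that activates neither neuron strongly.
label_a, _ = predict_with_default([3.0, -2.0], ["cat", "dog"])
label_b, scores_b = predict_with_default([-1.0, -2.0], ["cat", "dog"])
```

Here `label_a` is `"cat"` and `label_b` is `"other"`, since in the second case both sigmoid outputs fall below 0.5.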
You should just add images to your dataset that are neither dogs nor cats, label them as "Other", and treat "Other" as a normal class in all your code. In particular, you'll get a softmax over 3 classes.
The images you're using can be anything (except cats and dogs of course), but should be of the same kind as the ones you'll probably be testing against when using your network. So for instance if you know you'll be testing on images of dogs, cats, and other animals, train with other animals, not with pictures of flowers. If you don't know what you'll be testing with, try to get very varied images, from different sources, etc, so that the network learns well that this class is "anything but cats and dogs" (the wide range of images in the real world that fall in this category should be reflected in your training dataset).