How to solve an object detection problem containing some static objects with no variation? (TensorFlow)

I am working on an object detection problem where I am using a region-based convolutional neural network (Mask R-CNN) to detect and segment objects in an image.
I have 8 objects to be segmented (Android app icons); 6 of them show variation in their background, and the remaining 2 always appear on a white background (static).
I have already collected 200 variations of each object and trained Mask R-CNN on them. My model captures the patterns very well for the 6 objects with variation, but on the other 2 objects it struggles even on the training set, even though they are an exact match.
Q. If I have n objects with variations and m objects with no variation (static), do I need to oversample the static ones? Is there some other technique I should use in this case?
In the image, the icons in black bounding boxes are prone to change (depending on the background and their position w.r.t. the background), but the icon in the green bounding box will never vary (it is always on a white background).
I have tried adding more images containing the static objects, but with no luck. Can anyone suggest how to approach such a problem? I don't want to use a sliding-window (static image processing) approach here.
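One option that is not covered in the thread, but sometimes helps in this situation, is copy-paste augmentation: composite the static icons at varied positions and scales onto real screenshots, so the network sees more than 200 near-duplicate crops. A minimal sketch with OpenCV, where the file names and scale range are placeholder assumptions:

```python
import random
import cv2

# Hypothetical file names; substitute your own icon crop and screenshot pool.
icon = cv2.imread("static_icon.png")        # the icon that always sits on white
screens = ["screen_0.png", "screen_1.png"]  # screenshots to paste it onto

def paste_icon(icon, screen_path):
    """Copy-paste augmentation: random position/scale; the box doubles as the label."""
    bg = cv2.imread(screen_path)
    scale = random.uniform(0.8, 1.2)
    ic = cv2.resize(icon, None, fx=scale, fy=scale)
    h, w = ic.shape[:2]
    y = random.randint(0, bg.shape[0] - h)  # assumes the icon fits inside the screenshot
    x = random.randint(0, bg.shape[1] - w)
    bg[y:y + h, x:x + w] = ic               # hard paste; blend the edges if they show
    return bg, (x, y, w, h)

for i, path in enumerate(screens):
    img, box = paste_icon(icon, path)
    cv2.imwrite(f"synth_{i}.png", img)
```

Whether this beats plain oversampling depends on whether the deployment background really is always white; if it is, varying position and scale alone may already be enough.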

Related

Labeling images for object detection when the object is larger than the image

How should I label objects in order to detect them if the object is larger than the image? E.g., I want to label a building, but only part of the building is visible in the picture (windows and doors, without the roof). Or should I remove these pictures from my dataset?
Thank you!
In every object detection dataset I've seen, such objects just have the label cover whatever is visible, so the bounding box goes up to the border of the image.
It really depends on what you want your model to do if it sees an image like this. If you want it to be able to recognise partial buildings, then you should keep them in your dataset and label whatever is visible.
Don't label them. Discard them from your training set. The model needs to learn the difference between the negative class (background) and the positive classes (windows, doors). If a positive class takes up the whole image, the model will have a massive false-positive problem.

TensorFlow find and mark multiple image boundaries

My example is that I have an image with 5 other images on it. What's the best way to have TensorFlow find/calculate the bounding boxes for each of those? It needs to take into account that in other images there might only be 3 separate sub-images.
I've found that if I run cv2.Laplacian on the source image, it nicely outlines the 5 individual images, but I'm not sure how best to use TensorFlow to detect each of those bounding boxes.
UPDATE: My ONE issue is how do I use TensorFlow to find each image's boundaries? Obviously I can find the 4 corners of the whole image, but that doesn't help me; I need it to first know how many images there are and then find each of their boundaries.
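No answer was posted for this one, but building on the cv2.Laplacian observation above, classical contour detection can both count the sub-images and return a box for each, without TensorFlow; a rough sketch, assuming the sub-images stand out from a plain background:

```python
import cv2

img = cv2.imread("collage.png")  # hypothetical input containing N sub-images
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Edges via Laplacian, as the question suggests, then binarize.
edges = cv2.Laplacian(gray, cv2.CV_8U)
_, mask = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Close small gaps so each sub-image becomes one connected blob.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 15))
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

# One external contour per sub-image; the area filter drops noise specks.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 1000]
print(f"found {len(boxes)} sub-images: {boxes}")  # works whether there are 3 or 5
```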

Creating an ML model that detects card values

This is a more generic question about training an ML model to detect cards.
The cards are from a kids' game: 4 different colors, numbers, and symbols. I don't need to detect the color, just the value (a.k.a. the symbol) of the cards.
I took pictures of every card with my iPhone and used RectLabel to draw rectangles around the symbols in the upper-left corner (the cards also have an upside-down symbol in the lower-right corner; I didn't mark those, as they'll be hidden during detection).
I cropped the images so only the card is visible, no surroundings.
Then I uploaded my images to app.roboflow.ai and let them do their magic (using Auto-Orient, Resize to 416x416, Grayscale, Auto-Adjust Contrast, Rotation, Shear, Blur, and Noise).
That gave me another set of images, which I used to train my model with Apple's CreateML.
However, when I use that model in my app (I'm using Apple's Breakfast Finder demo), the cards' values aren't detected. Well, sometimes it works, but only at a certain distance from the phone, and the labels are either upside down or sideways.
My guess is this is because my images aren't taken the way they should be?
Any hints on how I'd have to set this whole thing up so my model gets trained well?
My bet would be on this being the problem:
"I cropped the images so only the card is visible, no surroundings."
You want your training images to be as similar as possible to the images your model will see in the wild. If it's trained only on images of cards with no surroundings, and you then show it images of cards with things around them, it won't know what to do.
This UNO scoring example is extremely similar to your problem and might provide some ideas and guidance.
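Not from the answer itself, but one way to act on that advice without re-shooting every card is to composite the cropped cards onto background photos with random rotation (which would also address the sideways and upside-down labels). A hedged sketch with Pillow, where the file names are placeholders:

```python
import random
from PIL import Image

card = Image.open("card_7.png").convert("RGBA")            # hypothetical cropped card
background = Image.open("table_scene.jpg").convert("RGBA")

# Random rotation; expand=True keeps the whole card, with transparent corners.
angle = random.uniform(0, 360)
rotated = card.rotate(angle, expand=True)

# Random position; assumes the card is smaller than the background.
x = random.randint(0, background.width - rotated.width)
y = random.randint(0, background.height - rotated.height)
background.paste(rotated, (x, y), mask=rotated)            # alpha mask keeps corners clean
background.convert("RGB").save("train_sample.jpg")
```

The resulting training images then contain cards with surroundings, at arbitrary angles, which is much closer to what the model sees in the app.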

TensorFlow: Collecting my own training data set & using that training data set to find the location of an object

I'm trying to collect my own training data set for image detection (recognition, for now). Right now I have 4 classes and 750 images for each. Each image is just a regular photo of its class; however, some of the images are blurry or contain outside elements such as a different background or other factors (but nothing distinguishable). Using that training data set, the image recognition results are really bad.
My questions are:
1. Does the training image set need to contain the object in various backgrounds/settings/environments (I believe not...)?
2. Let's say training worked fairly accurately and I want to know the location of the object in the image. I figure there is no way to find the location using image recognition alone, so if I use bounding boxes, how/where in the code can I see the location of the bounding box?
Thank you in advance!
It is difficult to know in advance what features your program will learn for each class. But then again, if your unseen images will have the same background, the background will play no role. I would suggest data augmentation in training: random color distortion, random flipping, and random cropping (a sketch follows below).
You can't see in the code where the bounding box is. You have to label/annotate the boxes yourself first in your collected data, using a tool such as LabelMe. Then comes training the object detector.
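A minimal sketch of those three augmentations using tf.image; the specific functions and the crop size are my assumptions, not the answerer's:

```python
import tensorflow as tf

def augment(image):
    """Random color distortion, flipping, and cropping, as suggested above."""
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_saturation(image, lower=0.8, upper=1.2)
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_crop(image, size=[200, 200, 3])  # assumes inputs >= 200x200
    return image

# Applied per element in a tf.data input pipeline:
# dataset = dataset.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
```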

Extract an object from an image using some image processing filter

I am working on an application where I have an image and, e.g., there is a glass, a cup, or a chair in it. The object can be of any type.
My question is: is there any way I can apply some image processing filters, or something like that, which returns an image that contains just the object with a transparent background?
You can use object detection methods such as
http://opencv.willowgarage.com/documentation/object_detection.html
http://docs.opencv.org/modules/objdetect/doc/latent_svm.html
to detect the object, plot a bounding box around it and extract it from the image.
Depending on your application, you can also use image differencing (background subtraction) to get the object...
Actually, I have solved the problem.
The issue was that I did not want to use any advanced method based on template matching or neural networks or anything like that.
In my case the aim was to recognize an object in an image, where that object could be anything (e.g., a table, a cellphone, a person, a shirt, etc.), and the catch was that there could be at most one object in an image.
Just using OpenCV's watershed segmentation, I was able to separate the object from the background.
But the threshold used for the watershed differs with the frequency content of the image and the difference in shade between the object and the background.
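The asker didn't post their code, but the standard OpenCV watershed recipe looks roughly like this; as noted above, the distance-transform threshold (0.5 here) is the part that has to be tuned per image:

```python
import cv2
import numpy as np

img = cv2.imread("scene.png")  # hypothetical image with a single object
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

# Sure background (dilated blob) and sure foreground (distance-transform peaks).
kernel = np.ones((3, 3), np.uint8)
sure_bg = cv2.dilate(binary, kernel, iterations=3)
dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg, sure_fg)

# Markers: label the sure regions; leave the unknown band 0 for watershed to resolve.
_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0
markers = cv2.watershed(img, markers)

# Label 1 is the background basin; everything above it is the object.
mask = markers > 1
result = np.zeros_like(img)
result[mask] = img[mask]  # object kept, background blanked (add alpha if needed)
```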