Extract an object from an image using some image processing filter - Objective-C

I am working on an application where I have an image containing some object, e.g. a glass, a cup, or a chair. The object can be of any type.
My question: is there any way to apply some image processing filters (or something similar) that returns an image containing just the object, with the background transparent?

You can use object detection methods such as
http://opencv.willowgarage.com/documentation/object_detection.html
http://docs.opencv.org/modules/objdetect/doc/latent_svm.html
to detect the object, draw a bounding box around it, and extract it from the image.

It depends on your application, but you can also use image differencing (background subtraction) to isolate the object...

Actually, I have solved the problem.
The issue was that I did not want to use any advanced method involving template matching, neural networks, or anything like that.
In my case the aim was to recognize an object in an image, and that object could be anything (e.g. a table, a cellphone, a person, a shirt), with the catch that there could be at most one object per image.
Using just OpenCV's watershed segmentation, I was able to separate the object from the background.
Note, however, that the threshold used for the watershed varies with the frequency content of the image and with how much the object's shades differ from the background.

Related

Images labeling for object detection when object is larger than the image

How should I label objects so they can be detected when the object is larger than the image? E.g. I want to label a building, but only part of the building is visible in the picture (windows and doors, without the roof). Or should I remove these pictures from my dataset?
Thank you!
In every object detection dataset I've seen, such objects will just have the label cover whatever is visible, so the bounding box will go up to the border of the image.
It really depends what you want your model to do if it sees an image like this. If you want it to be able to recognise partial buildings, then you should keep them in your dataset and label whatever is visible.
Don't label them. Discard them from your training set. The model needs to learn the difference between the negative class (background) and positive classes (windows, doors). If the positive class takes the whole image, the model will have a massive false positive problem.

How to solve object detection problem containing some static objects with no variation?

I am working on an object detection problem, using a region-based convolutional neural network to detect and segment objects in an image.
I have 8 objects to be segmented (Android app icons); 6 of them have variation in their background, and the remaining 2 always appear on a white (static) background.
I have taken 200 variations of each object and trained Mask R-CNN. My model depicts the patterns very well for the 6 objects with variation, but it struggles on the remaining 2 objects, even on the training set, even though they are an exact match.
Q: If I have n objects with variations and m static objects with no variation, do I need to oversample the static ones? Should I use any other technique in this case?
In the image, the icons in black bounding boxes are prone to change (based on the background and their position relative to it), but the icon in the green bounding box will not have any variation (it is always on a white background).
I have tried adding more images containing the static objects, but with no luck. Can anyone suggest how to approach such a problem? I don't want to use a sliding-window (static image processing) approach here.
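For reference, the oversampling the question asks about is usually done at the dataset-list level; a minimal sketch (the file names, class layout, and factor are all hypothetical) could look like:

```python
import random

# Hypothetical training list: (image_path, class_id) pairs.
# Classes 0-5 have natural variation; 6 and 7 are the static icons.
dataset = [(f"img_{i}.png", i % 8) for i in range(800)]
static_classes = {6, 7}

# Simple oversampling: repeat the static-class samples so each class
# contributes more examples per epoch.
static = [s for s in dataset if s[1] in static_classes]
oversample_factor = 3  # tune to balance the classes
balanced = dataset + static * (oversample_factor - 1)
random.shuffle(balanced)
```

Note that repeating identical images may not help much on its own; pasting the static icons onto varied synthetic backgrounds would add real variation, which the plain repetition above does not.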

Is it possible to use polygon data annotation to perform tensorflow object detection?

My problem is not exactly annotating data using polygons, circles, or lines; it's how to use such annotated data to generate a ".tfrecord" file and perform object detection. The tutorials I saw use rectangle annotations, like these: taylor swift detection, raccoon detection.
Rectangles would be fine for me if the objects I want to detect (pipelines) were not so close to each other.
Example of rectangle drawn in PASCAL VOC format:
<bndbox>
<xmin>82</xmin>
<xmax>172</xmax>
<ymin>108</ymin>
<ymax>146</ymax>
</bndbox>
Is there a way to add a "mask" to highlight some part of this bounding box?
If it's something unclear, please let me know.
You can go for instance segmentation instead of object detection if your objects are very close to each other; there you can use the polygons to generate masks, and bounding boxes, to train the model.
Consider this well-presented and easy-to-use repository for Mask R-CNN (a kind of instance segmentation):
https://github.com/matterport/Mask_RCNN
Check this for a lightweight Mask R-CNN.
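The polygon-to-mask step the answer mentions can be sketched in a few lines of Python; the polygon coordinates here are made up, Pillow rasterizes the annotation, and the bounding box the detector still needs falls out of the mask:

```python
import numpy as np
from PIL import Image, ImageDraw

# Hypothetical polygon annotation (x, y) points for one pipeline.
polygon = [(20, 30), (80, 30), (85, 60), (15, 60)]
height, width = 100, 100

# Rasterize the polygon into a binary mask.
mask_img = Image.new("L", (width, height), 0)
ImageDraw.Draw(mask_img).polygon(polygon, outline=1, fill=1)
mask = np.array(mask_img)

# Derive the bounding box from the mask.
ys, xs = np.nonzero(mask)
xmin, xmax = xs.min(), xs.max()
ymin, ymax = ys.min(), ys.max()
```

The mask and box would then be serialized into the `.tfrecord` alongside the image, in whatever feature layout the training pipeline expects.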

TensorFlow: Collecting my own training data set & Using that training dataset to find the location of object

I'm trying to collect my own training data set for image detection (well, recognition for now). Right now, I have 4 classes and 750 images for each. The images are just regular pictures of each class; however, some of them are blurry or contain extraneous elements, such as different backgrounds or other factors (but nothing distinguishable). Using that training data set, image recognition is really bad.
My question is,
1. Does the training image set need to contain the object in various backgrounds/settings/environments (I believe not...)?
2. Let's just say training worked fairly accurately and I want to know the location of the object in the image. I figure there is no way I can find the location using image recognition alone, so if I use bounding boxes, how/where in the code can I see the location of the bounding box?
Thank you in advance!
It is difficult to know in advance which features your program will learn for each class. Then again, if your unseen images will have the same background, the background will play no role. I would suggest data augmentation during training: random color distortion, random flipping, random cropping.
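A sketch of those augmentations using `tf.image` (the jitter ranges, the 90% crop, and the 100x100 image size are arbitrary illustrative values):

```python
import tensorflow as tf

def augment(image):
    """Random flip, color jitter, and crop-then-resize, as suggested above."""
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
    # Random crop to 90% of the original size, then resize back.
    image = tf.image.random_crop(image, size=(90, 90, 3))
    image = tf.image.resize(image, (100, 100))
    return image

image = tf.random.uniform((100, 100, 3))
augmented = augment(image)
```

In a real pipeline this function would be mapped over the training `tf.data.Dataset` so each epoch sees differently perturbed copies of the same pictures.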
You can't see in the code where the bounding box is. You first have to label/annotate the boxes yourself in your collected data, using a tool such as LabelMe. Then comes training the object detector.

How to tap on object from an image and track it from sequence of images using Vision and Core ML framework

I am developing an app using the new Core ML framework. What I am trying to achieve is as follows:
1. Select an image and tap on any object in it to draw a rectangle around it
2. After that, track that object across multiple images, just running in a for loop
Currently I am using the following process:
Detect the object when the user taps and store it: VNDetectedObjectObservation = VNDetectedObjectObservation(boundingBox: convertedRect)
Create a VNTrackObjectRequest for VNImageRequestHandler to perform the request
But I am not getting proper results. Any help will be appreciated.
I am not familiar with Core ML and Objective-C, so I can't offer you a code example, but since nobody has given you an answer, I would like to describe the way I would solve this manually:
Get the tapped point and expand a region (of interest) around it, like an N x N square.
Perform a classification on the tapped region, so the algorithm can detect the structure in the consecutive frames.
Store the location in the current frame, then expand that region for the following frame and use this expanded region to detect the object in it.
With this strategy you can use the expanded region from step 3 for an object detection task, which you could solve with a YOLO implementation. It is much faster than feeding the whole frame into an object detector, because it only performs detection on a small region.
I hope this helps you at least a bit.