Hello, I'm new to the TensorFlow object detection area. I tagged my images with the labelImg program and then trained a model, but among the results I got multiple detections on a single object. What can I do to prevent this?
It is normal to get multiple detections with different scores. Post-process your results: apply a score threshold and merge detections that sit at the same position (non-maximum suppression). Check out this guide.
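For example, in TF2 you can do the merging with tf.image.non_max_suppression; a minimal sketch with made-up boxes and scores:

```python
import tensorflow as tf

# Hypothetical detections: boxes as [y1, x1, y2, x2], one score each.
boxes = tf.constant([[0.10, 0.10, 0.50, 0.50],
                     [0.12, 0.11, 0.52, 0.49],   # near-duplicate of the first box
                     [0.60, 0.60, 0.90, 0.90]])
scores = tf.constant([0.90, 0.75, 0.80])

# Keep at most 10 boxes, drop any box overlapping a kept box by IoU > 0.5,
# and discard anything scoring below 0.6.
keep = tf.image.non_max_suppression(
    boxes, scores, max_output_size=10,
    iou_threshold=0.5, score_threshold=0.6)

final_boxes = tf.gather(boxes, keep)
final_scores = tf.gather(scores, keep)
print(final_boxes.numpy(), final_scores.numpy())
```

The near-duplicate second box is suppressed because it overlaps the higher-scoring first box.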
I want to train a neural network to detect symbols on a car license plate.
I have 10k pictures with plates, and 10k strings containing the text shown on the plates. For example, this picture is named "В394ТТ64.png" (the other pictures have roughly the same quality and size, but different shadows/contrast/lighting and so on).
So, what do I want to do?
I want to automatically create PASCAL VOC xml files, containing information about each symbol on a plate. Then I want to train neural network to detect symbols and their classes. I already know which symbols appear on picture, but I don't know how to get bounding box coordinates.
I tried to use OpenCV and binary segmentation, but the lighting, shadows, size, and noise in the pictures are too varied.
Also, I tried to find trained neural networks that can detect symbols, and to train one myself, but failed.
So, how can I get bounding box for each symbol on a license plate?
There are multiple methods to do this.
Mainly, you will have to go over your image and do object detection on each segment of the image.
In your case that should be easier, as the plate is already a defined area; you could probably move from left to right in strides (a sliding window).
Using an MNIST-trained classifier, you can classify the digit in each image part. If you get a result with a probability of, e.g., 90%, you take the coordinates of that part of the image as your bounding box coordinates.
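A minimal sketch of that sliding-window idea; the classifier (a Keras model on 28x28 crops) and the plate image are assumptions, and the plate is assumed resized so its height matches the window:

```python
import numpy as np

# `classifier` is a hypothetical Keras model trained on MNIST-style 28x28 crops;
# `plate` is a grayscale license-plate image as a 2D numpy array whose height
# has been resized to the window size.
def sliding_window_detect(plate, classifier, win=28, stride=4, p_min=0.9):
    boxes = []
    h, w = plate.shape
    for x in range(0, w - win + 1, stride):
        crop = plate[0:win, x:x + win]
        # Add batch and channel dims: (1, win, win, 1).
        probs = classifier.predict(crop[None, ..., None], verbose=0)[0]
        cls, p = int(np.argmax(probs)), float(np.max(probs))
        if p >= p_min:  # confident hit -> record (xmin, ymin, xmax, ymax, class, score)
            boxes.append((x, 0, x + win, win, cls, p))
    return boxes
```

Overlapping hits from adjacent strides can then be merged with non-maximum suppression, as in the earlier answer.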
You can of course reuse known architectures such as R-CNN or YOLO.
Here you can find a nice overview.
Good luck
Found another way to solve this problem.
I wrote a script that generates different images with number plates, plus a PASCAL VOC XML file for each image (see the sketch after these steps). I generated 10k images.
Then I augmented them so they look more like "real world" images. Now I have 14k images: 4k from the original set and 10k augmented.
I trained an ssd_mobilenet model on them.
After that, I used auto-annotation to detect boxes on the real images.
Then I trained the model one more time, and that's it.
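For the generation step, writing the PASCAL VOC XML is straightforward because the script knows exactly where it placed each symbol. A minimal sketch with made-up boxes and filenames:

```python
import xml.etree.ElementTree as ET

# Write a PASCAL VOC annotation for one generated plate image.
# `boxes` is a list of (label, xmin, ymin, xmax, ymax) tuples, known exactly
# because the generator placed the symbols itself.
def write_voc_xml(filename, width, height, boxes, out_path):
    ann = ET.Element('annotation')
    ET.SubElement(ann, 'filename').text = filename
    size = ET.SubElement(ann, 'size')
    ET.SubElement(size, 'width').text = str(width)
    ET.SubElement(size, 'height').text = str(height)
    ET.SubElement(size, 'depth').text = '3'
    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(ann, 'object')
        ET.SubElement(obj, 'name').text = label
        bb = ET.SubElement(obj, 'bndbox')
        for tag, val in zip(('xmin', 'ymin', 'xmax', 'ymax'),
                            (xmin, ymin, xmax, ymax)):
            ET.SubElement(bb, tag).text = str(val)
    ET.ElementTree(ann).write(out_path)

# Example with invented coordinates:
write_voc_xml('plate_0001.png', 520, 112,
              [('B', 10, 20, 60, 100), ('3', 70, 20, 120, 100)],
              'plate_0001.xml')
```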
I am able to use TensorFlow to train a model on my own dataset. For example, I have trained a model to detect only safety helmets, and the results are good.
My plan for the next step is to classify the detected safety helmets by color, but I am still searching for methods.
I am wondering whether I should retrain the model with a different label map, like [item1 red_helmet] [item2 blue_helmet], and label my training dataset accordingly. Or is there any other tricky way to achieve the same outcome?
You already have the region of interest in the picture.
All you need is to extract the helmets from the picture and pass the cropped images to an OpenCV routine that can detect colours.
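A minimal sketch, assuming `frame` is the BGR image and the box comes from your detector; the hue ranges here are illustrative and need tuning for your camera and lighting:

```python
import cv2
import numpy as np

# Classify a detected helmet crop by its dominant hue in HSV space.
def helmet_color(frame, box):
    x1, y1, x2, y2 = box
    crop = frame[y1:y2, x1:x2]
    hsv = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
    # Illustrative hue spans (red wraps around the hue circle).
    ranges = {'red':    [(0, 10), (170, 180)],
              'blue':   [(100, 130)],
              'yellow': [(20, 35)]}
    counts = {}
    for name, spans in ranges.items():
        mask = np.zeros(hsv.shape[:2], np.uint8)
        for lo, hi in spans:
            mask |= cv2.inRange(hsv, (lo, 80, 80), (hi, 255, 255))
        counts[name] = int(cv2.countNonZero(mask))
    # The colour whose mask covers the most pixels wins.
    return max(counts, key=counts.get)
```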
That's it, you are done :)
I have encountered a problem with the MNIST dataset on TensorFlow. As you probably know, batching does not preserve the order of the dataset, but I need to know exactly which of the samples I am working on. Does TF have any kind of indicator, such as an ID, that tells you which images it has extracted? For instance, in one batch we may get images 20, 1, 4, 6 and in another we get 3, 7, 88, etc. I want to have access to these IDs; is this possible?
You can always add your own indicator: when you enqueue the features and labels, you can enqueue the indicator as well.
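With the newer tf.data API the same idea is a one-liner: attach an index with Dataset.enumerate() before shuffling and batching, so each batch carries the original sample IDs along. A minimal sketch:

```python
import tensorflow as tf

# Attach an index to every example before shuffling/batching.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()

ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
ds = ds.enumerate()                      # -> (index, (image, label))
ds = ds.shuffle(60000).batch(32)

for idx, (images, labels) in ds.take(1):
    print(idx.numpy())                   # the original MNIST indices in this batch
```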
I was working with the recently released TensorFlow API for object detection, with Faster R-CNN on ResNet-101, on my own dataset. It trains and evaluates on the validation data, but I was hoping there was a way to store the predicted bounding boxes for all images in the eval set in a file, or to find the place in the source code where I can get the predicted bounding boxes together with the image names.
If you just want to obtain the detected bounding boxes given a set of images, the Jupyter notebook contains a good example of how to do this.
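If you want to dump the boxes to a file rather than visualize them, here is a sketch following the notebook's pattern: a TF1 frozen graph exported by the Object Detection API (the tensor names below are the API's standard exported names); the graph path and image list are placeholders for your own files:

```python
import numpy as np
import tensorflow as tf
from PIL import Image

PATH_TO_GRAPH = 'frozen_inference_graph.pb'      # your exported model
IMAGE_PATHS = ['eval/img_001.jpg', 'eval/img_002.jpg']  # your eval images

# Load the frozen detection graph.
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_GRAPH, 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with detection_graph.as_default(), tf.Session() as sess:
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    fetches = [detection_graph.get_tensor_by_name(n + ':0')
               for n in ('detection_boxes', 'detection_scores', 'num_detections')]
    with open('eval_boxes.txt', 'w') as out:
        for path in IMAGE_PATHS:
            image = np.array(Image.open(path))   # HxWx3 uint8
            boxes, scores, num = sess.run(
                fetches, feed_dict={image_tensor: image[None]})
            n = int(num[0])
            for box, score in zip(boxes[0][:n], scores[0][:n]):
                # Boxes are normalized [ymin, xmin, ymax, xmax].
                out.write('%s %s %.3f\n' % (path, box.tolist(), score))
```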
We have been using TensorFlow for image classification, and we all see the results for the Admiral Grace Hopper example image; we get:
military uniform (866): 0.647296
suit (794): 0.0477196
academic gown (896): 0.0232411
bow tie (817): 0.0157356
bolo tie (940): 0.0145024
I was wondering if there is any way to get the coordinates for each category within the image.
TensorFlow doesn't have sample code yet for image detection and localization, but it's an open research problem with different approaches to doing it using deep nets; for example, you can look up the papers on the algorithms OverFeat and YOLO (You Only Look Once).
Also, there is usually some preprocessing of the object coordinate labels, or postprocessing to suppress duplicate detections. Usually a second, different network is used to classify the object once it's detected.