I have trained my custom object detector using faster_rcnn_inception_v2 and tested it with object_detection_tutorial.ipynb, and it works perfectly: I can find bounding boxes for the objects inside the test image. My problem is how to actually count those bounding boxes, or simply how to count the number of objects detected for each class.
Because of low reputation I cannot comment.
As far as I know, the object detection API unfortunately has no built-in function for this.
You have to write this function yourself. I assume you run eval.py for evaluation? To access the individual detected objects for each image, you have to follow this chain of scripts:
eval.py -> evaluator.py -> object_detection_evaluation.py -> per_image_evaluation.py
In the last script you can count the detected objects and bounding boxes per image. You just have to save the numbers and sum them up over your entire dataset.
Does this already help you?
I solved this using the TensorFlow Object Counting API. There is an example of counting objects in an image using single_image_object_counting.py. I just replaced ssd_mobilenet_v1_coco_2017_11_17 with my own model containing the inference graph:
input_video = "image.jpg"
detection_graph, category_index = backbone.set_model(MODEL_DIR)
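For completeness, here is a minimal counting sketch, assuming inference is run as in object_detection_tutorial.ipynb (so output_dict holds the standard detection_classes/detection_scores arrays and category_index comes from the label map); the 0.5 score threshold is just an example value:

def count_detections(output_dict, category_index, min_score=0.5):
    """Count detected boxes per class name, keeping only confident detections."""
    counts = {}
    for cls, score in zip(output_dict['detection_classes'],
                          output_dict['detection_scores']):
        if score >= min_score:
            name = category_index[cls]['name']
            counts[name] = counts.get(name, 0) + 1
    return counts

# e.g. {'car': 3, 'person': 2}; sum(counts.values()) gives the total number of boxes.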
I am trying to use DeepLabV3 for image segmentation and object detection/classification on Coral.
I was able to successfully run the semantic_segmentation.py example using DeepLabV3 on the Coral, but that only shows an image with an object segmented.
I see that it assigns labels to colors. How do I associate the labels.txt file that I made based on the label info of the model with these colors? (How do I know which color corresponds to which label?)
When I try to run the
engine = DetectionEngine(args.model)
using the DeepLab model, I get the error:
ValueError: Dectection model should have 4 output tensors!This model has 1.
I guess this is the wrong approach?
Thanks!
I believe you have reached out to us regarding the same query. I just wanted to paste the answer here for others to reference:
"The detection model usually have 4 output tensors to specifies the locations, classes, scores, and number and detections. You can read more about it here. In contrary, the segmentation model only have a single output tensor, so if you treat it the same way, you'll most likely segfault trying to access the wrong memory region. If you want to do all three tasks on the same image, my suggestion is to create 3 different engines and feed the image into each. The only problem with this is that each time you switch the model, there will likely be data transfer bottleneck for the model to get loaded onto the TPU. We have here an example on how you can run 2 models on a single TPU, you should be able to modify it to take 3 models."
On a last note, I just saw that you added:
how do I associate the labels.txt file that I made based on the label info of the model with these colors
I just don't think this is something you can do for a segmentation model, but maybe I'm just confused about your query?
Take an object detection model for example: there are 4 output tensors, and the second tensor gives you an array of IDs associated with a certain class that you can map to a label file. Segmentation models only give the pixels surrounding an object.
[EDIT]
Apologies, it looks like I'm the one confused about segmentation models.
Quote from my colleague :)
"If you are interested in knowing the name of the label, you can find the corresponding integer for that label in the result array in semantic_segmentation.py, where result is the classification data for each pixel.
For example:
if you print the result array with bird.jpg as input, you will find a few pixels whose value is 3, which corresponds to the 4th label in pascal_voc_segmentation_labels.txt (as indexing starts at 0)."
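To make that concrete, here is a small sketch of mapping the per-pixel class indices in result back to names from a labels file. It assumes result is the 2D array of class indices produced by the segmentation engine in semantic_segmentation.py and that the labels file has one label per line in class-index order (the file name is a placeholder):

import numpy as np

def labels_in_result(result, labels_path='my_labels.txt'):
    """Return {label_name: pixel_count} for every class index present in result."""
    with open(labels_path) as f:
        labels = [line.strip() for line in f]
    result = np.asarray(result)
    return {labels[int(i)]: int(np.sum(result == i)) for i in np.unique(result)}

# With bird.jpg as input this would, for example, report the label at index 3.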
When training a single-class object detector in TensorFlow, I am trying to pass instances of images where no signal object exists, so that the model doesn't learn that every image contains at least one instance of that class. E.g., if my signal were cats, I'd want to pass pictures of other animals/landscapes as background. This could also reduce false positives.
I can see that a class id (0) is reserved in the object detection API for background, but I am unsure how to encode this into the TFRecords for my background images: the class could be 0, but what would be the bounding box coordinates? Or do I need a simpler classifier on top of this model to detect whether there is a signal in the image or not, prior to detecting its position?
The latter approach of a simple classifier makes sense. I don't think there is a way to do the first part. You can also add a check on the confidence score, apart from checking whether the object is present.
It is good practice to create a dataset that also contains images with no objects of interest. For that you need to use the same tools (like labelImg) that you used for adding the boxes; an image with no bounding boxes will have an XML file with no bounding-box details, only details of the image. The create-TFRecord script will then create the TFRecord from the XML files. Look at the links below for more information:
Create TFRecord example:
https://github.com/tensorflow/models/blob/master/research/object_detection/dataset_tools/create_pet_tf_record.py
Using your own dataset:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md
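As a rough illustration of what a background (negative) example looks like in the TFRecord, here is a sketch in the style of create_pet_tf_record.py; all the box and class lists are simply left empty, and the path and image size are placeholders you would fill from your own annotation parsing:

import tensorflow as tf
from object_detection.utils import dataset_util

def create_background_example(image_path, width, height):
    """Build a tf.train.Example for an image with no ground-truth boxes."""
    with tf.gfile.GFile(image_path, 'rb') as f:
        encoded_jpg = f.read()
    feature = {
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(image_path.encode('utf8')),
        'image/source_id': dataset_util.bytes_feature(image_path.encode('utf8')),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(b'jpeg'),
        # Empty lists: no bounding boxes or classes for a pure background image.
        'image/object/bbox/xmin': dataset_util.float_list_feature([]),
        'image/object/bbox/xmax': dataset_util.float_list_feature([]),
        'image/object/bbox/ymin': dataset_util.float_list_feature([]),
        'image/object/bbox/ymax': dataset_util.float_list_feature([]),
        'image/object/class/text': dataset_util.bytes_list_feature([]),
        'image/object/class/label': dataset_util.int64_list_feature([]),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))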
I am using this code to train a word2vec model. I am trying to train it incrementally, using saver.restore(). I am using new data after restoring the model. Since the vocabulary sizes for the old data and the new data are not the same, I got an exception like this:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [28908,200] rhs shape= [71291,200]
Here 71291 is the vocabulary size for the old data and 28908 is for the new data.
It gets the vocabulary words from the train_data file here, and constructs the network model using the size of the vocabulary. I thought that if I could set the vocabulary size to be the same for my old data and new data, I could solve this problem.
So, my question is: can I do that in this code? As far as I understand, I cannot reach the skipgram_word2vec() function.
Or is there any other way of solving this issue in this code, besides what I thought? If it is not possible using this code, I will try other approaches for my purpose.
Any help is appreciated.
Having taken a look at the source of word2vec_optimized.py, I'd say you will need to change the code there. It operates by opening a text file right up front as "training data". For your purposes, you have to change the build_graph method and allow it to take an option that sets all that data (words, counts, words_per_epoch, current_epoch, total_words_processed, examples, labels, opts.vocab_words, opts.vocab_counts, opts.words_per_epoch) at initialization, instead of reading it from a text file.
Then you need to merge the two text files and load them once to produce the vocabulary. Then save all the data above and use it to restore the network on each subsequent run.
If you use more than two texts, however, you need to include all the text you plan to use in the first dataset in order to produce the vocabulary.
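A minimal sketch of that "merge first, build one vocabulary" idea (file names are placeholders): concatenate the old and new corpora into a single training file, so that the vocabulary, and therefore the shape of the embedding matrix, stays fixed across runs:

import collections

vocab = collections.Counter()
with open('merged_train.txt', 'w', encoding='utf-8') as merged:
    for path in ('old_train.txt', 'new_train.txt'):
        with open(path, encoding='utf-8') as f:
            for line in f:
                merged.write(line)
                vocab.update(line.split())

print('vocabulary size:', len(vocab))  # this size is now the same in every run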
I'm getting started with TensorFlow and using retrain.py to teach it some new categories. This works well; however, I have some questions:
In the comments of retrain.py it says:
"This produces a new model file that can be loaded and run by any TensorFlow
program, for example the label_image sample code"
However, I haven't found where this new model file is saved to?
Also: it does contain the whole model, right? Not just the retrained part?
Thanks for clearing this up
1) I think you may want to save the new model.
When you want to save a model after some processing, you can use
saver.save(sess, 'directory/model-name', *optional-arg).
Check out https://www.tensorflow.org/api_docs/python/tf/train/Saver
If you change model-name by epoch or any measure you would like to use, you can save a new model each time (otherwise, it may overwrite the previously saved models).
You can find the saved model by searching for the 'checkpoint', '.index', and '.meta' files.
2) Saving the whole model or just part of it?
This is the part where you need to learn a bunch of ideas about tf.Session and savers. You can save either the whole model or just part of it; it's up to you. Again, start from the link above. The key idea is that you put the variables you would like to save in a list passed as 'var_list' in the link, and you can save only those. When you restore them, you also need to specify which variables in your model correspond to the variables in the loaded checkpoint.
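As a rough TF1-style sketch of the var_list idea (the variable and checkpoint names here are just placeholders, not anything retrain.py produces):

import tensorflow as tf

w = tf.Variable(tf.random_normal([10, 5]), name='retrained/weights')
b = tf.Variable(tf.zeros([5]), name='retrained/biases')

# Save only these two variables instead of everything in the graph.
saver = tf.train.Saver(var_list={'retrained/weights': w, 'retrained/biases': b})

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training steps would go here ...
    saver.save(sess, 'directory/model-name', global_step=0)

# Restoring later uses a saver built with the same var_list, mapping the names
# stored in the checkpoint to the corresponding variables in the new graph:
# saver.restore(sess, 'directory/model-name-0')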
While running retrain.py you can pass the --output_graph and --output_labels parameters, which specify where to save the graph (the default is /tmp/output_graph.pb) and the labels. You can change those as per your requirements.
I want to find multiple objects in a scene (the objects look the same, but may differ in scale and rotation, and I don't know in advance what the object to be detected will be). I have implemented the following idea, based on the feature detectors in OpenCV, which works:
detect and compute keypoints from the object
for i < max_objects_todetect; i++
1. detect and compute keypoints from the whole scene
2. match scene and object keypoints with the FLANN matcher
3. use findHomography/RANSAC to compute the bounding box of the first object (the object which has the most keypoints in the scene with multiple objects)
4. set the pixels in the scene which are within the computed bounding box to 0 -> in the next loop cycle there are no keypoints left to detect for this object.
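In code, the loop above looks roughly like this (a sketch using Python/OpenCV; ORB and a brute-force matcher are used here only to keep the example short, and the image names are placeholders):

import cv2
import numpy as np

obj_img = cv2.imread('object.png', cv2.IMREAD_GRAYSCALE)
scene_img = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)

detector = cv2.ORB_create(1500)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
obj_kp, obj_des = detector.detectAndCompute(obj_img, None)

max_objects_todetect = 5
for i in range(max_objects_todetect):
    # 1. keypoints are re-detected on the (partially blacked-out) scene every iteration
    scene_kp, scene_des = detector.detectAndCompute(scene_img, None)
    if scene_des is None:
        break
    # 2. match object and scene descriptors
    matches = matcher.match(obj_des, scene_des)
    if len(matches) < 4:
        break
    src = np.float32([obj_kp[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([scene_kp[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # 3. estimate the homography with RANSAC and project the object corners
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        break
    h, w = obj_img.shape
    corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
    box = cv2.perspectiveTransform(corners, H)
    # 4. zero out the detected region so its keypoints disappear next iteration
    cv2.fillConvexPoly(scene_img, np.int32(box).reshape(-1, 2), 0)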
The problem with this implementation is that I need to compute the keypoints for the scene multiple times, which takes a lot of computing time (250 ms). Does anyone have a better idea for detecting multiple objects?
Thanks Drian
Hello everyone, I tried ORB, which is indeed faster, and I will try AKAZE.
While testing ORB I encountered the following problem:
While changing the size of my picture doesn't affect the keypoints detected by SURF (it finds the same keypoints in the small and in the big picture (on the right in the linked picture)), it does affect the keypoints detected by ORB. In the small picture I'm not able to find these keypoints. I tried to experiment with the ORB parameters but couldn't make it work.
Picture: http://www.fotos-hochladen.net/view/bildermaf6d3zt.png
SURF:
cv::Ptr<cv::xfeatures2d::SURF> detector = cv::xfeatures2d::SURF::create(100);
ORB:
cv::Ptr<cv::ORB> detector = cv::ORB::create( 1500, 1.05f,16, 31, 0, 2, ORB::HARRIS_SCORE, 2, 10);
Do you know if, and how, it's possible to detect the same keypoints independently of the size of the pictures?
Greetings Drian