How to apply object tracking on an object that shows different sides? - object-detection

I have been working on a truck detection model using YOLOv4 (Darknet). The model performance is good. I have applied deep_sort object tracking to track the activity of the trucks in that region.
The problem with this approach is that a truck's identity changes when it turns and shows a different side to the camera, or when it is occluded by another object.
Is there a way to make sure that truck ID does not change?
Link to a demo inference video
I have trained the model specifically for this video. Object detection works fine, but the tracking ID changes.
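For context, the tracker is constructed roughly like this (a sketch assuming the reference nwojke/deep_sort implementation; module paths depend on how the code is vendored). The commented parameters are the knobs that seem to govern how long an ID survives occlusions and viewpoint changes:

```python
# Sketch assuming the reference implementation at
# https://github.com/nwojke/deep_sort; module paths depend on your setup.
from deep_sort import nn_matching
from deep_sort.tracker import Tracker

# A looser cosine threshold makes re-identification after a viewpoint
# change more forgiving; a larger max_age keeps a lost track alive
# through occlusions instead of deleting it and issuing a new ID.
metric = nn_matching.NearestNeighborDistanceMetric(
    "cosine",
    matching_threshold=0.4,  # reference default is 0.2; raise cautiously
    budget=100,              # appearance samples kept per track
)
tracker = Tracker(
    metric,
    max_age=60,  # frames a track survives without a matched detection
    n_init=3,    # consecutive hits before a new track is confirmed
)
```

From what I understand, the appearance embedding matters as much as these thresholds: the stock encoder is trained on pedestrian data, so a vehicle re-ID model should be more robust to a truck changing sides.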

Related

Multi-label image classification vs. object detection

For my next TF2-based computer vision project I need to classify images into a pre-defined set of classes. However, multiple objects of different classes can occur in one such image. That sounds like an object detection task, so I guess I could go for that.
But: I don't need to know where in an image each of these objects is, I just need to know which classes of objects are visible in an image.
Now I am wondering which route to take. I am particularly interested in a high accuracy/quality solution, so I would prefer the approach that leads to better results. From your experience, should I still go for an object detector, even though I don't need to know the location of the detected objects in the image, or should I rather build an image classifier that outputs all the classes present in an image? Is this even an option; can a "normal" classifier output multiple classes?
Since you don't need the object localization, stick to classification only.
You may be tempted to use a standard off-the-shelf multi-class, multi-label object detection network because of its reusability, but realize that you are asking the model to do more. If you have tons of data, that is not a problem. Similarly, if your objects resemble the ones in ImageNet/COCO etc., you can simply take a standard off-the-shelf object detection architecture and fine-tune it on your dataset.
However, if you have little data and you need to train from scratch (e.g. medical images, weird objects), then object detection will be overkill and will give you inferior results.
Remember, most object detection networks recycle classification architectures, modifying the last layers to add outputs for the bounding-box coordinates. There is a loss function associated with those additional outputs, and during training some classification accuracy is traded away to get better localization coordinates. You don't need that trade-off, so you can modify the last layer of the object detection network and remove the coordinate outputs.
Again, all this hassle is worth it only if you have little data and really need to train from scratch.
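As for whether a "normal" classifier can output multiple classes: yes, if you use independent sigmoid outputs with a binary cross-entropy loss instead of a softmax. A minimal TF2/Keras sketch (the backbone choice and class count are placeholder assumptions):

```python
import tensorflow as tf

NUM_CLASSES = 5  # placeholder: size of your pre-defined class set

# Pretrained backbone used purely as a feature extractor.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, pooling="avg")
base.trainable = False  # unfreeze later for fine-tuning if needed

model = tf.keras.Sequential([
    base,
    # Sigmoid, not softmax: each class gets an independent probability,
    # so several classes can be "on" for the same image.
    tf.keras.layers.Dense(NUM_CLASSES, activation="sigmoid"),
])

# Binary cross-entropy treats each class as its own yes/no decision;
# labels are multi-hot vectors such as [1, 0, 1, 0, 0].
model.compile(optimizer="adam",
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=[tf.keras.metrics.BinaryAccuracy()])
```

At inference time you threshold each sigmoid output (e.g. at 0.5) to decide which classes are present in the image.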

Continue training CoreML Model

I'm trying to get a better understanding of how to create object detection models in Turi Create (for use in CoreML). I'm trying to create a model that detects custom images I designed and printed myself. To avoid having to take a huge number of photos, I figured I'd use the one-shot object detection feature provided by Turi Create. So far so good: I feed the algorithm two starter images, and it successfully generates the synthetic data set and creates a somewhat reliable model.
Now I'm wondering what happens when I want to add a third category. I could of course add a third starter image and run the code again, but it feels like two-thirds of the work would be redundant...
Is there a way to continue training a previously trained model, or to combine multiple models, so I don't have to retrain my models from scratch every time I add a category? If not, is there another way to get this done (e.g. TensorFlow)?
Turi Create is rather limited in the options it offers for retraining (none, basically). If you want more control over the process, using a tool such as TensorFlow is the better choice.
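For the TensorFlow route, the usual way to avoid retraining from scratch is to keep the trained feature extractor and only replace the final classification head when a category is added. A rough Keras sketch of the idea, assuming the old model ends in a single Dense classification layer (the file name and layer indexing are placeholders, not Turi Create APIs):

```python
import tensorflow as tf

# Load the previously trained classifier (file name is a placeholder).
old_model = tf.keras.models.load_model("detector_2_classes.keras")

# Reuse everything except the old classification head; layers[-2]
# assumes the head is a single final Dense layer.
backbone = tf.keras.Model(inputs=old_model.input,
                          outputs=old_model.layers[-2].output)
backbone.trainable = False  # freeze so the new head can settle first

# New head with one extra class; only this part starts from scratch.
outputs = tf.keras.layers.Dense(3, activation="softmax")(backbone.output)
new_model = tf.keras.Model(backbone.input, outputs)

new_model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
# new_model.fit(...) on old data plus the new category, then
# optionally unfreeze the backbone and fine-tune end to end.
```

Extending a full object detector's class predictor follows the same reuse-the-backbone idea, though the surgery is more involved than for a plain classifier.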

Tensorflow: Does object detection API returns detected object id

I am using the TensorFlow Object Detection API on a Windows system and it has been working fine. For now I am using the pre-trained ssd_mobilenet_v1_coco_11_06_2017 model. It easily detects all the objects in a given input video. I want to record the time each object spends on screen; for example, if a car enters a video at 00:05 and leaves at 00:15, it was in view for 10 seconds.
To achieve this, I was looking for something like an ID for each detected object returned by the API, so that I can start a timer in the code to measure an object's time in view. Is there any built-in functionality for this in the API?
TensorFlow object detection does not provide such functionality, but you can use the KCF algorithm (readily available in OpenCV) to track the object.
https://www.docs.opencv.org/3.4.1/d2/dff/classcv_1_1TrackerKCF.html
Alternatively, you can run SORT on top of the Object Detection API; it uses a Kalman filter and is easy to integrate.
https://github.com/abewley/sort/blob/master/sort.py
The Tensorflow Object Detection API does not currently track objects between frames.
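For illustration, a minimal sketch of the OpenCV KCF tracker suggested above; the video path and initial box are placeholders, and on OpenCV 4.5+ with the contrib package the constructor may be cv2.legacy.TrackerKCF_create instead:

```python
import cv2

cap = cv2.VideoCapture("input.mp4")  # placeholder path
ok, frame = cap.read()

# Initial bounding box (x, y, w, h) would normally come from your
# detector's first hit on the object; hard-coded here for illustration.
bbox = (100, 150, 80, 60)

tracker = cv2.TrackerKCF_create()
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, bbox = tracker.update(frame)  # success flag and new box
    if found:
        x, y, w, h = map(int, bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("tracking", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

In practice you would seed bbox from the detector's first hit on the car and read off the timestamps of the frames where tracking starts and stops succeeding to get the time in view.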

Retrain TF object detection API to detect a specific car model -- How to prepare the training data?

I am new to object detection and trying to retrain the TensorFlow Object Detection API to detect a specific car model in photos. When preparing my own training data, besides things like drawing bounding boxes, my question is: should I also prepare negative examples (cars that are not the model I am interested in) to reach good performance?
I have read through some tutorials, and they usually give an example of detecting one type of object, with training data labelled only for that type. Since the model first proposes regions of interest and then tries to classify them, I was wondering whether I should also prepare negative examples when I want to detect something very specific in photos.
I am retraining a faster_rcnn-based model. Thanks for the help.
Yes, you will also need negative examples for better performance. It seems you are planning to use transfer learning to train a pre-trained faster_rcnn model on a new class for your custom car. You should start with a roughly equal number of positive and negative examples (images with labelled bounding boxes). You will need examples of several negative classes (e.g. negative car type 1, negative car type 2, negative car type 3) in addition to your target car type.
You can look at an example of training data with one positive class and several negative classes for transfer learning in the data folder of my GitHub repo at: PSV Detector Github
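To make the training-data side concrete, here is a hedged sketch of writing one labelled image, with a positive box and a negative-class box, into a TFRecord for the Object Detection API. The feature keys follow the API's standard create_tf_record examples; the file names, image size, class names, and coordinates are placeholder assumptions:

```python
import tensorflow as tf

def _bytes(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))

def _floats(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))

def _ints(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

# Placeholder image containing the target car (class 1) and a
# distractor car labelled as its own negative class (class 2).
with open("street_scene.jpg", "rb") as f:
    encoded_jpg = f.read()

example = tf.train.Example(features=tf.train.Features(feature={
    "image/encoded": _bytes([encoded_jpg]),
    "image/format": _bytes([b"jpeg"]),
    "image/height": _ints([720]),
    "image/width": _ints([1280]),
    # Normalized corner coordinates, one entry per labelled object.
    "image/object/bbox/xmin": _floats([0.10, 0.55]),
    "image/object/bbox/ymin": _floats([0.40, 0.35]),
    "image/object/bbox/xmax": _floats([0.45, 0.90]),
    "image/object/bbox/ymax": _floats([0.80, 0.75]),
    "image/object/class/text": _bytes([b"target_car", b"other_car"]),
    "image/object/class/label": _ints([1, 2]),
}))

with tf.io.TFRecordWriter("train.record") as writer:
    writer.write(example.SerializeToString())
```

Each negative class gets its own ID in the label map, so the classifier stage learns to separate the target car from lookalikes rather than from background alone.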

Tensorflow object detection API not detecting all objects

I am attempting to use the TensorFlow Object Detection API. To check things out, I used a pretrained model and ran it on an image that I created.
But the API does not detect all the objects in the image, even though they are copies of the same dog image. I used the ssd_mobilenet_v1_coco pretrained model.
I have attached the final output image with the detected objects.
Output image with the detected objects
Any pointers on why that might be happening? Where should I be start looking into to improve this?
The TensorFlow Object Detection API comes with 5 pre-trained models, each trading off speed against accuracy. Single Shot Detectors (SSD) are designed for speed, not accuracy, which is why they are the preferred models for mobile devices and real-time video detection.
Running your image of 5 dogs through an R-FCN model (rfcn_resnet101_coco_11_06_2017), which is designed for accuracy over speed, detects all 5 dogs. However, that model isn't suited to real-time detection, as it struggles to reach a respectable frame rate.
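Before switching models, it is also worth checking whether the SSD actually found the dogs at low confidence: the API's example notebook hides every detection scoring below 0.5. A small sketch, assuming boxes/classes/scores/category_index came from a standard detection run with the object_detection utilities:

```python
from object_detection.utils import visualization_utils as vis_util

# boxes, classes, scores are assumed to be the arrays returned by a
# detection run; category_index maps class IDs to display names.
vis_util.visualize_boxes_and_labels_on_image_array(
    image_np,
    boxes,
    classes.astype(int),
    scores,
    category_index,
    use_normalized_coordinates=True,
    min_score_thresh=0.2,  # example notebook default is 0.5
    line_thickness=4,
)
```

If the missing dogs appear at a lower threshold, the model saw them but wasn't confident; if not, a more accurate architecture like the R-FCN above is the way to go.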