ML Kit allows you to detect and track objects in image and video frames, but is it possible for the user to choose an arbitrary object in a frame and have ML Kit detect and track that object in the image or in subsequent video frames?
The current API only tracks objects that it has detected itself.
I am a beginner in machine learning, but I've been using YOLOv5 and YOLOv7 to train models and detect objects in images. Training was done by annotating the objects with annotation tools such as Roboflow. I now want to enter the domain of lidar and point cloud data. I was wondering whether there is a way to annotate objects in lidar data, then train a model and save it for future detection of objects in point clouds. Can this be done with these algorithms, or in some other way?
I have been working on a truck detection model using YOLOv4 darknet. The model performance is good. I have applied deep_sort object tracking to track the activity of the trucks in that region.
The problem with this approach is that the truck's identity changes when it turns and shows a different side to the camera, or when it is obstructed by another object.
Is there a way to make sure that the truck ID does not change?
Link to a demo inference video
I have trained the model specifically for this video. Object detection works fine, but the tracking ID changes.
I am using the Microsoft Custom Vision service for object detection to extract the objects I want, and I would like to run a regression test to compare the results. However, I cannot find a way to export the training pictures with the bounding boxes that the user defined through the GUI.
The model training is done within the Custom Vision platform provided by Microsoft (https://www.customvision.ai/). Within this platform we can add the images and then tag the objects. I have tried to export the model, but I am not sure where to find the training pictures along with their tags and bounding boxes.
I expected that the user could export not only the trained model but also the training data (images with tags and bounding boxes), but I was not able to find them.
All the data you are looking for is available through the Custom Vision Training API. Currently the latest version is v3.0; its portal is here.
More specifically, the GetTaggedImages method gives you, for each image, the associated regions with their bounding boxes.
Sample result of this method with one of my demos:
With these details, you will be able to get the image and place the boundingBox that was used for training.
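For example, the regions returned for each image use normalized coordinates, so they have to be scaled by the image dimensions before drawing. The snippet below is a minimal sketch of that conversion; the `sample_image` dictionary is a trimmed-down, illustrative stand-in for what the API returns, not a real response.

```python
# Convert the normalized region coordinates (left/top/width/height in the
# 0-1 range) into pixel coordinates using the image's dimensions.

def region_to_pixels(region, image_width, image_height):
    """Map a normalized Custom Vision region to pixel coordinates."""
    x = round(region["left"] * image_width)
    y = round(region["top"] * image_height)
    w = round(region["width"] * image_width)
    h = round(region["height"] * image_height)
    return x, y, w, h

sample_image = {  # illustrative shape, not a verbatim API response
    "width": 800,
    "height": 600,
    "regions": [
        {"tagName": "car", "left": 0.25, "top": 0.10,
         "width": 0.50, "height": 0.40},
    ],
}

for region in sample_image["regions"]:
    box = region_to_pixels(region, sample_image["width"], sample_image["height"])
    print(region["tagName"], box)  # car (200, 60, 400, 240)
```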
Please see the following link on exporting your model. Custom Vision Service exports compact domains; the models generated by compact domains are optimized for the constraints of real-time classification on mobile devices. If you want to export your training data from Custom Vision, please see the following link.
I am using the TensorFlow Object Detection API on a Windows system and it has been working fine. I am currently using a pre-trained model, ssd_mobilenet_v1_coco_11_06_2017. It easily detects all the objects in a given input video. I want to record the time each object spends in view; for example, if a car enters the video at 00:05 and leaves at 00:15, it was in frame for 10 seconds.
To achieve this, I was looking for something like an ID for each detected object returned by the API, so that I can start a timer in the code to measure how long each object stays in frame. Is there any built-in functionality for this in the API?
TensorFlow Object Detection does not provide such functionality, but you can use the KCF algorithm (readily available in OpenCV) to track the object.
https://www.docs.opencv.org/3.4.1/d2/dff/classcv_1_1TrackerKCF.html
Or you can implement SORT on top of the Object Detection API; it uses a Kalman filter and is easy to integrate.
https://github.com/abewley/sort/blob/master/sort.py
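To make the idea concrete, here is a minimal sketch of the IoU-based ID assignment that SORT builds on. The full SORT additionally uses a Kalman filter to predict each track's motion between frames; this simplified version just matches each new detection to the existing track with the highest overlap, and records the frame at which each ID first appeared, which is enough to implement the "how long was the car in view" timer from the question. The class and method names are my own, not part of either API.

```python
# Minimal IoU-association tracker (the matching step SORT builds on).
# Boxes are (x1, y1, x2, y2) tuples.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

class SimpleTracker:
    def __init__(self, iou_threshold=0.3):
        self.tracks = {}       # id -> last seen box
        self.first_seen = {}   # id -> frame index of first appearance
        self.next_id = 0
        self.iou_threshold = iou_threshold

    def update(self, detections, frame_idx):
        """Assign an ID to each detection; returns [(id, box), ...]."""
        assigned = []
        unmatched = dict(self.tracks)
        for box in detections:
            best_id, best_iou = None, self.iou_threshold
            for tid, prev in unmatched.items():
                overlap = iou(box, prev)
                if overlap > best_iou:
                    best_id, best_iou = tid, overlap
            if best_id is None:            # no overlap: a new object entered
                best_id = self.next_id
                self.next_id += 1
                self.first_seen[best_id] = frame_idx
            else:
                del unmatched[best_id]     # each track matches at most once
            self.tracks[best_id] = box
            assigned.append((best_id, box))
        return assigned
```

Given the video's frame rate, `first_seen[tid]` and the last frame in which an ID appears give the entry and exit times of that object.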
The Tensorflow Object Detection API does not currently track objects between frames.
Can I use Google's Vision API not only to detect faces in a specific picture, but also to identify which person is in the picture?
Can this be done automatically for celebrities (or people who can easily be found via a Google search)? And for unfamiliar people, via some learning/look-alike mechanism?
Thanks.
No. From the Google Vision API description:
Face Detection
Detect multiple faces within an image, along with the associated key facial attributes like emotional state or wearing headwear. Facial Recognition is not supported.
But, you can implement facial recognition yourself using OpenCV. I don't know your preferred language, but here is a tutorial on how to implement facial recognition in Python. OpenCV also has interfaces for C++ and Java.