Custom keypoint tracking using TensorFlow

What is the best way to train TensorFlow for custom keypoint tracking that can work on the web?
Right now I'm training CenterNet MobileNetV2 FPN Keypoints 512x512, but the results are not good enough: keypoint confidence is low (approximately 30%), although the bounding boxes are fine. Is there any way to improve the model's confidence for keypoints?
The config I'm using:
steps 25000
epoch 12
learning rate 0.01
train dataset 1280
test dataset 319
I'm trying to train an ML model with TensorFlow that can track custom keypoints, but my trained model is not working as I expect.
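The listed step count and epoch count may not describe the same amount of training, which is worth checking before tuning anything else. The sketch below relates steps to passes over the 1280 training images. The batch size is not given in the question, so the values tried here are assumptions, and `epochs_from_steps` is a hypothetical helper, not part of any TensorFlow API.

```python
import math


def epochs_from_steps(num_steps, num_train_examples, batch_size):
    """Return how many passes over the training set num_steps corresponds to."""
    steps_per_epoch = math.ceil(num_train_examples / batch_size)
    return num_steps / steps_per_epoch


# Values taken from the question.
NUM_STEPS = 25000
NUM_TRAIN = 1280

# The batch size is NOT stated in the question; these are assumed values.
for batch_size in (4, 8, 16):
    epochs = epochs_from_steps(NUM_STEPS, NUM_TRAIN, batch_size)
    print(f"batch_size={batch_size}: about {epochs:.1f} epochs")
```

If the result is far from the 12 epochs listed, one of the two numbers is probably not what the training run actually used.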

Related

Yolov4 Darknet | Continue training with a ready-made weight

I have a weights file that is already trained. I want to add more images to the dataset and improve recognition. Is it possible to continue training from the existing weights, and if so, how? Or will I have to train again from scratch?

Small batch size using Object Detection API

I've developed a custom model using the TF Object Detection API for human keypoint estimation.
The architecture is MobilenetV3 + FPN + Centernet. In the model zoo I saw an example that uses MobilenetV2 as the feature extractor instead, and the pipeline.config there seems to use a batch size of 512. I'm training on an Nvidia A100 80GB GPU, and it can only fit a batch size of 32. I've only tried power-of-2 batch sizes because that makes it easy to adapt the number of training steps.
This would suggest that I might need 16 such GPUs to train the model with the suggested batch size of 512. Are the resources needed to train such a model expected to be this high?
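The arithmetic behind the 16-GPU figure can be sketched as below. It also shows the common linear learning-rate scaling heuristic for training at a smaller batch size. That heuristic is a rule of thumb, not something the model zoo config prescribes, and `reference_lr` is a placeholder because the question does not quote the config's learning rate. Both function names are hypothetical.

```python
def devices_needed(target_batch, per_device_batch):
    """Number of devices needed to reach target_batch in one step (ceiling division)."""
    return -(-target_batch // per_device_batch)


def linear_scaled_lr(reference_lr, reference_batch, actual_batch):
    """Scale the learning rate in proportion to the batch size (a heuristic)."""
    return reference_lr * actual_batch / reference_batch


# 512 and 32 come from the question.
print(devices_needed(512, 32))

# reference_lr is an assumed placeholder, not a value from the question.
reference_lr = 0.1
print(linear_scaled_lr(reference_lr, reference_batch=512, actual_batch=32))
```

The same ceiling division gives the number of gradient accumulation steps if a single GPU is used to emulate the larger batch.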

Is it possible to bias the training of an object detection model towards classification in tensorflow ModelMaker?

I'm using Tensorflow 2 Model Maker to perform transfer training of EfficientDet-Lite (ultimately to run on a Coral EdgeTPU) and I care much more about the classification output and much less about the precision of the bounding boxes. Is there a way to modify some training parameters to improve the accuracy of the classes at the expense of the accuracy of the bounding boxes? Or does this not make sense?
Unfortunately, TensorFlow 2 Model Maker doesn't support this kind of customization at the moment.
If you want to do so, you can bypass Model Maker and use the AutoML repo directly. The technical detail is to adjust the weights of the different losses by passing loss_weights to the compile() function.
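In Keras, the loss_weights argument of compile() makes the minimized loss a weighted sum of the individual losses. The plain-Python sketch below only illustrates that weighted sum. The loss names and values are hypothetical and may not match the output names used in the AutoML repo.

```python
def weighted_total_loss(losses, loss_weights):
    """Combine per-output losses into one scalar as a weighted sum."""
    return sum(loss_weights[name] * value for name, value in losses.items())


# Hypothetical per-output loss values for one training step.
losses = {"class_loss": 0.8, "box_loss": 0.4}

# Equal weighting versus a weighting biased towards classification.
equal = {"class_loss": 1.0, "box_loss": 1.0}
class_biased = {"class_loss": 1.0, "box_loss": 0.1}

print(weighted_total_loss(losses, equal))
print(weighted_total_loss(losses, class_biased))
```

With the biased weights, box errors contribute little to the total, so the optimizer spends most of its effort on the classification term.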

Tensorflow object detection: why is the location in image affecting detection accuracy when using ssd mobilenet v1?

I'm training a model to detect meteors within a picture of the night sky. I have a fairly small dataset of about 85 images, each annotated with a bounding box. I'm using transfer learning, starting from the ssd_mobilenet_v1_coco_11_06_2017 checkpoint with Tensorflow 1.4. I'm resizing images to 600x600 pixels during training. I'm using data augmentation in the pipeline configuration to randomly flip the images horizontally and vertically and to rotate them by 90 degrees. After 5000 steps, the model converges to a loss of about 0.3 and will detect meteors, but it seems to matter where in the image the meteor is located. Do I have to train the model by giving examples of every possible location? I've attached a sample of a detection run where I tiled a meteor over the entire image and received various levels of detection (filtered to 50%). How can I improve this?
It could very well be your data and I think you are making a prudent move by improving the heterogeneity of your dataset, BUT it could also be your choice of model.
It is worth noting that ssd_mobilenet_v1_coco has the lowest COCO mAP relative to the other models in the TensorFlow Object Detection API model zoo. You aren't trying to detect a COCO object, but the mAP numbers are a reasonable approximation of generic model accuracy.
At the highest possible level, the choice of model is largely a tradeoff between speed and accuracy. The model you chose, ssd_mobilenet_v1_coco, favors speed over accuracy. Consequently, I would recommend you try one of the Faster RCNN models (e.g., faster_rcnn_inception_v2_coco) before you spend a significant amount of time preprocessing images.
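The tiling test described in the question can be set up with a small helper that lists where to paste the object, so the same check can be repeated on each candidate model. This is a minimal sketch: the 600x600 image size comes from the question, while the 100x100 patch size and the name `tile_offsets` are assumptions.

```python
def tile_offsets(image_w, image_h, patch_w, patch_h):
    """Return top-left (x, y) corners of non-overlapping tiles that fit in the image."""
    return [
        (x, y)
        for y in range(0, image_h - patch_h + 1, patch_h)
        for x in range(0, image_w - patch_w + 1, patch_w)
    ]


# 600x600 is the training size from the question; the patch size is assumed.
offsets = tile_offsets(600, 600, 100, 100)
print(len(offsets), offsets[0], offsets[-1])
```

Recording the detection score at each offset gives a rough map of where in the frame a model is weakest.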

Training Resnet deep neural network from scratch

I need to gain some knowledge about deep neural networks.
For a very deep neural network such as ResNet, we can use transfer learning to train a model.
But ResNet has been trained on the ImageNet dataset, so its pre-trained weights can be used to train a model on another dataset (for example, training a model for lung cancer detection with CT lung images).
I feel that this approach will not be accurate, as the pre-trained weights were trained entirely on other objects and not on medical data.
Instead of transfer learning, is it possible to train the ResNet from scratch? (The number of images available to train the ResNet is around 1500.) Is it possible to do this with a normal computer?
Can someone please share your ideas with me?
is it possible to train the resnet from scratch?
Yes, it is possible, but the amount of time one needs to get to good accuracy greatly depends on the data. For instance, training the original ResNet-50 on an NVIDIA M40 GPU took 14 days (10^18 single precision ops). The most expensive operation in a CNN is the convolution in the early layers.
ImageNet contains about 14M images, typically cropped to 224x224x3 for training. Since your dataset is ~10000x smaller, each epoch will take ~10000x fewer ops. On top of that, if you pass gray-scale instead of RGB images, the first convolution will take 3x fewer ops. Likewise, spatial image size affects the training time as well. Training on smaller images also lets you increase the batch size, which usually speeds things up due to vectorization.
All in all, I estimate that a machine with a single consumer GPU, such as 1080 or 1080ti, can train ~100 epochs of ResNet-50 model in a day. Obviously, training on a 2-GPU machine would be even faster. If that is what you mean by a normal computer, the answer is yes.
But since your dataset is very small, there's a big chance of overfitting. This looks like the biggest issue that your approach faces.
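The scaling argument in this answer can be checked with a few lines of arithmetic. This is a rough sketch that assumes per-epoch cost grows linearly with the number of images and that the cost of the first convolution grows linearly with the number of input channels. The function names are hypothetical.

```python
IMAGENET_IMAGES = 14_000_000  # figure quoted in the answer
CUSTOM_IMAGES = 1_500         # figure from the question


def relative_epoch_cost(custom_images, reference_images):
    """Fraction of the reference per-epoch cost, assuming cost scales with image count."""
    return custom_images / reference_images


def first_conv_cost_ratio(in_channels, reference_channels=3):
    """Relative cost of the first convolution when the input channel count changes."""
    return in_channels / reference_channels


ratio = relative_epoch_cost(CUSTOM_IMAGES, IMAGENET_IMAGES)
print(f"one epoch costs about 1/{1 / ratio:.0f} of an ImageNet epoch")
print(f"gray-scale first conv costs {first_conv_cost_ratio(1):.2f}x the RGB one")
```

The dataset ratio is close to the ~10000x quoted above, while the gray-scale saving applies only to the first layer, so its effect on total training time is smaller than 3x.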