fine_tune faster_resnet101_coco by GTX 1080

Is it possible to fine-tune faster_rcnn_resnet101_coco on a GTX 1080 with the Object Detection API? Or faster_rcnn_nasnet?

I'm not sure how much VRAM a 1080 has, but you can train a Faster R-CNN ResNet-101 model on a 1080 Ti with 11 GB of VRAM. Eyeballing the GPU usage there, it should roughly fit in 8 GB with batch size 1, so I would say yes, you can fine-tune a Faster R-CNN ResNet-101 object detector using the Object Detection API.
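If memory is tight, the batch size in the pipeline config is the first thing to lower. Below is a minimal sketch, assuming the config uses the Object Detection API's text format with a `batch_size` field inside `train_config`. The helper name `set_train_batch_size` and the sample config text are hypothetical, and the regex is a plain text edit, not the API's own config tooling.

```python
import re


def set_train_batch_size(config_text: str, batch_size: int) -> str:
    """Set the first `batch_size: N` that follows `train_config` to `batch_size`."""
    start = config_text.find("train_config")
    if start == -1:
        raise ValueError("no train_config block found")
    head, tail = config_text[:start], config_text[start:]
    # Only touch the first batch_size after train_config; leave everything else alone.
    new_tail, n = re.subn(r"batch_size:\s*\d+", f"batch_size: {batch_size}", tail, count=1)
    if n == 0:
        raise ValueError("no batch_size field found after train_config")
    return head + new_tail


# Hypothetical config fragment, for illustration only.
SAMPLE = """\
train_config {
  batch_size: 8
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
}
"""

patched = set_train_batch_size(SAMPLE, 1)
print(patched)
```

Whether batch size 1 actually fits still has to be checked on the card itself, for example by watching memory usage during the first training steps.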

Related

Our YOLOv4-tiny suddenly loses accuracy

I'm training YOLOv4-tiny on a custom dataset, and suddenly the loss and other metrics drop to -nan.
As you can see on the chart, all progress is lost after around 800 iterations.
[Figure: YOLOv4 accuracy chart]
Training log for the given chart:
[Figure: Darknet training log]
Any ideas about this problem? It is running on Ubuntu with 4 x GeForce GTX 1080 6GB.
When I test the same network on a PC with a single GeForce GTX 1060 6GB, it does not crash.

Google Vertex AI GPU only 50% utilized

I am running a custom training job on Google Vertex AI, using NVIDIA Tesla V100 with 2 accelerators. I am training an ML model, but my GPU utilization is only 50% during training.
I am using the NVIDIA Transfer Learning Toolkit (TLT) to train an object detection model, and I specified GPUs=2 in the TLT commands.
Any ideas how I can get higher GPU utilization?

What is the fastest Mask R-CNN implementation available

I'm running a Mask R-CNN model on an edge device (with an NVIDIA GTX 1080). I am currently using the Detectron2 Mask R-CNN implementation and I achieve an inference speed of around 5 FPS.
To speed this up I looked at other inference engines and model implementations, for example ONNX, but I was not able to get faster inference.
TensorRT looks very promising to me, but I did not find a ready "out-of-the-box" implementation for it.
Are there any other mature and fast inference engines, or other techniques to speed up inference?

It's almost impossible to get a higher inference speed for Mask R-CNN on a GTX 1080. You may check Detectron2 by Facebook AI Research.
Otherwise, I'd suggest using YOLACT (You Only Look At CoefficienTs), which can achieve real-time instance segmentation.
On the other hand, if you don't need instance segmentation, you can use YOLO, SSD, etc. for object detection.

OpenCV 4.5.0 with DNN_BACKEND_CUDA and DNN_TARGET_CUDA/DNN_TARGET_CUDA_FP16.
Mask R-CNN with a 1024 x 1024 input image:
Device | FPS
------------------ | -------
GTX 1080 Ti (FP32) | 29
RTX 2080 Ti (FP16) | 60
FPS measured includes NMS but excludes other preprocessing and postprocessing. The network fully runs end-to-end on GPU.
Benchmark code: https://gist.github.com/YashasSamaga/48bdb167303e10f4d07b754888ddbdcf
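To compare engines fairly, it helps to time only the forward pass, after a few warm-up runs. The following is a minimal sketch of such a timing loop, not the code from the linked benchmark. The function `measure_fps` and the stand-in `dummy_infer` are hypothetical names, and the sleep replaces a real network call so the example runs without a GPU.

```python
import time
from typing import Callable


def measure_fps(infer: Callable[[], object], warmup: int = 5, runs: int = 50) -> float:
    """Average calls per second of `infer` over `runs` timed calls, after `warmup` untimed ones."""
    for _ in range(warmup):
        infer()  # untimed: lets lazy initialization and caches settle
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    elapsed = time.perf_counter() - start
    return runs / elapsed


def dummy_infer() -> None:
    # Stand-in for a real forward pass (assumption: about 2 ms per frame).
    time.sleep(0.002)


fps = measure_fps(dummy_infer, warmup=2, runs=20)
print(f"{fps:.1f} FPS")
```

With OpenCV, `infer` would wrap `net.setInput(blob)` followed by `net.forward()` on a network configured with `net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)` and `net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA_FP16)`, keeping image loading and drawing outside the timed loop.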

As @kkHarshit already mentioned, it is very hard to speed up Mask R-CNN any further.
The fastest instance segmentation model that I found is YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS).
Its accuracy is worse than that of Mask R-CNN or even YOLACT, but still very good.

How long does it take to train over the fashion-MNIST database?

I'm new to deep learning. I wanted to build an image classifier using a CNN to classify clothing images. I decided to train on the Fashion-MNIST dataset, which is a dataset of 60,000 images. But I'm aware that training is a very heavy task.
I wanted to know how long my PC will take to train on this dataset, and whether I should use pre-trained models instead, at some cost in accuracy.
My PC configuration is:
- Intel Core i5-6400 CPU @ 2.70 GHz
- 8GB RAM.
- NVIDIA GeForce GTX 1050 Ti.

It depends on the dataset size and the number of epochs (I tried 50 epochs), but here the images are small (28x28).
When I tried it on a machine with
Intel Core i7-6400 CPU @ 2.70 GHz
8GB RAM.
NVIDIA GeForce GTX 1050 Ti.
and the 28x28 image size provided in the MNIST dataset on tensorflow.org, it took less than 5 minutes.
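A rough estimate for your own machine comes from counting optimizer steps and multiplying by a measured time per step. The sketch below assumes a batch size of 32 and 3 ms per step. Both values are illustrative assumptions, not measurements from the answer above, and `estimate_training_seconds` is a hypothetical helper name.

```python
import math


def estimate_training_seconds(n_samples: int, batch_size: int, epochs: int, sec_per_step: float) -> float:
    """Total training time = steps per epoch * epochs * seconds per step."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs * sec_per_step


# 60,000 training images and 50 epochs come from the question and answer;
# batch size 32 and 0.003 s per step are assumptions for illustration.
steps_per_epoch = math.ceil(60_000 / 32)
total = estimate_training_seconds(60_000, 32, 50, 0.003)
print(steps_per_epoch, "steps per epoch,", round(total), "seconds in total")
```

Under these assumptions there are 1875 steps per epoch, and 50 epochs come to a little under 5 minutes. Timing a few hundred steps on the actual hardware gives a real value for `sec_per_step`.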

Why does my TensorFlow profiling timeline show idle times on GPU?

I was profiling the inference latency of a MobileNetV2 model (with a batch size of 20) on my GeForce GTX 1080 GPU.
The TensorFlow timeline shows as follows:
I notice that there is quite a lot of empty space in the "stream: all Compute" line, which I think means my GPU was not always busy. What could be causing this idle time, and are there any ways to reduce it?
