YOLOv3 not starting training - object-detection

I am trying to train a custom dataset that consists of currency images. I followed a YouTube tutorial and made the same folder structure.
I am using Google Colab for the free GPU, with Darknet. Every time I start training, it finishes within seconds without any error, and the final output says "608 x 608 create 6 permanent cpu threads".
The tutorial I followed shows the training progressing, but mine keeps getting stuck at this message.
I'm using YOLOv3 to train my dataset and followed every step for changing the Makefile. The train.txt and test.txt files also stay empty. (Sorry for my bad English.)
Below is a screenshot of the message I get when I try to train my model.

SOLVED: The issue was that my train.txt file was empty because it wasn't receiving any image paths. I changed the absolute path of my images folder to a relative path, which saved all the image paths into train.txt and got training to actually start.
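For anyone hitting the same thing, here is a minimal sketch of generating train.txt with relative paths; the data/obj folder name is an assumption based on the usual Darknet layout, so adjust it to your structure.

import glob
import os

# Darknet resolves the paths listed in train.txt relative to the directory it
# runs in, so relative entries like "data/obj/img1.jpg" are the safe choice.
image_dir = "data/obj"  # assumed image folder; adjust to your layout

with open("data/train.txt", "w") as f:
    for path in sorted(glob.glob(os.path.join(image_dir, "*.jpg"))):
        f.write(path + "\n")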

Related

CNTK - Faster R-CNN train with my own labeled data set cannot train more than 20 images

I'm working with CNTK Faster R-CNN object detection and I have run into a problem.
To explain it, I will describe my workflow from the beginning.
First I followed https://learn.microsoft.com/en-us/cognitive-toolkit/object-detection-using-faster-r-cnn
to install all the needed packages, and that step succeeded. Then I tried the grocery dataset, which contains 20 training images (using AlexNet as the base model).
The results were fine; everything worked at that point.
Then I used VoTT to label my own dataset and put it into the CNTK dataset folder. I also used annotations_helper.py to generate the other input files in preparation for model training.
After creating My_DataSet_config.py and changing some configuration, I realized that I cannot train my dataset with more than 20 images. Say I train on 30 images: the program errors out with something like "gt_boxes is empty" (it really is empty, though with certain specific numbers of training images it is no longer empty).
So I followed some instructions I found on GitHub suggesting the problem lies in the image and annotation files: delete the offending image and run again.
I did exactly that, but it was no solution in my case. Whenever the training set is not exactly 20 images, the error comes back with some other image. Please take a look. Thank you.
Python 3.5
Windows
CNTK 2.7
Here is my dataset configuration file (screenshot).
Here is my model configuration file (screenshot).
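A hypothetical sanity check for the annotations, assuming the per-image .bboxes.tsv / .bboxes.labels.tsv layout that annotations_helper.py generates (the folder path is a placeholder): an image whose box file is missing or empty would explain an empty gt_boxes.

import os

data_dir = "DataSets/MyDataSet/positive"  # placeholder; point at your image folder

for fname in sorted(os.listdir(data_dir)):
    if not fname.lower().endswith(".jpg"):
        continue
    stem = os.path.splitext(fname)[0]
    # annotations_helper.py writes one box file and one label file per image
    for suffix in (".bboxes.tsv", ".bboxes.labels.tsv"):
        path = os.path.join(data_dir, stem + suffix)
        if not os.path.exists(path):
            print("MISSING:", path)
        elif os.path.getsize(path) == 0:
            print("EMPTY:  ", path)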

TF Model Maker 'no such file or directory'

I'm following a guide on using TFLite Model Maker with my own training data loaded from my Google Drive, and I keep getting the error "NotFoundError: /content/TFData/Train/img/img (73).jpg; No such file or directory" when trying to train with my data (please see the screenshot below). I think I'm missing something obvious but can't seem to figure it out; apologies if this has been asked before, I'm somewhat new to working in this environment.
I have tried renaming all the images and folders and reshuffling the directory layout, to no avail.
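One thing that stands out in the error is the file name "img (73).jpg": spaces and parentheses in paths are a frequent cause of file-not-found errors in these pipelines. A hypothetical cleanup sketch, assuming that is the culprit (any annotation or CSV files referencing the old names would need to be regenerated afterwards):

import os
import re

img_dir = "/content/TFData/Train/img"  # folder from the error message

for fname in os.listdir(img_dir):
    # "img (73).jpg" -> "img_73.jpg": replace spaces/parentheses, tidy underscores
    clean = re.sub(r"[ ()]+", "_", fname)
    clean = re.sub(r"_+(\.[^.]+)$", r"\1", clean).strip("_")
    if clean != fname:
        os.rename(os.path.join(img_dir, fname), os.path.join(img_dir, clean))
        print(fname, "->", clean)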

YOLOv5 object detection training

Please, I need your help concerning my YOLOv5 training process for object detection!
I am trying to train my YOLOv5 object detection model to detect a small object (scratches). To label my images I used Roboflow, where I applied some data augmentation and some of the pre-processing that Roboflow offers as a service. When I finished the pre-processing and data augmentation, Roboflow offered a choice of output formats, in my case YOLOv5 PyTorch, and it did everything for me, splitting the data into training, validation, and test sets. So everything was set up as it should be for my data preparation, and at the end I got the folder with data.yaml and the images with their labels. In data.yaml I put the paths of my training and validation sets as shown in the GitHub tutorial for YOLOv5. I followed the steps very carefully, though.
The problem is that when the training starts I get nan in the obj and box columns, as you can see in the picture below. I don't know why; can someone relate to that or give me any clue to find the solution, please? It's my first project in computer vision.
This is what I get when the training process starts.
This is the last error message when the training finishes.
I think the problem maybe comes from here, but I don't know how to fix it; I used the YOLOv5 team's code as it is in the tutorial.
The training continues without any problem, but the mAP and precision remain 0 for the whole run!
PS: Here is the link to the tutorial I followed: https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
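For reference, the data.yaml that Roboflow exports for YOLOv5 PyTorch usually looks roughly like the sketch below (the relative paths and the single 'scratch' class are illustrative assumptions). Which directory the train/val paths are resolved against depends on the YOLOv5 version, so it is worth double-checking them when labels silently fail to load.

train: ../train/images
val: ../valid/images

nc: 1               # number of classes (assumed: one 'scratch' class)
names: ['scratch']  # class names in label-index order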
This is what I would do to troubleshoot it:
- Run your code on Colab, because that environment is proven to work well.
- Confirm that your labels look good and are set up correctly. Can you check that the classes look right? In one of the screenshots it looks like you have no labels.
Running my code in Colab worked successfully and the results were good. I think the problem was in my personal laptop environment, maybe the version of PyTorch I was using ('1.10.0+cu113'), or something else. If you have any advice on how to set up my environment properly for YOLOv5, I would be happy to hear it. Many thanks again to @alexheat.
I'm using YOLOv5 for my custom dataset too. This problem might be due to directory misplacement.
Using a different version of PyTorch should not be a problem, but you can try the version mentioned in requirements.txt.
It's better if you run:
cd yolov5
pip3 install -r requirements.txt
Let me know if this helps.

Object Detection Few-Shot training with TensorFlow Lite

I am trying to create a mobile app that uses object detection to detect a specific type of object. To do this I am starting with the TensorFlow object detection example Android app, which uses TF2 and ssd_mobilenet_v1.
I'd like to try Few-Shot training (Colab link), so I started by replacing the example app's SSD MobileNet v1 download with the Colab's output file model.tflite. However, this causes the app to crash with the following error:
java.lang.IllegalStateException: This model does not contain associated files, and is not a Zip file.
at org.tensorflow.lite.support.metadata.MetadataExtractor.assertZipFile(MetadataExtractor.java:313)
at org.tensorflow.lite.support.metadata.MetadataExtractor.getAssociatedFile(MetadataExtractor.java:164)
at org.tensorflow.lite.examples.detection.tflite.TFLiteObjectDetectionAPIModel.create(TFLiteObjectDetectionAPIModel.java:126)
at org.tensorflow.lite.examples.detection.DetectorActivity.onPreviewSizeChosen(DetectorActivity.java:99)
I realize the Colab uses ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz - does this mean there are changes needed in the app code - or is there something more fundamentally wrong with my approach?
Update: I also tried the Lite output of the Colab tf2_image_retraining and got the same error.
The fix apparently was https://github.com/tensorflow/examples/compare/master...cachvico:darren/fix-od - .tflite files can now be zip files including the labels, but the example app doesn't work with the old format.
This no longer throws an error when using the Few-Shot Colab output, although I'm not getting results yet; pointing the app at pictures of rubber ducks does not work so far.
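In case it helps others hitting the same IllegalStateException, here is a sketch of packing a label file into the .tflite with the TFLite Support metadata populator, which appears to be the packaging the fixed example expects; the file names are placeholders.

from tflite_support import metadata

# Attach the label file so MetadataExtractor.getAssociatedFile can find it
# inside the model (the model then doubles as a zip archive).
populator = metadata.MetadataPopulator.with_model_file("model.tflite")  # placeholder name
populator.load_associated_files(["labelmap.txt"])  # placeholder label file
populator.populate()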

Huge size of TFRecords file to store on Google Cloud

I am trying to modify a TensorFlow project so that it becomes TPU-compatible.
For this, I started with the code explained on this site.
There, the COCO dataset is downloaded and its image features are first extracted using an InceptionV3 model.
I wanted to modify this code so that it supports TPU.
For this, I added the mandatory code for TPU as per this link.
Within the TPU strategy scope, I created the InceptionV3 model using the Keras library and loaded it with ImageNet weights, as in the existing code.
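For context, the mandatory TPU code referred to is presumably the standard TF2 TPU initialization, roughly like this sketch:

import tensorflow as tf

# Resolve the Colab TPU, connect to it, and initialize the TPU system.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# Build the feature extractor inside the strategy scope, as described above.
with strategy.scope():
    image_model = tf.keras.applications.InceptionV3(
        include_top=False, weights="imagenet")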
Now, since the TPU needs the data to be stored on Google Cloud Storage, I created a TFRecords file using tf.train.Example with the help of this link.
I then tried to create this file in several ways so that it would hold the data the TPU can read through TFRecordDataset.
At first I directly added the image data and the image path to the file and uploaded it to a GCP bucket, but while reading the data back I realized that the image data was not useful: it carries no shape/size information (which will be needed), and I had not resized the images to the required dimensions before storage. This file came to 2.5 GB, which was okay.
Then I thought I would keep only the image paths in the cloud, so I created another TFRecords file with only the image paths. But that seemed unoptimized, since the TPU would have to open each image individually, resize it to 299x299, and then feed it to the model; it would be better to have the image data available through the .map() function inside TFRecordDataset. So I tried again, this time following this link, storing the R, G, and B channels along with the image path inside the TFRecords file.
However, now the TFRecords file is abnormally large, some 40-45 GB, and I ultimately stopped the execution as memory was filling up on the Google Colab TPU.
The original COCO dataset is not that large, about 13 GB, and the dataset here is built from only the first 30,000 records, so 40 GB looks like a weird number.
May I know what the problem is with this way of feature storage? Is there a better way to store image data in a TFRecords file and then extract it through TFRecordDataset?
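As an aside, the size blow-up is expected when storing decoded R/G/B values: a decoded 299x299x3 image is 299*299*3 ≈ 268 KB of raw pixels no matter how small the source JPEG was, and serializing each value through an Int64List inflates it further, so 30,000 images can plausibly reach 40 GB. A common pattern, sketched below with assumed feature names and a placeholder bucket path, is to store the still-compressed JPEG bytes and decode/resize inside .map():

import tensorflow as tf

def make_example(image_path):
    # Store the compressed JPEG bytes as-is instead of decoded pixel arrays.
    with open(image_path, "rb") as img:
        jpeg_bytes = img.read()
    feature = {
        "image_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[jpeg_bytes])),
        "path": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_path.encode()])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

def parse_example(serialized):
    # Decode and resize on the fly inside the input pipeline.
    spec = {
        "image_raw": tf.io.FixedLenFeature([], tf.string),
        "path": tf.io.FixedLenFeature([], tf.string),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    image = tf.io.decode_jpeg(parsed["image_raw"], channels=3)
    image = tf.image.resize(image, (299, 299))
    return tf.keras.applications.inception_v3.preprocess_input(image)

dataset = tf.data.TFRecordDataset("gs://your-bucket/coco_train.tfrecord")  # placeholder path
dataset = dataset.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)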
I think the COCO dataset processed as TFRecords should be around 24-25 GB on GCS. Note that TFRecords aren't meant to act as a form of compression: they represent data as protobufs so it can be optimally loaded into TensorFlow programs.
You might have more success if you refer to https://cloud.google.com/tpu/docs/coco-setup (the corresponding script can be found here) for converting COCO (or a subset of it) into TFRecords.
Furthermore, we have implemented detection models for COCO using TF2/Keras optimized for GPU/TPU here, which you might find useful for optimal input pipelines. An example tutorial can be found here. Thanks!